# Gaussian Mixture Model: ValueError: pvals < 0, pvals > 1 or pvals contains NaNs

I’m struggling to sample from a Gaussian Mixture Model. I have a very simple example with only one component (so, not actually a mixture), which I fit to standard normal data. However, the weight of that single component ends up slightly greater than 1, causing an error:

```
import numpy as np
from sklearn.mixture import GaussianMixture

dataset = np.random.standard_normal(10).reshape(-1, 1)
mixture = GaussianMixture(n_components=1)
mixture.fit(dataset)
mixture.sample(10)
```
```
ValueError: pvals < 0, pvals > 1 or pvals contains NaNs
```

It’s evident to me that this is caused by the weight of the single component being greater than 1:

```
> print(mixture.weights_)
1.0000000000000002
```

This kind of seems like a bug. But maybe I’m doing something wrong here?

## Answer

Technically this does seem to be a bug. However, as already explained in the other answer, the real issue is that asking for a Gaussian mixture with `n_components=1` does not make sense from a modelling perspective. One could argue that an exception (or at least a warning) should be raised earlier, i.e. whenever a `GaussianMixture(n_components=1)` is requested. It may be a design choice not to do so, but in any case this is something to raise as a possible issue in the scikit-learn GitHub repo, not here.

That said, a workaround here is pretty straightforward: in the special case of `n_components=1`, force `mixture.weights_` to be equal to 1.0:

```
import numpy as np
from sklearn.mixture import GaussianMixture

dataset = np.random.standard_normal(10).reshape(-1, 1)
mixture = GaussianMixture(n_components=1)
mixture.fit(dataset)

mixture.weights_
# 1.0000000000000002

mixture.sample(10)
# ValueError: pvals < 0, pvals > 1 or pvals contains NaNs

# force weight to 1.0:
mixture.weights_ = 1.

mixture.sample(10)
# result:
# (array([[ 0.51371178],
#         [ 0.1530927 ],
#         [-0.56327362],
#         [-1.22308348],
#         [ 1.26889771],
#         [ 1.11849849],
#         [-1.47091749],
#         [-0.41259178],
#         [ 1.93872769],
#         [ 0.26282224]]), array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
```
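If you would rather not hard-code 1.0, the same idea can be written as a small helper that clips and rescales the weights, which should also cover mixtures with more than one component. This is only a sketch: `normalized_weights` is a hypothetical name of my own, not part of scikit-learn, and the weight value below is simply copied from the output shown in the question.

```python
import numpy as np


def normalized_weights(weights):
    """Return the weights as a 1-D array clipped to [0, 1] and rescaled to sum to 1."""
    w = np.clip(np.asarray(weights, dtype=float).ravel(), 0.0, 1.0)
    return w / w.sum()


# Hypothetical fitted weight, copied from the output in the question.
fitted = np.array([1.0000000000000002])
fixed = normalized_weights(fitted)

# Draw the number of samples per component, as a mixture sampler would.
rng = np.random.RandomState(0)
counts = rng.multinomial(10, fixed)
print(fixed, counts)
```

With a fitted model, this would presumably be used as `mixture.weights_ = normalized_weights(mixture.weights_)` right before calling `mixture.sample(...)`, relying on `weights_` being an ordinary writable attribute, just as the workaround above does.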

There should not be any theoretical concerns here, since by definition the weight of a single component in a Gaussian mixture is 1.0. It is just that, as demonstrated in the other answer, when only a few samples are available the GMM algorithm fails to produce a weight of exactly 1.0 within machine precision.
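To see how small the offending excess is, here is a minimal sketch using only NumPy (no scikit-learn). It assumes a reasonably recent NumPy, in which `multinomial` validates that every probability lies in [0, 1]; the variable names are mine, chosen for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# Smallest double strictly greater than 1.0 (one unit in the last place above 1).
just_above_one = np.nextafter(1.0, 2.0)
print(just_above_one)

# A probability that exceeds 1 by a single rounding step is rejected.
try:
    rng.multinomial(10, [just_above_one])
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)

# Exactly 1.0 is accepted: all draws go to the single component.
accepted = rng.multinomial(10, [1.0])
print(accepted)
```

So the value in `weights_` is off from 1.0 by a single rounding step, which is why setting it back to 1.0 is harmless.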