--- date: "2020-01-17" author: Moritz Boos tags: - snippets - python - machine-learning --- # Probability density fitting of a Mixture of Gaussians via autograd Recently I've had to fit a [Mixture of Gaussians](https://en.wikipedia.org/wiki/Mixture_model#Gaussian_mixture_model) to a target *density* instead of individual samples drawn from this density. Googling revealed that at least [one other person](https://stats.stackexchange.com/questions/226504/fit-gaussian-mixture-model-directly-to-the-mixture-density) faced this particular problem too, but there was no code readily available. To be clear, the problem is the following: given a mixture of Gaussian probability density that is evaluated at $N$ points, we want to recover parameters of these Gaussians (i.e. mean $\mu_{i}$, standard deviation $\sigma_{i}$, and a set of mixture weights $\pi_{i}$ that are constrained to be [0, 1] and sum to 1). For three Gaussians our objective is to find parameters of a Gaussian mixture that make its density match the blue line below (evaluated at $N=25$ points). ```python from scipy import stats import autograd.numpy as np import matplotlib.pyplot as plt %matplotlib inline true_means = [5, 15, 20] true_stds = [2, 1, 2] true_pis = [0.3, 0.5, 0.2] gridmin, gridmax, N = 0, 25, 75 grid = np.arange(gridmin, gridmax, (gridmax-gridmin)/N) data = np.array([ np.sum([pi*stats.norm.pdf(t, loc=m, scale=std) for m, std, pi in zip(true_means, true_stds, true_pis)]) for t in grid]) plt.plot(data, label='Mixture probability density') plt.xlabel('Grid points') plt.ylabel('Density') ``` ![](/images/GMM_autograd_1_1.png) What follows is my simple solution using autograd with the following caveats: * I estimate the distance between target probability density and our density via an $L2$ distance * Placing a prior on individual parameters will help in converging to a sensible solution (i.e. prevent standard deviation from exploding) * Initial parameters matter, it can be necessary to run the procedure multiple times and choose the solution with smallest loss ## The solution ```python import autograd.numpy as np from autograd import grad from autograd import hessian_vector_product from autograd.misc.flatten import flatten_func from functools import partial # we have to defince our own Gaussian pdf because autograd does not like the one provided by scipy def pdf(x, m, std): return (1/(std*np.sqrt(2*np.pi)))*np.exp(-0.5*((x-m)/std)**2) def gmm(t, means, stds, pis): '''Returns the density of a mixture of Gaussians model estimated at grid point t. 
    means is a list of the means of the Gaussians,
    stds are the logarithms of the standard deviations of the Gaussians,
    pis are the mixing probabilities.'''
    density = np.sum([pi*pdf(t, mean, np.exp(std))
                      for mean, std, pi in zip(means, stds, pis)])
    return density

def loss_func(target_density, nested_params, grid=25):
    '''Loss function for the model.
    target_density is an array of values of the target density,
    nested_params is a list containing the means, the log standard deviations,
    and the logits of the mixing probabilities,
    grid can either be an int (in which case it is the argument supplied to range) or an iterable.'''
    means, stds, pis = nested_params
    if type(grid) == int:
        grid = range(grid)
    # go from logits to probabilities
    pis = [np.exp(pi) for pi in pis]
    pis = [pi/np.sum(pis) for pi in pis]
    our_density = np.array([gmm(t, means, stds, pis) for t in grid])
    return np.sum((target_density-our_density)**2)

# use partial to build a loss function that uses our data and grid structure
objective = partial(loss_func, data, grid=grid)

# get a random guess for the means
randmean = np.random.randint(0, 25, 3).astype('float').tolist()
initial_params = [randmean, [0., 0., 0.], [5., 5., 5.]]

# flatten everything so it plays nice with the optimizer
flattened_obj, unflatten, flattened_init_params =\
    flatten_func(objective, initial_params)

# get the gradient of the flattened objective
gradf = grad(flattened_obj)

# now minimize via BFGS
from scipy.optimize import minimize
res = minimize(flattened_obj, flattened_init_params, jac=gradf, method='BFGS')

means, stds, pis = unflatten(res.x)
pis = [np.exp(pi) for pi in pis]
pis = [pi/np.sum(pis) for pi in pis]

print('True parameters {}'.format(true_means+true_stds+true_pis))
print('Recovered parameters {}'.format(means+[np.exp(std) for std in stds]+pis))
```

    True parameters [5, 15, 20, 2, 1, 2, 0.3, 0.5, 0.2]
    Recovered parameters [array(5.00019487), array(20.0016879), array(15.00013811), 2.000017569259828, 1.99839559825986, 1.0001623730291136, 0.2999967947038737, 0.1999050817841846, 0.5000981235119417]

### Let's see if the results look right

```python
our_density = np.array([gmm(t, means, stds, pis) for t in grid])

plt.plot(our_density, label='Estimated probability density', linewidth=3)
plt.plot(data, 'r-.', label='Mixture probability density')
plt.xlabel('Grid points')
plt.ylabel('Density')
plt.legend()
```

![](/images/GMM_autograd_6_1.png)

You can download this notebook [here](http://mjboos.github.io/files/GMM_autograd.ipynb).
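
As an addendum to the second caveat above: the simplest way I can think of to place such a prior is to add a penalty term to the loss. The function below is only a minimal sketch, not part of the notebook; `loss_with_prior` and the penalty weight `lam` are names I introduce here, and it reuses `loss_func` and `autograd.numpy as np` from the solution code above. Penalising the squared log standard deviations corresponds to a Gaussian prior on them and keeps `np.exp(std)` from exploding.

```python
def loss_with_prior(target_density, nested_params, grid=25, lam=0.01):
    '''Sketch only: loss_func plus an L2 penalty on the log standard
    deviations, acting as a Gaussian prior that keeps exp(std) from
    exploding. lam controls the strength of the penalty.'''
    _, stds, _ = nested_params
    penalty = lam*sum(std**2 for std in stds)
    return loss_func(target_density, nested_params, grid=grid) + penalty

# it plugs in exactly like loss_func:
# objective = partial(loss_with_prior, data, grid=grid, lam=0.01)
```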
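
And for the third caveat, here is a minimal sketch of the restart strategy, assuming the objects defined above (`objective`, `flatten_func`, `grad`, and `minimize`) are still in scope; `n_restarts`, `best_res`, and `best_unflatten` are names I introduce for illustration. It simply reruns the optimisation from several random initialisations and keeps the solution with the smallest loss.

```python
n_restarts = 10
best_res, best_unflatten = None, None
for _ in range(n_restarts):
    # a fresh random guess for the means; otherwise the same initialisation as above
    randmean = np.random.randint(0, 25, 3).astype('float').tolist()
    init_params = [randmean, [0., 0., 0.], [5., 5., 5.]]
    flat_obj, unflatten, flat_init = flatten_func(objective, init_params)
    res = minimize(flat_obj, flat_init, jac=grad(flat_obj), method='BFGS')
    # keep the run with the smallest remaining loss
    if best_res is None or res.fun < best_res.fun:
        best_res, best_unflatten = res, unflatten
means, stds, pis = best_unflatten(best_res.x)
```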