Mathematical Derivation of the Mean-Field Approximation
In variational inference, the posterior distribution over a set of unobserved variables $\mathbf{Z}$ given some data $\mathbf{X}$ is approximated by a variational distribution, $Q(\mathbf{Z})$:

$$P(\mathbf{Z} \mid \mathbf{X}) \approx Q(\mathbf{Z}).$$
The distribution $Q(\mathbf{Z})$ is restricted to belong to a family of distributions of simpler form than $P(\mathbf{Z} \mid \mathbf{X})$, selected with the intention of making $Q(\mathbf{Z})$ similar to the true posterior, $P(\mathbf{Z} \mid \mathbf{X})$. The lack of similarity is measured in terms of a dissimilarity function $d(Q; P)$, and hence inference is performed by selecting the distribution $Q(\mathbf{Z})$ that minimizes $d(Q; P)$, as spelled out below.
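Written out as an optimization (with $\mathcal{Q}$ denoting the chosen family of simpler distributions; this symbol is introduced here only for illustration), the inference step amounts to

$$Q^{*}(\mathbf{Z}) = \arg\min_{Q \in \mathcal{Q}} \; d\bigl(Q(\mathbf{Z}); \, P(\mathbf{Z} \mid \mathbf{X})\bigr).$$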
The most common type of variational Bayes, known as mean-field variational Bayes, uses the Kullback–Leibler divergence (KL-divergence) of P from Q as the choice of dissimilarity function. This choice makes the minimization tractable. The KL-divergence is defined as

$$D_{\mathrm{KL}}(Q \parallel P) = \sum_{\mathbf{Z}} Q(\mathbf{Z}) \log \frac{Q(\mathbf{Z})}{P(\mathbf{Z} \mid \mathbf{X})}.$$
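As a minimal illustration of this definition, the following Python sketch evaluates the sum above for discrete distributions stored as NumPy arrays; the arrays q and p_post and their values are arbitrary examples, not taken from the derivation.

```python
import numpy as np

def kl_divergence(q, p):
    """D_KL(Q || P) = sum_z Q(z) * log(Q(z) / P(z)) for discrete distributions.

    q, p: 1-D arrays of probabilities over the same support, each summing to 1.
    Terms with Q(z) == 0 contribute 0, by the convention 0 * log 0 = 0.
    """
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = q > 0
    return np.sum(q[mask] * np.log(q[mask] / p[mask]))

# Example: a crude approximation to a three-point posterior (illustrative numbers).
p_post = np.array([0.7, 0.2, 0.1])   # stand-in for the true posterior P(Z | X)
q      = np.array([0.8, 0.2, 0.0])   # variational approximation Q(Z)
print(kl_divergence(q, p_post))      # >= 0, and 0 only if Q == P
```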
Note that Q and P are reversed from what one might expect. This use of reversed KL-divergence is conceptually similar to the expectation-maximization algorithm. (Using the KL-divergence in the other way produces the expectation propagation algorithm.)
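To see concretely why the ordering matters, the short sketch below (again using arbitrary example numbers over a shared discrete support) computes the divergence in both directions; the two values generally differ, which is what makes the "reversed" choice worth flagging.

```python
import numpy as np

def kl(a, b):
    """D_KL(A || B) over a shared discrete support (zero terms in A contribute 0)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    m = a > 0
    return np.sum(a[m] * np.log(a[m] / b[m]))

p = np.array([0.7, 0.2, 0.1])  # stand-in for the true posterior P
q = np.array([0.5, 0.3, 0.2])  # stand-in for the variational distribution Q

print(kl(q, p))  # reversed KL, D_KL(Q || P): the quantity minimized in mean-field variational Bayes
print(kl(p, q))  # forward KL, D_KL(P || Q): the direction associated with expectation propagation
```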
Using $P(\mathbf{Z} \mid \mathbf{X}) = P(\mathbf{Z}, \mathbf{X}) / P(\mathbf{X})$, the KL-divergence can be written as

$$D_{\mathrm{KL}}(Q \parallel P) = \sum_{\mathbf{Z}} Q(\mathbf{Z}) \log \frac{Q(\mathbf{Z})}{P(\mathbf{Z}, \mathbf{X})} + \log P(\mathbf{X}),$$

or, rearranging,

$$\log P(\mathbf{X}) = D_{\mathrm{KL}}(Q \parallel P) + \mathcal{L}(Q), \qquad \text{where } \mathcal{L}(Q) = \sum_{\mathbf{Z}} Q(\mathbf{Z}) \log \frac{P(\mathbf{Z}, \mathbf{X})}{Q(\mathbf{Z})}.$$
As the log evidence $\log P(\mathbf{X})$ is fixed with respect to $Q$, maximizing the final term $\mathcal{L}(Q)$ minimizes the KL-divergence of $Q$ from $P$. By appropriate choice of $Q$, $\mathcal{L}(Q)$ becomes tractable to compute and to maximize. Hence we have both an analytical approximation $Q(\mathbf{Z})$ for the posterior $P(\mathbf{Z} \mid \mathbf{X})$ and a lower bound $\mathcal{L}(Q)$ for the log evidence $\log P(\mathbf{X})$ (a lower bound because the KL-divergence is non-negative). The lower bound $\mathcal{L}(Q)$ is known as the (negative) variational free energy, because it can also be expressed as an "energy" $\operatorname{E}_{Q}[\log P(\mathbf{Z}, \mathbf{X})]$ plus the entropy of $Q$.
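The decomposition above can be checked numerically on a toy discrete model. The sketch below (illustrative only; the joint table and the chosen $Q$ are arbitrary) builds a joint $P(\mathbf{Z}, \mathbf{X})$, fixes an observation, and confirms both that the ELBO plus the KL term recovers the log evidence and that the ELBO equals the "energy" term plus the entropy of $Q$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy joint distribution P(Z, X) over 4 latent states and 3 observable values.
joint = rng.random((4, 3))
joint /= joint.sum()

x = 1                                   # a fixed observation X = x
p_zx = joint[:, x]                      # unnormalized column P(Z, X = x)
evidence = p_zx.sum()                   # P(X = x)
posterior = p_zx / evidence             # P(Z | X = x)

# An arbitrary (non-optimal) variational distribution Q(Z).
q = np.array([0.4, 0.3, 0.2, 0.1])

kl_term = np.sum(q * np.log(q / posterior))     # D_KL(Q || P(Z | X))
elbo    = np.sum(q * np.log(p_zx / q))          # L(Q) = E_Q[log P(Z, X)] - E_Q[log Q]

print(np.log(evidence), kl_term + elbo)         # the two numbers agree: log P(X) = KL + L(Q)

# The ELBO also decomposes as an "energy" term plus the entropy of Q.
energy  = np.sum(q * np.log(p_zx))              # E_Q[log P(Z, X)]
entropy = -np.sum(q * np.log(q))                # H(Q)
print(elbo, energy + entropy)                   # these agree as well
```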