UBC Theses and Dissertations
On the computational asymptotics of Gaussian variational inference
Xu, Zuheng
Variational inference is a popular alternative to Markov chain Monte Carlo methods that constructs a Bayesian posterior approximation by minimizing a discrepancy to the true posterior within a pre-specified family. This converts Bayesian inference into an optimization problem, enabling the use of simple and scalable stochastic optimization algorithms. However, a key limitation of variational inference is that the optimal approximation is typically not tractable to compute; even in simple settings the problem is nonconvex. Thus, recently developed statistical guarantees—which all involve the (data) asymptotic properties of the optimal variational distribution—are not reliably obtained in practice. In this work, we provide two major contributions: a theoretical analysis of the asymptotic convexity properties of variational inference in the popular setting with a Gaussian family; and consistent stochastic variational inference (CSVI), an algorithm that exploits these properties to find the optimal approximation in the asymptotic regime. CSVI consists of a tractable initialization procedure that finds the local basin of the optimal solution, and a scaled gradient descent algorithm that stays locally confined to that basin. Experiments on nonconvex synthetic examples show that compared with standard stochastic gradient descent, CSVI improves the likelihood of obtaining the globally optimal posterior approximation.
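To make the optimization problem concrete, the following sketch fits a one-dimensional Gaussian approximation q = N(m, s²) to a target density by stochastic gradient descent on the KL divergence, using the reparameterization trick. This is a minimal illustration of standard stochastic variational inference, not the CSVI algorithm of the thesis; the function name `fit_gaussian_vi` and the N(2, 1) target are illustrative assumptions.

```python
import numpy as np

def fit_gaussian_vi(logp_grad, steps=5000, lr=0.05, seed=0):
    """Fit q = N(m, s^2) to a 1-D target by stochastic gradient descent
    on KL(q || p), using the reparameterization x = m + s * eps."""
    rng = np.random.default_rng(seed)
    m, log_s = 0.0, 0.0  # optimize log s so the scale stays positive
    for _ in range(steps):
        eps = rng.standard_normal()
        s = np.exp(log_s)
        x = m + s * eps
        g = logp_grad(x)  # d/dx log p(x), evaluated at the sample
        # One-sample stochastic gradients of KL = E_q[-log p] - H(q):
        m -= lr * (-g)                       # d/dm of -log p(m + s*eps)
        log_s -= lr * (-g * s * eps - 1.0)   # chain rule term, minus dH/d(log s) = 1
    return m, np.exp(log_s)

# Illustrative target: p = N(2, 1), so d/dx log p(x) = -(x - 2).
# The optimal Gaussian approximation is then p itself: m -> 2, s -> 1.
m, s = fit_gaussian_vi(lambda x: -(x - 2.0))
```

With a unimodal Gaussian target this objective is convex and plain SGD suffices; the nonconvexity discussed in the abstract arises for multimodal or otherwise irregular posteriors, where the iterates can settle in a suboptimal basin depending on initialization.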
Attribution-NonCommercial-NoDerivatives 4.0 International