UBC Theses and Dissertations

Practical optimization methods for machine learning models
Babanezhad Harikandeh, Reza

Abstract

This work considers optimization methods for large-scale machine learning (ML). Optimization is a crucial ingredient in the training stage of ML models, and methods in this setting need to have a cheap iteration cost. First-order methods are known to have reasonably low iteration costs. A notable recent class of stochastic first-order methods leverages variance reduction techniques to improve convergence speed. This class includes the stochastic average gradient (SAG), stochastic variance reduced gradient (SVRG), and stochastic average gradient amélioré (SAGA) methods. SAG and SAGA achieve variance reduction by using additional memory, whereas SVRG needs no additional memory but requires occasional full-gradient evaluations. We first introduce variants of SVRG that require fewer gradient evaluations. We then present the first linearly convergent stochastic gradient method for training conditional random fields (CRFs), based on SAG; our method addresses the memory requirements of SAG and proposes a better non-uniform sampling (NUS) technique. The third part of this work extends the applicability of SAGA to Riemannian manifolds, modifying SAGA with operations available on the manifold to improve its convergence speed in these new spaces. Finally, we consider the convergence of classic stochastic gradient methods based on mirror descent (MD) in the non-convex setting; we analyse MD with a more general divergence function and show its application to variational inference models.
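To make the variance-reduction idea concrete, the following is a minimal Python sketch of the basic SVRG update for minimizing a finite sum (1/n) * sum_i f_i(x). It illustrates the standard algorithm rather than the variants developed in this thesis, and the names (grad_i, step_size, inner_steps) are illustrative assumptions.

    import numpy as np

    def svrg(grad_i, x0, n, step_size=0.1, epochs=10, inner_steps=None, seed=0):
        # Minimal SVRG sketch for minimizing (1/n) * sum_i f_i(x).
        # grad_i(x, i) must return the gradient of the i-th component f_i at x.
        rng = np.random.default_rng(seed)
        m = n if inner_steps is None else inner_steps
        x = np.asarray(x0, dtype=float)
        for _ in range(epochs):
            # Occasional full-gradient evaluation at a reference point: the
            # extra cost SVRG pays instead of storing per-example gradients
            # the way SAG and SAGA do.
            x_ref = x.copy()
            full_grad = np.mean([grad_i(x_ref, i) for i in range(n)], axis=0)
            for _ in range(m):
                i = rng.integers(n)
                # Variance-reduced stochastic gradient: the correction term
                # keeps the update unbiased while shrinking its variance as
                # x approaches x_ref.
                g = grad_i(x, i) - grad_i(x_ref, i) + full_grad
                x = x - step_size * g
        return x

    # Example usage on a synthetic least-squares problem.
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
    grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
    x_hat = svrg(grad_i, x0=np.zeros(5), n=100, step_size=0.01, epochs=20)
    print(x_hat)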

Rights

Attribution-NoDerivatives 4.0 International