UBC Theses and Dissertations
Practical optimization methods for machine learning models
Babanezhad Harikandeh, Reza
Abstract
This work considers optimization methods for large-scale machine learning (ML). Optimization is a crucial ingredient in the training stage of ML models, and methods in this setting need to have a cheap iteration cost. First-order methods are known to have reasonably low iteration costs. A notable recent class of stochastic first-order methods leverages variance-reduction techniques to improve convergence speed. This group includes stochastic average gradient (SAG), stochastic variance-reduced gradient (SVRG), and stochastic average gradient amélioré (SAGA). The SAG and SAGA approaches to variance reduction use additional memory in their algorithms. SVRG, on the other hand, does not need additional memory but requires occasional full-gradient evaluations. We first introduce variants of SVRG that require fewer gradient evaluations. We then present the first linearly convergent stochastic gradient method for training conditional random fields (CRFs), based on SAG. Our method addresses the memory requirements of SAG and proposes a better non-uniform sampling (NUS) technique. The third part of this work extends the applicability of SAGA to Riemannian manifolds; we modify SAGA with operations native to the manifold to improve its convergence speed in these new spaces. Finally, we consider the convergence of classic stochastic gradient methods, based on mirror descent (MD), in the non-convex setting. We analyse MD with a more general divergence function and show its application to variational inference models.
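The trade-off the abstract describes (SVRG avoids SAG/SAGA's per-example gradient memory by recomputing a full gradient at an occasional "snapshot" point) can be illustrated with a minimal sketch of standard SVRG on a synthetic least-squares problem. This is not the thesis's reduced-evaluation variant; the problem, function names, and step-size choices below are all illustrative assumptions.

```python
import numpy as np

# Synthetic least-squares problem: f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def grad_i(w, i):
    """Gradient of the i-th component function (cheap, per-example)."""
    return (X[i] @ w - y[i]) * X[i]

def full_grad(w):
    """Full gradient over all n examples: the occasional expensive pass."""
    return X.T @ (X @ w - y) / n

def svrg(w, step=0.05, epochs=50, m=None):
    m = m or n  # inner-loop length; m = n is a common default
    for _ in range(epochs):
        w_snap = w.copy()        # snapshot point (no per-example table needed)
        mu = full_grad(w_snap)   # full gradient at the snapshot
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient: unbiased, and its
            # variance shrinks as w and w_snap approach the optimum.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - step * g
    return w

w_hat = svrg(np.zeros(d))
print(np.linalg.norm(w_hat - w_true))  # distance to the true solution
```

By contrast, SAG and SAGA would store the most recent gradient of each of the n examples (an n-by-d table here) and never need the full-gradient pass, which is exactly the memory/recomputation trade-off the abstract contrasts.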
Item Metadata
Title | Practical optimization methods for machine learning models
Creator | Babanezhad Harikandeh, Reza
Publisher | University of British Columbia
Date Issued | 2019
Language | eng
Date Available | 2019-12-18
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NoDerivatives 4.0 International
DOI | 10.14288/1.0387209
Degree Grantor | University of British Columbia
Graduation Date | 2020-05
Scholarly Level | Graduate
Aggregated Source Repository | DSpace