Stochastic second-order optimization for over-parameterized machine learning models
Meng, Si Yi
Abstract
We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition, which can be satisfied by over-parameterized machine learning models. Under this condition, we show that the regularized subsampled Newton’s method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyze stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic limited-memory BFGS (L-BFGS) and a “Hessian-free” implementation of R-SSN for binary classification on synthetic, linearly-separable datasets and real datasets under a kernel mapping. Our experimental results demonstrate the fast convergence of these methods, both in terms of the number of iterations and wall-clock time.
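For readers unfamiliar with R-SSN, the sketch below illustrates the basic step the thesis analyzes: draw a mini-batch, form the subsampled gradient and Hessian, add a regularization term, and solve the resulting linear system for the update direction. This is a minimal sketch for logistic regression with labels in {-1, +1}; the batch size, regularization constant, and fixed step size are illustrative assumptions only, since the thesis itself works with an adaptive step-size and growing batch sizes.

```python
# Minimal, illustrative sketch of one regularized subsampled Newton (R-SSN)
# step for logistic regression. The batch size, regularization value, and
# fixed step size are assumptions for illustration, not the thesis's exact
# algorithm (which uses an adaptive step-size).
import numpy as np

def rssn_step(w, X, y, batch_size=64, reg=1e-3, step_size=1.0, rng=None):
    """One R-SSN step: subsample gradient and Hessian, regularize, solve, update."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(y), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]                      # mini-batch (labels in {-1, +1})

    margins = yb * (Xb @ w)
    sigma = 1.0 / (1.0 + np.exp(margins))        # sigmoid(-y_i * x_i^T w)
    grad = -(Xb.T @ (yb * sigma)) / batch_size   # subsampled gradient

    d = sigma * (1.0 - sigma)                    # per-example Hessian weights
    H = (Xb.T * d) @ Xb / batch_size             # subsampled Hessian
    H += reg * np.eye(len(w))                    # regularization keeps H invertible

    direction = np.linalg.solve(H, grad)         # regularized Newton direction
    return w - step_size * direction
```

A "Hessian-free" variant, as evaluated in the thesis's experiments, would avoid forming the Hessian explicitly and instead solve the regularized system with conjugate gradient using Hessian-vector products.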
Item Metadata

Title | Stochastic second-order optimization for over-parameterized machine learning models
Creator | Meng, Si Yi
Publisher | University of British Columbia
Date Issued | 2020
Description | Same as the abstract above.
Genre |
Type |
Language | eng
Date Available | 2020-08-31
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0394117
URI |
Degree |
Program |
Affiliation |
Degree Grantor | University of British Columbia
Graduation Date | 2020-11
Campus |
Scholarly Level | Graduate
Rights URI |
Aggregated Source Repository | DSpace