Open Collections
UBC Theses and Dissertations
Faster convolutional neural network training via approximate memoization
Sandhupatla, Amruth
Abstract
Deep Convolutional Neural Networks have found wide application, but their training time can be significant. We find that between successive epochs during training, many neurons compute nearly the same output when presented with the same input. This presents an opportunity to skip computation in the forward pass on the later epoch via memoization. This dissertation explores the potential of such an approach by investigating the correlation of neuron activations between training epochs. We develop an implementation of activation memoization, STAN, that takes into account the lockstep behavior of threads executing together on single-instruction, multiple-thread Graphics Processing Units (GPUs). Finally, we discuss the trade-off between speedup and accuracy by showing that STAN achieves a 1.3-4X convolution speedup over our baseline GPU implementation at the expense of a 2.7-7% loss in accuracy. When STAN is applied to Hyperband, which can be robust to drops in accuracy, an overall training speedup of 7%-16% can be achieved with a minimal change in test error (-0.2 to +0.2).
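The core idea in the abstract — reusing a neuron's activation from the previous epoch when it is unlikely to have changed much — can be illustrated with a minimal sketch. All names and thresholds below (`MemoizedLayer`, `weight_tol`, the weight-norm drift test) are illustrative assumptions, not the thesis's actual STAN mechanism or its GPU implementation:

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


class MemoizedLayer:
    """Toy fully-connected layer with cross-epoch activation memoization.

    If an activation for a given training sample was cached in a previous
    epoch and the layer's weights have drifted less than `weight_tol`
    since then (measured crudely via the Frobenius norm), the forward
    pass is skipped and the cached, approximate activation is reused.
    """

    def __init__(self, in_dim, out_dim, weight_tol=1e-2):
        self.W = rng.standard_normal((out_dim, in_dim)) * 0.1
        self.weight_tol = weight_tol
        self.cache = {}   # sample_id -> (weight norm at cache time, activation)
        self.computed = 0
        self.skipped = 0

    def forward(self, sample_id, x):
        w_norm = np.linalg.norm(self.W)
        if sample_id in self.cache:
            cached_norm, cached_act = self.cache[sample_id]
            if abs(w_norm - cached_norm) <= self.weight_tol:
                # Weights barely moved since last epoch: reuse the old output.
                self.skipped += 1
                return cached_act
        # Weights changed too much (or first visit): compute and re-cache.
        act = relu(self.W @ x)
        self.cache[sample_id] = (w_norm, act)
        self.computed += 1
        return act
```

Running two "epochs" over the same sample with a tiny weight update skips the second forward pass, while a large update forces recomputation; the accuracy cost comes from the reused activation being slightly stale, mirroring the speedup/accuracy trade-off the abstract reports.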
Item Metadata

Title: Faster convolutional neural network training via approximate memoization
Creator: Sandhupatla, Amruth
Publisher: University of British Columbia
Date Issued: 2018
Language: eng
Date Available: 2018-10-18
Provider: Vancouver : University of British Columbia Library
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
DOI: 10.14288/1.0372887
Degree Grantor: University of British Columbia
Graduation Date: 2018-11
Scholarly Level: Graduate
Aggregated Source Repository: DSpace