Open Collections

UBC Theses and Dissertations

Faster convolutional neural network training via approximate memoization

Sandhupatla, Amruth

Abstract

Deep Convolutional Neural Networks have found wide application, but their training time can be significant. We find that between successive epochs during training, many neurons compute nearly the same output when presented with the same input. This presents an opportunity to skip computation in the forward pass of the later epoch via memoization. This dissertation explores the potential of such an approach by investigating the correlation of neuron activations between training epochs. We develop an implementation of activation memoization that takes into account the lockstep behavior of threads executing together in single-instruction, multiple-thread Graphics Processing Units (GPUs). Finally, we discuss the trade-off between speedup and accuracy by showing that STAN achieves a 1.3-4x convolution speedup over our baseline GPU implementation at the expense of a 2.7-7% loss in accuracy. When STAN is applied to Hyperband, which can be robust to a drop in accuracy, an overall training speedup of 7%-16% can be achieved with a minimal change in test error (-0.2 to +0.2).
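The idea in the abstract can be illustrated with a small sketch. This is not the thesis implementation: it is a toy in plain Python, using a fully connected layer rather than a convolution, and every name in it (`MemoizedLayer`, `WARP_SIZE`, `threshold`, the stability rule) is a hypothetical chosen for illustration. It assumes that a neuron is treated as "stable" for a given input when its output changed by at most `threshold` between the last two times it was computed, and that a group of neurons standing in for a GPU warp may skip work only when every neuron in the group is stable, since threads executing in lockstep gain nothing if even one of them must still compute.

```python
WARP_SIZE = 2  # toy group size (assumption); real GPU warps are typically 32 threads


def relu_dot(weights, inputs):
    """One neuron's activation: ReLU of the weighted sum of its inputs."""
    return max(0.0, sum(w * x for w, x in zip(weights, inputs)))


class MemoizedLayer:
    """Toy layer that reuses cached activations from an earlier epoch."""

    def __init__(self, weights, threshold):
        self.weights = weights      # one weight vector per neuron
        self.threshold = threshold  # hypothetical stability tolerance
        self.cache = {}             # sample_id -> activations from the last visit
        self.stable = {}            # sample_id -> per-neuron stability flags
        self.computed = 0           # neuron evaluations actually performed
        self.skipped = 0            # neuron evaluations replaced by a cache hit

    def forward(self, sample_id, inputs):
        n = len(self.weights)
        prev = self.cache.get(sample_id)
        flags = self.stable.get(sample_id, [False] * n)
        out = [0.0] * n
        new_flags = list(flags)
        for start in range(0, n, WARP_SIZE):
            group = range(start, min(start + WARP_SIZE, n))
            if prev is not None and all(flags[j] for j in group):
                # Whole group is stable: reuse last epoch's outputs.
                for j in group:
                    out[j] = prev[j]
                self.skipped += len(group)
            else:
                # At least one neuron is not stable: the whole group computes.
                for j in group:
                    out[j] = relu_dot(self.weights[j], inputs)
                    new_flags[j] = (
                        prev is not None
                        and abs(out[j] - prev[j]) <= self.threshold
                    )
                self.computed += len(group)
        self.cache[sample_id] = out
        self.stable[sample_id] = new_flags
        return out


def demo():
    layer = MemoizedLayer(
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]], threshold=0.05
    )
    x = [1.0, 2.0]
    layer.forward(0, x)            # epoch 1: nothing cached, compute everything
    layer.weights[0] = [1.5, 0.0]  # pretend a training step changed neuron 0
    layer.forward(0, x)            # epoch 2: compute everything, record stability
    out = layer.forward(0, x)      # epoch 3: only the all-stable group is skipped
    return out, layer.computed, layer.skipped
```

In `demo()`, neurons 2 and 3 are skipped in the third epoch, while neuron 1 is recomputed even though its own output did not change, because it shares a group with the unstable neuron 0. The sketch never re-validates a skipped group, so a practical scheme would need some refresh policy; that omission is part of why memoization of this kind trades accuracy for speed.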

Item Metadata

Title	Faster convolutional neural network training via approximate memoization
Creator	Sandhupatla, Amruth
Publisher	University of British Columbia
Date Issued	2018
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2018-10-18
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0372887
URI	http://hdl.handle.net/2429/67615
Degree	Master of Applied Science - MASc
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2018-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
