Predicting parameters in deep learning

UBC Theses and Dissertations

Predicting parameters in deep learning Shakibi, Babak

Abstract

The recent success of large, deep neural network models has motivated the training of even larger and deeper networks with millions of parameters. Training these models usually requires parallel training methods, in which communicating a large number of parameters becomes one of the main bottlenecks. We show that many deep learning models are over-parameterized and that their learned features can be predicted given only a small fraction of their parameters. We then propose a method that exploits this fact during training to reduce the number of parameters that need to be learned. Our method is orthogonal to the choice of network architecture and can be applied to a wide variety of neural network architectures and application areas. We evaluate this technique in experiments on image and speech recognition and show that we can learn only a fraction of the parameters (up to 10% in some cases) and predict the rest without a significant loss in the predictive accuracy of the model.
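
The abstract does not spell out how the remaining parameters are predicted, so the following is only a toy sketch of the over-parameterization claim, not the thesis's method. It assumes, purely for illustration, that the redundancy in a weight matrix shows up as low-rank structure; the names `factorized_param_count` and `low_rank_approx` and all sizes are hypothetical. Under that assumption, a 200 x 100 weight matrix can be recovered from two small factors holding 7.5% as many numbers.

```python
import numpy as np


def factorized_param_count(n_in, n_out, rank):
    """Number of values stored by a rank-`rank` factorization W ~ U @ V."""
    return rank * (n_in + n_out)


def low_rank_approx(W, rank):
    """Best rank-`rank` approximation of W (in Frobenius norm) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


# Assumption: a synthetic weight matrix that is exactly rank 5.
# Real trained weights are at best approximately low rank.
rng = np.random.default_rng(0)
n_in, n_out, true_rank = 200, 100, 5
W = rng.standard_normal((n_in, true_rank)) @ rng.standard_normal((true_rank, n_out))

# Rebuild the full matrix from the small factors.
W_hat = low_rank_approx(W, true_rank)

# Fraction of the original parameter count held by the factors.
fraction = factorized_param_count(n_in, n_out, true_rank) / W.size

# Relative reconstruction error in Frobenius norm.
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)

print(f"fraction of parameters stored: {fraction:.3f}")
print(f"relative reconstruction error: {rel_err:.2e}")
```

Because the synthetic matrix is constructed to be exactly rank 5, the reconstruction error here is at the level of floating-point round-off. With real network weights the error would depend on how much structure the weights actually have, which is the empirical question the thesis evaluates.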

Item Metadata

Title	Predicting parameters in deep learning
Creator	Shakibi, Babak
Publisher	University of British Columbia
Date Issued	2014
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2014-11-07
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI	10.14288/1.0165555
URI	http://hdl.handle.net/2429/50999
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2014-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/2.5/ca/
Aggregated Source Repository	DSpace
