Predicting parameters in deep learning
Shakibi, Babak
Abstract
The recent success of large and deep neural network models has motivated the training of even larger and deeper networks with millions of parameters. Training these models usually requires parallel training methods, in which communicating the large number of parameters becomes one of the main bottlenecks. We show that many deep learning models are over-parameterized: their learned features can be predicted given only a small fraction of their parameters. We then propose a method that exploits this fact during training to reduce the number of parameters that must be learned. Our method is orthogonal to the choice of network architecture and can be applied across a wide variety of neural network architectures and application areas. We evaluate the technique in a range of image and speech recognition experiments and show that we can learn only a fraction of the parameters (as few as 10% in some cases) and predict the rest without a significant loss in the predictive accuracy of the model.
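The abstract's core idea, learning a small subset of the weights and predicting the rest from the structure of the learned features, can be made concrete with a small sketch. The following is a minimal illustration, not the thesis's actual code: it assumes the predictor is a kernel ridge regressor with an RBF smoothness prior over filter pixel positions, and the function names, the kernel width `gamma`, and the toy filter are all illustrative choices.

```python
# Minimal sketch (illustrative, not the thesis code): treat a learned feature
# (e.g., a 2-D convolutional filter) as a smooth function of pixel position,
# train only a few of its entries, and predict the rest with kernel ridge
# regression under an RBF smoothness prior.
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """RBF kernel between two sets of 2-D pixel coordinates."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict_filter(coords_all, idx_learned, w_learned, gamma=0.1, ridge=1e-6):
    """Predict every weight from the learned subset.

    coords_all  : (n, 2) pixel coordinates of all n weights
    idx_learned : indices of the weights that were actually trained
    w_learned   : (k,) trained values at those indices
    """
    C_sub = coords_all[idx_learned]
    K_ss = rbf_kernel(C_sub, C_sub, gamma) + ridge * np.eye(len(idx_learned))
    K_as = rbf_kernel(coords_all, C_sub, gamma)
    # Kernel ridge regression: w_hat = K_as @ K_ss^{-1} @ w_learned
    return K_as @ np.linalg.solve(K_ss, w_learned)

# Toy demo: an 8x8 "filter" that is a smooth bump; train only 25% of its entries.
n = 8
coords = np.stack(np.meshgrid(np.arange(n), np.arange(n)), -1).reshape(-1, 2).astype(float)
true_w = np.exp(-((coords - 3.5) ** 2).sum(-1) / 8.0)  # smooth ground-truth filter

rng = np.random.default_rng(0)
idx = rng.choice(n * n, size=(n * n) // 4, replace=False)  # learn only 25%
w_hat = predict_filter(coords, idx, true_w[idx])

print("relative error:", np.linalg.norm(w_hat - true_w) / np.linalg.norm(true_w))
```

Because learned filters tend to be spatially smooth, a handful of anchor weights pins down the rest of the filter; the same redundancy is what makes it possible to train only a fraction of the full parameter set.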
Item Metadata

Title | Predicting parameters in deep learning
Creator | Shakibi, Babak
Publisher | University of British Columbia
Date Issued | 2014
Description | Identical to the Abstract above.
Language | eng
Date Available | 2014-11-07
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI | 10.14288/1.0165555
Degree Grantor | University of British Columbia
Graduation Date | 2014-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace