Predicting Parameters in Deep Learning
We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all.

Main Authors: | Denil, M; Shakibi, B; Dinh, L; Ranzato, M; de Freitas, N |
---|---|
Format: | Conference item |
Published: | 2013 |
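The abstract sketches the core idea: learn only a few weight values per feature and predict the rest by exploiting structure (e.g. spatial smoothness) in the learned filters. As a rough, hypothetical illustration of that idea rather than the authors' published code, the sketch below fills in a full weight column from a handful of learned "anchor" entries using kernel ridge regression with an RBF kernel over input-pixel coordinates; the kernel choice, length scale, and ridge term are assumptions made for this sketch.

```python
import numpy as np

def rbf_kernel(coords, lengthscale=2.0):
    # Squared-exponential kernel over input coordinates (e.g., 2-D pixel positions).
    sq_dists = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def predict_column(w_anchor, anchor_idx, coords, lengthscale=2.0, ridge=1e-3):
    """Fill in a full weight column from a few learned 'anchor' entries via
    kernel ridge regression -- a sketch of 'learn a few weights, predict the rest'."""
    K = rbf_kernel(coords, lengthscale)            # n x n kernel over all input positions
    K_aa = K[np.ix_(anchor_idx, anchor_idx)]       # kernel among anchor positions
    K_na = K[:, anchor_idx]                        # kernel from every position to the anchors
    coef = np.linalg.solve(K_aa + ridge * np.eye(len(anchor_idx)), w_anchor)
    return K_na @ coef                             # predicted full weight column

# Toy check: recover a smooth 8x8 "filter" from 16 of its 64 entries.
rng = np.random.default_rng(0)
side = 8
coords = np.stack(np.meshgrid(np.arange(side), np.arange(side)), axis=-1)
coords = coords.reshape(-1, 2).astype(float)
true_w = np.exp(-((coords - 3.5) ** 2).sum(-1) / 6.0)   # smooth spatial weight column
anchor_idx = rng.choice(side * side, size=16, replace=False)
w_hat = predict_column(true_w[anchor_idx], anchor_idx, coords)
print("relative error:", np.linalg.norm(w_hat - true_w) / np.linalg.norm(true_w))
```

The toy run recovers a smooth 8x8 filter from a quarter of its entries with small error; the smoothness assumption over nearby input positions is what makes the remaining weights predictable, which is the intuition the abstract appeals to.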
Field | Value |
---|---|
author | Denil, M; Shakibi, B; Dinh, L; Ranzato, M; de Freitas, N |
collection | OXFORD |
description | We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy. |
format | Conference item |
id | oxford-uuid:24eb6b3a-d833-4f13-93fb-12277843891b |
institution | University of Oxford |
publishDate | 2013 |
record_format | dspace |
title | Predicting Parameters in Deep Learning |