On Neural Network Pruning’s Effect on Generalization
Practitioners frequently observe that pruning improves model generalization. A longstanding hypothesis attributes such improvement to model size reduction. However, recent studies on over-parameterization characterize a new model size regime, in which larger models achieve better generalization. A contradiction arises when pruning is applied to over-parameterized models: while theory predicts that reducing size harms generalization, pruning nonetheless improves it. Motivated by this contradiction, I re-examine pruning’s effect on generalization empirically.

I demonstrate that pruning’s generalization-improving effect cannot be fully accounted for by weight removal. Instead, I find that pruning can lead to better training, improving model training loss. I find that pruning can also lead to stronger regularization, mitigating the harmful effect of noisy examples. Pruning extends model training time and reduces model size, which improve training and strengthen regularization, respectively. I empirically demonstrate that both factors are essential to fully explaining pruning’s benefits to generalization.
Main Author: | Jin, Tian |
---|---|
Other Authors: | Carbin, Michael |
Department: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Thesis (S.M.) |
Published: | Massachusetts Institute of Technology, 2023 |
Online Access: | https://hdl.handle.net/1721.1/147496 |
Rights: | In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/) |
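The abstract refers to neural network pruning without specifying a particular procedure. For concreteness, the sketch below shows global magnitude pruning, one common form of pruning in this literature, written in PyTorch. The function name `magnitude_prune`, the model, and the 80% sparsity level are illustrative assumptions, not details taken from the thesis.

```python
# Minimal sketch of global magnitude pruning (a common form of pruning;
# the thesis's exact procedure may differ).
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float) -> None:
    """Zero out the smallest-magnitude weights across all Linear/Conv2d layers."""
    weights = [m.weight for m in model.modules()
               if isinstance(m, (nn.Linear, nn.Conv2d))]
    # Find the global magnitude threshold below which weights are removed.
    all_weights = torch.cat([w.detach().abs().flatten() for w in weights])
    k = int(sparsity * all_weights.numel())
    if k == 0:
        return
    threshold = torch.kthvalue(all_weights, k).values
    # Apply the mask in place; pruned weights are set to zero.
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).float())

# Example: remove 80% of the weights of a small MLP (hypothetical model).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, sparsity=0.8)
```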