On Neural Network Pruning’s Effect on Generalization

Practitioners frequently observe that pruning improves model generalization. A longstanding hypothesis attributes this improvement to model size reduction. However, recent studies on over-parameterization characterize a new model size regime in which larger models achieve better generalization. A contradiction arises when pruning is applied to over-parameterized models: while theory predicts that reducing size harms generalization, pruning nonetheless improves it. Motivated by this contradiction, I re-examine pruning’s effect on generalization empirically. I demonstrate that pruning’s generalization-improving effect cannot be fully accounted for by weight removal. Instead, I find that pruning can lead to better training, improving model training loss, and to stronger regularization, mitigating the harmful effect of noisy examples. Pruning extends model training time and reduces model size; these two effects improve training and strengthen regularization, respectively. I demonstrate empirically that both factors are essential to fully explaining pruning’s benefit to generalization.
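For readers unfamiliar with the operation the abstract refers to, the sketch below shows one common form of pruning: global magnitude pruning interleaved with continued training, written in PyTorch. It is only an illustration under assumed settings (a toy model, synthetic data, a fixed 20% per-round pruning rate), not the pruning method or experimental setup used in the thesis; it simply makes concrete the two factors the abstract highlights, namely that pruning reduces model size and adds further training after each pruning step.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-ins for a real model and dataset (illustrative assumptions).
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(256, 32)
targets = torch.randint(0, 10, (256,))

def train(epochs):
    # Plain full-batch training loop; returns the final training loss.
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return loss.item()

# Parameters eligible for pruning: the weights of every Linear layer.
prunable = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]

train(epochs=20)  # dense pre-training
for round_idx in range(3):  # assumed number of prune/retrain rounds
    # Remove 20% of the remaining weights with the smallest magnitudes, chosen globally.
    prune.global_unstructured(prunable, pruning_method=prune.L1Unstructured, amount=0.2)
    final_loss = train(epochs=20)  # additional training after each pruning step
    zeros = sum((m.weight == 0).sum().item() for m, _ in prunable)
    total = sum(m.weight.numel() for m, _ in prunable)
    print(f"round {round_idx}: sparsity={zeros / total:.2f}, train loss={final_loss:.3f}")

A natural way to use such a sketch is as a baseline for the kind of controlled comparison the abstract describes: comparing the pruned model against a dense model trained for the same total number of steps separates the effect of extra training from the effect of size reduction.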

Bibliographic Details
Main Author: Jin, Tian
Other Authors: Carbin, Michael
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: S.M.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Rights: In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/
Online Access: https://hdl.handle.net/1721.1/147496