On Generalization Bounds for Neural Networks with Low Rank Layers
Main Authors: Pinto, Andrea; Rangamani, Akshay; Poggio, Tomaso
Format: Article (Technical Report / Working Paper)
Published: Center for Brains, Minds and Machines (CBMM), 2024
Series: CBMM Memo; 151
Subjects: Gaussian Complexity; Generalization Bounds; Low Rank Layers; Neural Collapse
Institution: Massachusetts Institute of Technology
Online Access: https://hdl.handle.net/1721.1/157263
Abstract: While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain under-explored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep nets exhibiting neural collapse.
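As a reading aid, the display below sketches the kind of comparison the abstract describes. It is an illustrative shape only, not a theorem quoted from the memo; all symbols are assumptions introduced here: n is the sample size, L the depth, d the layer width, and s_i and r_i the spectral norm and rank of layer i.

% Illustrative only (assumed notation, not the memo's stated result):
% a generic spectral-norm-based bound whose per-layer factors depend on
% the full width d,
\[
  \widehat{\mathfrak{G}}_n(\mathcal{F}_{\mathrm{full}})
    \;\lesssim\; \frac{1}{\sqrt{n}} \prod_{i=1}^{L} s_i \cdot \operatorname{poly}(L, d),
\]
% versus a rank-constrained counterpart in which each layer has rank r_i <= d,
% so that width-dependent factors are replaced by the (smaller) ranks:
\[
  \widehat{\mathfrak{G}}_n(\mathcal{F}_{\mathrm{low\text{-}rank}})
    \;\lesssim\; \frac{1}{\sqrt{n}} \prod_{i=1}^{L} s_i \cdot \operatorname{poly}\!\Big(L, \max_i r_i\Big).
\]

When r_i is much smaller than d, the second expression avoids the width-driven growth across layers, which is the mechanism the abstract attributes to low-rank layers.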
Funding: This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.