On Generalization Bounds for Neural Networks with Low Rank Layers

While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain under-explored. In this paper, we apply a chain rule for Gaussian complexity (Maurer, 2016a) to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep nets exhibiting neural collapse.
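For orientation, a standard uniform-convergence template (Bartlett and Mendelson, 2002) shows why controlling Gaussian complexity matters; this is the generic shape of such results, not the memo's specific theorem. With probability at least 1 - δ over n i.i.d. samples, for losses ℓ_f bounded in [0, 1],

\[
\mathbb{E}\,\ell_f \;\le\; \widehat{\mathbb{E}}_n\,\ell_f \;+\; 2\,\mathcal{R}_n(\ell \circ \mathcal{F}) \;+\; \sqrt{\frac{\log(1/\delta)}{2n}},
\qquad
\mathcal{R}_n(\mathcal{F}) \;\le\; \sqrt{\frac{\pi}{2}}\,\mathcal{G}_n(\mathcal{F}),
\]

where \(\mathcal{R}_n\) and \(\mathcal{G}_n\) are the Rademacher and Gaussian complexities of the function class \(\mathcal{F}\). Any chain rule that keeps \(\mathcal{G}_n(\mathcal{F})\) from accumulating multiplicative rank and dimension factors across layers therefore tightens the right-hand side directly.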

Bibliographic Details
Main Authors: Pinto, Andrea, Rangamani, Akshay, Poggio, Tomaso
Format: Article
Published: Center for Brains, Minds and Machines (CBMM) 2024
Subjects: Gaussian Complexity, Generalization Bounds, Low Rank Layers, Neural Collapse
Online Access: https://hdl.handle.net/1721.1/157263
Institution: Massachusetts Institute of Technology
Series: CBMM Memo; 151
Funding: This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.