On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

The learning process of a neural network, together with its connections to fitting, compression, and generalization, is still not well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, replacing mutual information measures (which are notoriously difficult to estimate in high-dimensional spaces) with more tractable ones: (1) the minimum mean-squared error (MMSE) associated with reconstructing the network input from some intermediate network representation, and (2) the cross-entropy associated with a class label given some network representation. We then conduct an empirical study to ascertain how different network models, learning algorithms, and datasets affect the learning dynamics. Our experiments show that the proposed approach is more reliable than classical information-bottleneck ones in capturing network dynamics during both the training and testing phases. They also reveal that the fitting and compression phases exist regardless of the choice of activation function. Finally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases.
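To make the two surrogate measures concrete, the following is a minimal PyTorch sketch of how they could be estimated with auxiliary probe networks; it is an illustration under stated assumptions, not the authors' implementation. It assumes a frozen `encoder` mapping (flattened) inputs to an intermediate representation, a data `loader` yielding `(x, y)` batches, and single-hidden-layer probes. The MSE of a trained decoder upper-bounds measure (1), the MMSE of reconstructing the input from the representation, while a trained classifier head's cross-entropy loss estimates measure (2). All names, dimensions, and hyperparameters are hypothetical.

```python
# Hypothetical sketch: a decoder probe whose MSE upper-bounds the MMSE of
# reconstructing X from the representation T, and a classifier probe whose
# cross-entropy estimates H(Y | T). Probe shapes and sizes are assumptions.
import torch
import torch.nn as nn

def probe_layer(encoder, loader, x_dim, t_dim, n_classes,
                epochs=10, lr=1e-3, device="cpu"):
    """Fit decoder/classifier probes on a frozen representation T = encoder(X)."""
    decoder = nn.Sequential(nn.Linear(t_dim, 256), nn.ReLU(),
                            nn.Linear(256, x_dim)).to(device)
    classifier = nn.Sequential(nn.Linear(t_dim, 256), nn.ReLU(),
                               nn.Linear(256, n_classes)).to(device)
    opt = torch.optim.Adam([*decoder.parameters(),
                            *classifier.parameters()], lr=lr)
    mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()

    encoder.eval()  # the network under study stays frozen
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device).flatten(1), y.to(device)
            with torch.no_grad():
                t = encoder(x)  # intermediate representation T
            # The probes share no parameters, so one summed loss trains both.
            loss = mse(decoder(t), x) + xent(classifier(t), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Evaluate: decoder MSE upper-bounds MMSE(X | T); classifier
    # cross-entropy estimates H(Y | T).
    tot_mse = tot_ce = n = 0.0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device).flatten(1), y.to(device)
            t = encoder(x)
            tot_mse += mse(decoder(t), x).item() * len(x)
            tot_ce += xent(classifier(t), y).item() * len(x)
            n += len(x)
    return tot_mse / n, tot_ce / n
```

Tracking these two quantities for each layer across training epochs would, under these assumptions, trace the dynamics the abstract describes: a fitting phase as the cross-entropy term falls, and a compression phase as the reconstruction error grows while the representation discards input detail.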

Bibliographic Details
Main Authors: Zhaoyan Lyu (Department of Electronic and Electrical Engineering, University College London, Gower St., London WC1E 6BT, UK); Gholamali Aminian (The Alan Turing Institute, British Library, 96 Euston Rd., London NW1 2DB, UK); Miguel R. D. Rodrigues (Department of Electronic and Electrical Engineering, University College London, Gower St., London WC1E 6BT, UK)
Format: Article
Language: English
Published: MDPI AG, 2023-07-01
Series: Entropy, vol. 25, no. 7, article 1063
ISSN: 1099-4300
DOI: 10.3390/e25071063
Collection: Directory of Open Access Journals (DOAJ)
Subjects: deep learning; information theory; information bottleneck; generalization; fitting; compression
Online Access: https://www.mdpi.com/1099-4300/25/7/1063