Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy
Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.
Main Authors: | Kristoffer K. Wickstrøm, Sigurd Løkse, Michael C. Kampffmeyer, Shujian Yu, José C. Príncipe, Robert Jenssen |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | Entropy |
Subjects: | information theory; deep learning; information plane; kernel methods |
Online Access: | https://www.mdpi.com/1099-4300/25/6/899 |
_version_ | 1827737524971241472 |
author | Kristoffer K. Wickstrøm; Sigurd Løkse; Michael C. Kampffmeyer; Shujian Yu; José C. Príncipe; Robert Jenssen |
author_facet | Kristoffer K. Wickstrøm; Sigurd Løkse; Michael C. Kampffmeyer; Shujian Yu; José C. Príncipe; Robert Jenssen |
author_sort | Kristoffer K. Wickstrøm |
collection | DOAJ |
description | Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks. |
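For readers unfamiliar with the estimator named in the abstract, the following is a minimal sketch of the matrix-based Rényi α-order entropy functional (Sánchez Giraldo, Rao and Príncipe) and the mutual information derived from it, computed from Gram matrices over a batch of activations. It is not the authors' released code: the RBF kernel, the kernel width, the α value, and the toy data are illustrative assumptions, and the paper's tensor-kernel treatment of multi-channel convolutional feature maps is not shown here.

```python
import numpy as np


def gram_matrix(x, sigma=1.0):
    """RBF Gram matrix for a batch of flattened activations, shape (n_samples, n_features)."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] - 2.0 * x @ x.T + sq_norms[None, :]
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def normalize_gram(K):
    """Trace-normalize: A_ij = K_ij / (n * sqrt(K_ii * K_jj)), so that trace(A) = 1."""
    n = K.shape[0]
    return K / (n * np.sqrt(np.outer(np.diag(K), np.diag(K))))


def renyi_entropy(A, alpha=1.01):
    """Matrix-based Renyi entropy: S_alpha(A) = 1/(1-alpha) * log2(sum_i lambda_i(A)^alpha)."""
    eigvals = np.clip(np.linalg.eigvalsh(A), 0.0, None)  # guard against tiny negative eigenvalues
    return (1.0 / (1.0 - alpha)) * np.log2(np.sum(eigvals ** alpha) + 1e-12)


def matrix_mutual_information(Kx, Kt, alpha=1.01):
    """I_alpha(X; T) = S_alpha(X) + S_alpha(T) - S_alpha(X, T); the joint entropy
    is computed from the normalized Hadamard product of the two Gram matrices."""
    Ax, At = normalize_gram(Kx), normalize_gram(Kt)
    joint = (Ax * At) / np.trace(Ax * At)
    return renyi_entropy(Ax, alpha) + renyi_entropy(At, alpha) - renyi_entropy(joint, alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    inputs = rng.normal(size=(128, 784))   # stand-in for a batch of flattened input images
    hidden = rng.normal(size=(128, 256))   # stand-in for the activations of one hidden layer
    mi = matrix_mutual_information(gram_matrix(inputs), gram_matrix(hidden))
    print(f"Estimated I(X; T) for this batch: {mi:.3f} bits")
```

Note that the eigendecompositions are of n × n Gram matrices, where n is the batch size, so the cost of the estimate scales with the number of samples rather than with the dimensionality of the layer, which is the property the abstract highlights for analyzing layers with many neurons.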
first_indexed | 2024-03-11T02:30:07Z |
format | Article |
id | doaj.art-d2fedafe6c2149e394b75640cdb3b2fe |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-11T02:30:07Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-d2fedafe6c2149e394b75640cdb3b2fe; 2023-11-18T10:17:59Z; eng; MDPI AG; Entropy; 1099-4300; 2023-06-01; vol. 25, no. 6, art. 899; doi: 10.3390/e25060899; Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy; Kristoffer K. Wickstrøm, Sigurd Løkse, Michael C. Kampffmeyer, Shujian Yu (Machine Learning Group, Department of Physics and Technology, UiT The Arctic University of Norway, NO-9037 Tromsø, Norway); José C. Príncipe (Computational NeuroEngineering Laboratory, Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA); Robert Jenssen (Machine Learning Group, Department of Physics and Technology, UiT The Arctic University of Norway, NO-9037 Tromsø, Norway); Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently to gain insight into, among others, DNNs’ generalization ability. However, it is by no means obvious how to estimate the mutual information (MI) between each hidden layer and the input/desired output to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness toward the high dimensionality associated with such layers. MI estimators should also be able to handle convolutional layers while at the same time being computationally tractable to scale to large networks. Existing IP methods have not been able to study truly deep convolutional neural networks (CNNs). We propose an IP analysis using the new matrix-based Rényi’s entropy coupled with tensor kernels, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. Our results shed new light on previous studies concerning small-scale DNNs using a completely new approach. We provide a comprehensive IP analysis of large-scale CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks. https://www.mdpi.com/1099-4300/25/6/899; information theory; deep learning; information plane; kernel methods |
spellingShingle | Kristoffer K. Wickstrøm; Sigurd Løkse; Michael C. Kampffmeyer; Shujian Yu; José C. Príncipe; Robert Jenssen; Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy; Entropy; information theory; deep learning; information plane; kernel methods |
title | Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy |
title_full | Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy |
title_fullStr | Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy |
title_full_unstemmed | Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy |
title_short | Analysis of Deep Convolutional Neural Networks Using Tensor Kernels and Matrix-Based Entropy |
title_sort | analysis of deep convolutional neural networks using tensor kernels and matrix based entropy |
topic | information theory; deep learning; information plane; kernel methods |
url | https://www.mdpi.com/1099-4300/25/6/899 |
work_keys_str_mv | AT kristofferkwickstrøm analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy AT sigurdløkse analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy AT michaelckampffmeyer analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy AT shujianyu analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy AT josecprincipe analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy AT robertjenssen analysisofdeepconvolutionalneuralnetworksusingtensorkernelsandmatrixbasedentropy |