Information-bottleneck under mean field initialization
This work explores the sensitivity of mutual information (MI) flow in hidden layers of very deep neural networks (DNNs) as a function of the initialization variance. Specifically, we demonstrate that information-bottleneck (IB) interpretations of DNNs are significantly affected by their choice of nonlinearity as well as weight and bias variances. Initialization on the network mean field (MF) edge of chaos (EOC) results in maximal information propagation through layers of even DNNs; consequently their IB plots are effectively single points which do not vary and high accuracy is rapidly obtained with training. Alternatively, initialization away from EOC results in loss of MI through depth and the more characteristic IB plots observed in the literature. We also demonstrate that popular MI estimators give substantially different estimates, especially for sigmoidal nonlinearity and high weight variance.
Main Authors: | Abrol, V; Tanner, J |
---|---|
Format: | Conference item |
Language: | English |
Published: | 2020 |
_version_ | 1797096729756041216 |
---|---|
author | Abrol, V Tanner, J |
author_facet | Abrol, V Tanner, J |
author_sort | Abrol, V |
collection | OXFORD |
description | This work explores the sensitivity of mutual information (MI) flow in hidden layers of very deep neural networks (DNNs) as a function of the initialization variance. Specifically, we demonstrate that information-bottleneck (IB) interpretations of DNNs are significantly affected by their choice of nonlinearity as well as weight and bias variances. Initialization on the network mean field (MF) edge of chaos (EOC) results in maximal information propagation through layers of even DNNs; consequently their IB plots are effectively single points which do not vary and high accuracy is rapidly obtained with training. Alternatively, initialization away from EOC results in loss of MI through depth and the more characteristic IB plots observed in the literature. We also demonstrate that popular MI estimators give substantially different estimates, especially for sigmoidal nonlinearity and high weight variance. |
first_indexed | 2024-03-07T04:45:42Z |
format | Conference item |
id | oxford-uuid:d32dfc7c-b20e-426f-bd6b-8f6b15a3e378 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T04:45:42Z |
publishDate | 2020 |
record_format | dspace |
spelling | oxford-uuid:d32dfc7c-b20e-426f-bd6b-8f6b15a3e378 | 2022-03-27T08:09:31Z | Information-bottleneck under mean field initialization | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:d32dfc7c-b20e-426f-bd6b-8f6b15a3e378 | English | Symplectic Elements | 2020 | Abrol, V; Tanner, J | This work explores the sensitivity of mutual information (MI) flow in hidden layers of very deep neural networks (DNNs) as a function of the initialization variance. Specifically, we demonstrate that information-bottleneck (IB) interpretations of DNNs are significantly affected by their choice of nonlinearity as well as weight and bias variances. Initialization on the network mean field (MF) edge of chaos (EOC) results in maximal information propagation through layers of even DNNs; consequently their IB plots are effectively single points which do not vary and high accuracy is rapidly obtained with training. Alternatively, initialization away from EOC results in loss of MI through depth and the more characteristic IB plots observed in the literature. We also demonstrate that popular MI estimators give substantially different estimates, especially for sigmoidal nonlinearity and high weight variance. |
spellingShingle | Abrol, V Tanner, J Information-bottleneck under mean field initialization |
title | Information-bottleneck under mean field initialization |
title_full | Information-bottleneck under mean field initialization |
title_fullStr | Information-bottleneck under mean field initialization |
title_full_unstemmed | Information-bottleneck under mean field initialization |
title_short | Information-bottleneck under mean field initialization |
title_sort | information bottleneck under mean field initialization |
work_keys_str_mv | AT abrolv informationbottleneckundermeanfieldinitialization AT tannerj informationbottleneckundermeanfieldinitialization |