Information-bottleneck under mean field initialization


Bibliographic Details
Main Authors: Abrol, V, Tanner, J
Format: Conference item
Language: English
Published: 2020
Description: This work explores the sensitivity of mutual information (MI) flow in the hidden layers of very deep neural networks (DNNs) as a function of the initialization variance. Specifically, we demonstrate that information-bottleneck (IB) interpretations of DNNs are significantly affected by the choice of nonlinearity as well as the weight and bias variances. Initialization on the network's mean field (MF) edge of chaos (EOC) results in maximal information propagation through the layers of even very deep DNNs; consequently, their IB plots are effectively single points that do not vary, and high accuracy is rapidly obtained with training. Alternatively, initialization away from the EOC results in loss of MI through depth and the more characteristic IB plots observed in the literature. We also demonstrate that popular MI estimators give substantially different estimates, especially for the sigmoidal nonlinearity and high weight variance.
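The mean-field EOC condition referenced in the abstract can be illustrated with a short numerical sketch. In the standard mean-field analysis of fully connected networks, the pre-activation variance obeys the recursion q_{l+1} = σ_w² E[φ(√q_l z)²] + σ_b² with z ~ N(0,1), and the network sits on the edge of chaos when the input-output Jacobian scale χ = σ_w² E[φ′(√q*, z)²] equals 1 at the fixed point q*. The function name `eoc_check` and its defaults below are illustrative, not code from the paper; the expectations are approximated by Monte Carlo.

```python
import numpy as np

def eoc_check(sigma_w2, sigma_b2, phi=np.tanh,
              dphi=lambda x: 1.0 - np.tanh(x) ** 2,
              n_iter=100, n_mc=100_000, seed=0):
    """Mean-field variance map for a fully connected net with nonlinearity phi.

    Iterates q <- sigma_w2 * E[phi(sqrt(q) z)^2] + sigma_b2 (z ~ N(0,1))
    toward its fixed point q*, then returns
    chi = sigma_w2 * E[phi'(sqrt(q*) z)^2].
    chi < 1 is the ordered phase, chi > 1 the chaotic phase,
    and chi == 1 is the edge of chaos (EOC).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_mc)
    q = 1.0  # arbitrary positive starting variance
    for _ in range(n_iter):
        q = sigma_w2 * np.mean(phi(np.sqrt(q) * z) ** 2) + sigma_b2
    chi = sigma_w2 * np.mean(dphi(np.sqrt(q) * z) ** 2)
    return q, chi
```

For tanh with zero bias variance, σ_w² = 1 is the known EOC point, while small σ_w² gives the ordered phase and large σ_w² the chaotic phase; initializations away from χ = 1 are the ones the abstract associates with MI loss through depth.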
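The abstract's observation that popular MI estimators disagree can be made concrete with the simplest of them, a plug-in (binning) estimator: it is well known to be sensitive to the bin count, especially for saturating (sigmoidal) activations with high weight variance, where most mass piles up in the outermost bins. The helper name `binned_mi` is an illustrative sketch, not the estimator used in the paper.

```python
import numpy as np

def binned_mi(x, y, n_bins=30):
    """Plug-in estimate of I(X;Y) in nats from a 2D histogram.

    Bins both variables, forms the empirical joint distribution, and
    computes sum p(x,y) * log(p(x,y) / (p(x) p(y))). The result depends
    strongly on n_bins, one source of estimator disagreement.
    """
    joint, _, _ = np.histogram2d(x, y, bins=n_bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (n_bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, n_bins)
    mask = p_xy > 0                          # avoid log(0) on empty cells
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))
```

On independent samples the estimate is near zero (up to a positive finite-sample bias that grows with `n_bins`), while a deterministic relationship such as y = x drives it toward the binned entropy of x.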
Institution: University of Oxford
Identifier: oxford-uuid:d32dfc7c-b20e-426f-bd6b-8f6b15a3e378