Mean field analysis of deep neural networks

We analyze multilayer neural networks in the asymptotic regime of simultaneously (a) large network sizes and (b) large numbers of stochastic gradient descent training iterations. We rigorously establish the limiting behavior of the multilayer neural network output. The limit procedure is valid for any number of hidden layers, and it naturally also describes the limiting behavior of the training loss. The ideas that we explore are to (a) take the limits of each hidden layer sequentially and (b) characterize the evolution of parameters in terms of their initialization. The limit satisfies a system of deterministic integro-differential equations. The proof uses methods from weak convergence and stochastic analysis. We show that, under suitable assumptions on the activation functions and the behavior for large times, the limit neural network recovers a global minimum (with zero loss for the objective function).
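The concentration phenomenon underlying this mean-field description can be illustrated numerically. The sketch below is not the paper's construction; it is a minimal, assumed example using a one-hidden-layer network in the mean-field scaling f(x) = (1/N) Σᵢ cᵢ σ(wᵢ·x) with i.i.d. random parameters and a tanh activation, showing that the network output concentrates around a deterministic limit as the width N grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_field_net(x, N):
    """One-hidden-layer network in the mean-field (1/N) scaling
    with i.i.d. standard normal parameters and tanh activation."""
    w = rng.normal(size=(N, x.shape[0]))  # hidden-layer weights
    c = rng.normal(size=N)                # output-layer weights
    return np.mean(c * np.tanh(w @ x))    # (1/N) * sum_i c_i * tanh(w_i . x)

x = np.ones(5)
# Over independent draws of the parameters, the standard deviation of the
# output shrinks as N grows (a law of large numbers), which is the starting
# point of a mean-field limit.
for N in (10, 1_000, 100_000):
    samples = [mean_field_net(x, N) for _ in range(200)]
    print(N, np.std(samples))
```

The fluctuations decay at the familiar 1/√N rate; the paper's analysis concerns the far harder dynamical question of how this limit evolves under stochastic gradient descent training, layer by layer.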


Bibliographic information
Main Authors: Sirignano, J, Spiliopoulos, K
Format: Journal article
Language: English
Published: INFORMS, 2021
Description: We analyze multilayer neural networks in the asymptotic regime of simultaneously (a) large network sizes and (b) large numbers of stochastic gradient descent training iterations. We rigorously establish the limiting behavior of the multilayer neural network output. The limit procedure is valid for any number of hidden layers, and it naturally also describes the limiting behavior of the training loss. The ideas that we explore are to (a) take the limits of each hidden layer sequentially and (b) characterize the evolution of parameters in terms of their initialization. The limit satisfies a system of deterministic integro-differential equations. The proof uses methods from weak convergence and stochastic analysis. We show that, under suitable assumptions on the activation functions and the behavior for large times, the limit neural network recovers a global minimum (with zero loss for the objective function).
ID: oxford-uuid:bac1063f-563f-4bd3-8a2e-5a3d02a1b5d7
Institution: University of Oxford