Mean field analysis of neural networks: a central limit theorem
We rigorously prove a central limit theorem for neural network models with a single hidden layer. The central limit theorem is proven in the asymptotic regime of simultaneously (A) large numbers of hidden units and (B) large numbers of stochastic gradient descent training iterations. Our result describes the neural network's fluctuations around its mean-field limit. The fluctuations have a Gaussian distribution and satisfy a stochastic partial differential equation. The proof relies upon weak convergence methods from stochastic analysis. In particular, we prove relative compactness for the sequence of processes and uniqueness of the limiting process in a suitable Sobolev space.
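As a rough illustration of the setting the abstract describes, the following is a minimal sketch of a single-hidden-layer network in the standard mean-field parameterization (output averaged over N hidden units) together with one stochastic gradient descent step. The function names, the tanh activation, and the squared loss are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def network(x, c, w):
    # Mean-field scaling: the output is the *average* over N hidden units,
    # f(x) = (1/N) * sum_i c_i * tanh(w_i . x).
    # (Activation choice is an assumption; the paper's setting is more general.)
    return np.mean(c * np.tanh(w @ x))

def sgd_step(x, y, c, w, lr):
    # One stochastic gradient descent step on the loss 0.5 * (f(x) - y)^2
    # for a single training sample (x, y).
    n = c.shape[0]
    pre = w @ x                       # pre-activations, shape (N,)
    err = network(x, c, w) - y        # scalar residual f(x) - y
    grad_c = err * np.tanh(pre) / n   # d loss / d c_i
    # d loss / d w_i = err * c_i * (1 - tanh(pre_i)^2) * x / N
    grad_w = np.outer(err * c * (1.0 - np.tanh(pre) ** 2) / n, x)
    return c - lr * grad_c, w - lr * grad_w
```

In this parameterization the paper's asymptotic regime corresponds to taking the number of hidden units N and the number of SGD iterations large simultaneously; the empirical measure of the parameters `(c_i, w_i)` then concentrates around its mean-field limit, and the theorem characterizes the Gaussian fluctuations around that limit.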
Main Authors: Sirignano, J; Spiliopoulos, K
Format: Journal article
Language: English
Published: Elsevier, 2019
id | oxford-uuid:a7ff2ea9-bf95-4094-8356-77cabffc8e08 |
institution | University of Oxford |