Zeroth, first, and second-order phase transitions in deep neural networks

We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning.
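A minimal numerical sketch of the abstract's central claim, under assumptions of our own: we reduce a depth-D chain of scalar weights with an L2 penalty to the symmetric toy loss L(a) = (1 - a^D)^2 + D*lam*a^2 and track the magnitude a* of its global minimizer as the order parameter. The loss form and the symmetry reduction are illustrative choices, not the paper's actual construction; they are only meant to make the second-order (continuous) versus first-order (discontinuous) distinction concrete.

```python
import numpy as np

def order_parameter(depth, lam, grid=np.linspace(0.0, 2.0, 200001)):
    """Magnitude a* of the global minimizer of the toy loss
    L(a) = (1 - a**depth)**2 + depth * lam * a**2,
    a symmetric scalar stand-in for a depth-`depth` linear
    chain trained with weight decay lam (an assumption, not
    the paper's exact setup)."""
    loss = (1.0 - grid**depth) ** 2 + depth * lam * grid**2
    return grid[np.argmin(loss)]

for lam in [0.20, 0.35, 0.40, 0.60, 0.95, 1.05]:
    a2 = order_parameter(2, lam)  # one hidden layer: a* = sqrt(max(0, 1 - lam)),
                                  # vanishing continuously at lam = 1 (second order)
    a3 = order_parameter(3, lam)  # two hidden layers: in this toy, a* jumps to 0
                                  # near lam ~ 0.397 (first order)
    print(f"lam={lam:.2f}  one-hidden-layer a*={a2:.3f}  two-hidden-layer a*={a3:.3f}")
```

Running this, the one-hidden-layer column shrinks smoothly to zero as lam approaches 1, while the two-hidden-layer column drops abruptly from about 0.83 to 0 between lam = 0.35 and lam = 0.40, mirroring the second-order versus first-order phenomenology described above.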


Bibliographic Details
Main Authors: Liu Ziyin, Masahito Ueda
Format: Article
Language: English
Published: American Physical Society, 2023-12-01
Series: Physical Review Research, Vol. 5, Art. 043243
ISSN: 2643-1564
Online Access: http://doi.org/10.1103/PhysRevResearch.5.043243