Zeroth, first, and second-order phase transitions in deep neural networks
We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning.
Main Authors: | Liu Ziyin, Masahito Ueda |
---|---|
Format: | Article |
Language: | English |
Published: | American Physical Society, 2023-12-01 |
Series: | Physical Review Research |
Online Access: | http://doi.org/10.1103/PhysRevResearch.5.043243 |
_version_ | 1797210308866998272 |
---|---|
author | Liu Ziyin, Masahito Ueda |
author_sort | Liu Ziyin |
collection | DOAJ |
description | We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning. |
first_indexed | 2024-04-24T10:08:32Z |
format | Article |
id | doaj.art-600b2da6a3504fcb8d622c11d492458c |
institution | Directory Open Access Journal |
issn | 2643-1564 |
language | English |
last_indexed | 2024-04-24T10:08:32Z |
publishDate | 2023-12-01 |
publisher | American Physical Society |
record_format | Article |
series | Physical Review Research |
spelling | doaj.art-600b2da6a3504fcb8d622c11d492458c | 2024-04-12T17:37:00Z | eng | American Physical Society | Physical Review Research | 2643-1564 | 2023-12-01 | vol. 5, issue 4, art. 043243 | 10.1103/PhysRevResearch.5.043243 | Zeroth, first, and second-order phase transitions in deep neural networks | Liu Ziyin | Masahito Ueda | We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning. | http://doi.org/10.1103/PhysRevResearch.5.043243 |
title | Zeroth, first, and second-order phase transitions in deep neural networks |
url | http://doi.org/10.1103/PhysRevResearch.5.043243 |
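The abstract's central claim, that L2 regularization drives a continuous (second-order) transition to the collapsed solution for one hidden layer but a discontinuous (first-order) one for deeper nets, can be illustrated with a toy calculation. The sketch below is my own illustrative assumption, not the paper's construction: it uses a scalar deep linear model f(x) = w_D ··· w_1 x with x = y = 1, where balancing the weights reduces the regularized loss to g(b) = (b − 1)² + D·γ·b^(2/D) in the product b = ∏ w_i, and tracks the global minimizer b* (the order parameter) on a grid as the weight-decay strength γ grows.

```python
import numpy as np

def effective_loss(b, depth, gamma):
    # Toy scalar deep linear net f(x) = w_D * ... * w_1 * x fit to x = y = 1.
    # With balanced weights (|w_1| = ... = |w_D|), the L2-regularized loss
    # (b - 1)^2 + gamma * sum_i w_i^2 reduces to a function of b = prod_i w_i:
    #   g(b) = (b - 1)^2 + D * gamma * b^(2/D),   b >= 0.
    # Mapping "depth hidden layers" to D = depth + 1 weight factors is an
    # assumption of this toy, not taken from the paper.
    D = depth + 1
    return (b - 1.0) ** 2 + D * gamma * b ** (2.0 / D)

def order_parameter(depth, gamma, grid=np.linspace(0.0, 1.5, 30001)):
    # Global minimizer b*(gamma) found by brute-force grid search:
    # the "order parameter" whose behavior distinguishes the transitions.
    losses = effective_loss(grid, depth, gamma)
    return grid[np.argmin(losses)]

if __name__ == "__main__":
    gammas = np.linspace(0.0, 1.2, 121)
    for depth in (1, 2):
        bs = [order_parameter(depth, g) for g in gammas]
        # Largest change in b* between adjacent gamma values: tiny for a
        # continuous transition, order-one for a discontinuous jump.
        max_jump = max(abs(b1 - b0) for b0, b1 in zip(bs, bs[1:]))
        print(f"depth={depth}: max jump in order parameter = {max_jump:.3f}")
```

For depth 1 (two weight factors), g(b) = (b − 1)² + 2γb gives b* = max(0, 1 − γ): b* reaches zero continuously at γ = 1, a second-order transition. For depth 2 (three factors), the nonzero minimum of g(b) = (b − 1)² + 3γb^(2/3) loses to b = 0 abruptly near γ ≈ 0.4, and b* jumps from about 1/2 to 0, the first-order signature.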