Zeroth, first, and second-order phase transitions in deep neural networks
We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning.
Main Authors: | Liu Ziyin, Masahito Ueda |
---|---|
Format: | Article |
Language: | English |
Published: | American Physical Society, 2023-12-01 |
Series: | Physical Review Research |
Online Access: | http://doi.org/10.1103/PhysRevResearch.5.043243 |
_version_ | 1797210308866998272 |
---|---|
author | Liu Ziyin, Masahito Ueda |
author_sort | Liu Ziyin |
collection | DOAJ |
description | We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning. |
first_indexed | 2024-04-24T10:08:32Z |
format | Article |
id | doaj.art-600b2da6a3504fcb8d622c11d492458c |
institution | Directory Open Access Journal |
issn | 2643-1564 |
language | English |
last_indexed | 2024-04-24T10:08:32Z |
publishDate | 2023-12-01 |
publisher | American Physical Society |
record_format | Article |
series | Physical Review Research |
spelling | doaj.art-600b2da6a3504fcb8d622c11d492458c | 2024-04-12T17:37:00Z | eng | American Physical Society | Physical Review Research | 2643-1564 | 2023-12-01 | vol. 5, issue 4, art. 043243 | 10.1103/PhysRevResearch.5.043243 | Zeroth, first, and second-order phase transitions in deep neural networks | Liu Ziyin | Masahito Ueda | We investigate deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to the second-order phase transition for deep linear nets with one hidden layer and the first-order phase transition for nets with more than one hidden layer. We also prove the linear origin theorem, which states that common deep nonlinear models are equivalent to a linear network of the same depth and connection structure close to the origin. Therefore, the proposed theory is directly relevant to understanding the optimization and initialization of neural networks and serves as a minimal model of the ubiquitous collapse phenomenon in deep learning. | http://doi.org/10.1103/PhysRevResearch.5.043243 |
title | Zeroth, first, and second-order phase transitions in deep neural networks |
url | http://doi.org/10.1103/PhysRevResearch.5.043243 |
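The abstract's central claim, that L2 regularization drives a continuous (second-order) transition to the collapsed solution for one hidden layer but a discontinuous (first-order) one for deeper nets, can be illustrated with a toy calculation. The sketch below is my own illustrative assumption, not the paper's construction: it uses a scalar deep linear model f(x) = w_D ··· w_1 x with x = y = 1, where balancing the weights reduces the regularized loss to g(b) = (b − 1)² + D·γ·b^(2/D) in the product b = ∏ w_i, and tracks the global minimizer b* (the order parameter) on a grid as the weight-decay strength γ grows.

```python
import numpy as np

def effective_loss(b, depth, gamma):
    # Toy scalar deep linear net f(x) = w_D * ... * w_1 * x fit to x = y = 1.
    # With balanced weights (|w_1| = ... = |w_D|), the L2-regularized loss
    # (b - 1)^2 + gamma * sum_i w_i^2 reduces to a function of b = prod_i w_i:
    #   g(b) = (b - 1)^2 + D * gamma * b^(2/D),   b >= 0.
    # Mapping "depth hidden layers" to D = depth + 1 weight factors is an
    # assumption of this toy, not taken from the paper.
    D = depth + 1
    return (b - 1.0) ** 2 + D * gamma * b ** (2.0 / D)

def order_parameter(depth, gamma, grid=np.linspace(0.0, 1.5, 30001)):
    # Global minimizer b*(gamma) found by brute-force grid search:
    # the "order parameter" whose behavior distinguishes the transitions.
    losses = effective_loss(grid, depth, gamma)
    return grid[np.argmin(losses)]

if __name__ == "__main__":
    gammas = np.linspace(0.0, 1.2, 121)
    for depth in (1, 2):
        bs = [order_parameter(depth, g) for g in gammas]
        # Largest change in b* between adjacent gamma values: tiny for a
        # continuous transition, order-one for a discontinuous jump.
        max_jump = max(abs(b1 - b0) for b0, b1 in zip(bs, bs[1:]))
        print(f"depth={depth}: max jump in order parameter = {max_jump:.3f}")
```

For depth 1 (two weight factors), g(b) = (b − 1)² + 2γb gives b* = max(0, 1 − γ): b* reaches zero continuously at γ = 1, a second-order transition. For depth 2 (three factors), the nonzero minimum of g(b) = (b − 1)² + 3γb^(2/3) loses to b = 0 abruptly near γ ≈ 0.4, and b* jumps from about 1/2 to 0, the first-order signature.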