Effect of Depth and Width on Local Minima in Deep Learning

© 2019 Massachusetts Institute of Technology. For nonconvex optimization in machine learning, this article proves that every local minimum achieves the globally optimal value of the perturbable gradient basis model at any differentiable point. As a result, nonconvex machine learning is theoretically...

Full description

Bibliographic Details
Main Authors: Kawaguchi, Kenji, Huang, Jiaoyang, Kaelbling, Leslie P
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:English
Published: MIT Press - Journals 2022
Online Access:https://hdl.handle.net/1721.1/136181.2