Small nonlinearities in activation functions create bad local minima in neural networks
© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.

We investigate the loss surface of neural networks. We prove that even for one-hidden-layer networks with “slightest” nonlinearity, the empirical risks have spurious local minima in most cases. Our results th...
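The setting the abstract describes can be pictured concretely. Below is a minimal, hypothetical sketch (not the paper's construction): a one-hidden-layer network whose activation is piecewise linear with two slightly different slopes, the "slightest" nonlinearity, together with its empirical squared-error risk on a toy sample. The slope gap `eps`, the layer widths, and the random data are illustrative assumptions; the paper's result concerns the existence of spurious local minima on risk surfaces of this kind, which this snippet only evaluates at a single parameter point.

```python
# Minimal sketch (illustrative, not from the paper): a one-hidden-layer network
# with a piecewise-linear activation whose slopes differ by eps, plus its
# empirical squared-error risk on a toy sample.
import numpy as np

rng = np.random.default_rng(0)

def slight_nonlinearity(z, eps=0.01):
    """Piecewise-linear activation: slope 1 for z >= 0, slope 1 - eps for z < 0."""
    return np.where(z >= 0, z, (1.0 - eps) * z)

def empirical_risk(W1, b1, w2, b2, X, y, eps=0.01):
    """Mean squared error of the one-hidden-layer network on the sample (X, y)."""
    hidden = slight_nonlinearity(X @ W1 + b1, eps)  # (n, hidden_units)
    preds = hidden @ w2 + b2                        # (n,)
    return np.mean((preds - y) ** 2)

# Toy sample: n points in d dimensions with scalar targets (illustrative only).
n, d, hidden_units = 50, 3, 4
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

W1 = rng.normal(size=(d, hidden_units))
b1 = np.zeros(hidden_units)
w2 = rng.normal(size=hidden_units)
b2 = 0.0

print("empirical risk at a random parameter point:",
      empirical_risk(W1, b1, w2, b2, X, y))
```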
| Main Authors: | Yun, Chulhee; Sra, Suvrit; Jadbabaie, Ali |
|---|---|
| Other Authors: | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems |
| Format: | Article |
| Language: | English |
| Published: | 2021 |
| Online Access: | https://hdl.handle.net/1721.1/137454 |
Similar Items
- Efficiently testing local optimality and escaping saddles for ReLU networks
  by: Jadbabaie, Ali, et al.
  Published: (2021)
- Small ReLU networks are powerful memorizers: A tight analysis of memorization capacity
  by: Yun, Chulhee, et al.
  Published: (2021)
- Are deep ResNets provably better than linear predictors?
  by: Yun, Chulhee, et al.
  Published: (2022)
- Acceleration in First Order Quasi-strongly Convex Optimization by ODE Discretization
  by: Zhang, Jingzhao, et al.
  Published: (2021)
- Elimination of All Bad Local Minima in Deep Learning
  by: Kawaguchi, Kenji, et al.
  Published: (2021)