Small nonlinearities in activation functions create bad local minima in neural networks

© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.

We investigate the loss surface of neural networks. We prove that even for one-hidden-layer networks with the "slightest" nonlinearity, the empirical risks have spurious local minima in most cases. Our results th...
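The network class discussed in the abstract can be sketched concretely. Below is an illustrative (not the paper's) one-hidden-layer network whose activation interpolates between purely linear (slope parameter `s = 1`) and slightly nonlinear (`s` just below 1), the "slightest nonlinearity" regime; the function names and toy data are assumptions for illustration only.

```python
import numpy as np

def leaky(z, s):
    # Piecewise-linear activation: slope 1 for z >= 0, slope s for z < 0.
    # s = 1 gives a purely linear network; s slightly below 1 is the
    # "slightest nonlinearity" regime the abstract refers to.
    return np.where(z >= 0, z, s * z)

def empirical_risk(W1, b1, w2, X, y, s):
    # One-hidden-layer network x -> w2 . leaky(W1 x + b1), squared loss.
    H = leaky(X @ W1.T + b1, s)   # hidden activations, shape (n, hidden)
    preds = H @ w2                # scalar outputs, shape (n,)
    return 0.5 * np.mean((preds - y) ** 2)

# Toy check: with s = 1 the network is linear and fits this data exactly;
# with s < 1 the same parameters incur a strictly positive risk.
W1 = np.array([[1.0]])
b1 = np.array([0.0])
w2 = np.array([1.0])
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])
print(empirical_risk(W1, b1, w2, X, y, s=1.0))   # 0.0
print(empirical_risk(W1, b1, w2, X, y, s=0.5))   # 0.0625
```

The paper's theorem is about the existence of spurious local minima in the loss landscape of such networks; this snippet only illustrates the model family and how its empirical risk is evaluated.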

Bibliographic Details
Main Authors: Yun, Chulhee, Sra, Suvrit, Jadbabaie, Ali
Other Authors: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/1721.1/137454

Similar Items