Escaping Saddle Points with Adaptive Gradient Methods

© 2019 International Machine Learning Society (IMLS). Adaptive methods such as Adam and RMSProp are widely used in deep learning but are not well understood. In this paper, we seek a crisp, clean and precise characterization of their behavior in nonconvex settings. To this end, we first provide a no...

Bibliographic Details
Main Authors: Staib, Matthew; Reddi, Sashank; Kale, Satyen; Kumar, Sanjiv; Sra, Suvrit
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/1721.1/137532