Small steps and giant leaps: minimal Newton solvers for deep learning

We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it requires only two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable...
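The abstract's key computational claim rests on forward-mode automatic differentiation being about as cheap as a forward pass. A minimal sketch of that primitive is below; it is not the authors' implementation, and the toy `loss` function and the names `hvp`, `w`, and `v` are illustrative assumptions. It shows how forward-over-reverse differentiation in JAX yields a Hessian-vector product, the curvature probe that Newton-type updates build on, without ever forming the Hessian.

```python
# Sketch only: a Hessian-vector product via forward-mode AD in JAX,
# the kind of operation the abstract says costs roughly one extra
# forward pass. Not the paper's code; `loss` is a toy stand-in.
import jax
import jax.numpy as jnp

def loss(w):
    # Toy quadratic-style loss; a real network loss would go here.
    return jnp.sum((w ** 2 - 1.0) ** 2)

def hvp(f, w, v):
    # Forward-over-reverse: jax.grad gives the gradient function,
    # and jax.jvp differentiates it along direction v in a single
    # forward-mode pass, returning H(w) @ v.
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

w = jnp.array([0.5, -0.3, 1.2])
v = jnp.ones_like(w)       # direction along which to probe curvature
print(hvp(loss, w, v))     # H(w) @ v, with no explicit Hessian matrix
```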

Bibliographic Details
Main Authors: Henriques, J, Ehrhardt, S, Albanie, S, Vedaldi, A
Format: Conference item
Language: English
Published: IEEE 2020