Small steps and giant leaps: minimal Newton solvers for deep learning
We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it requires only two additional forward-mode automatic differentiation operations per iteration, at a computational cost comparable...
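As a minimal sketch of the kind of primitive the abstract refers to (not the authors' implementation), the following illustrates how a forward-mode automatic differentiation pass over a gradient yields a Hessian-vector product at roughly the cost of an extra forward pass. The names `loss_fn`, `hvp`, `w`, and `v` are illustrative, and the toy quadratic loss stands in for a network's training loss.

```python
# Hedged sketch: a Hessian-vector product via forward-mode AD (jvp)
# applied to the reverse-mode gradient. This shows the cost profile the
# abstract describes; it is not the paper's solver itself.
import jax
import jax.numpy as jnp

def loss_fn(w):
    # toy quadratic-style loss; a stand-in for a deep network's loss
    return jnp.sum((w ** 2 - 1.0) ** 2)

def hvp(f, w, v):
    # one forward-mode pass (jvp) over grad(f) returns both the gradient
    # g = grad f(w) and the Hessian-vector product H(w) @ v
    return jax.jvp(jax.grad(f), (w,), (v,))

w = jnp.array([0.5, -1.5, 2.0])
v = jnp.ones_like(w)
g, Hv = hvp(loss_fn, w, v)
print(g, Hv)
```

Forward-over-reverse composition is used here because it avoids ever forming the full Hessian; each call costs about as much as one additional forward pass, consistent with the per-iteration overhead the abstract claims.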
| Main Authors: | , , , |
|---|---|
| Format: | Conference item |
| Language: | English |
| Published: | IEEE, 2020 |