An Optimization Strategy Based on Hybrid Algorithm of Adam and SGD

Despite superior training outcomes, adaptive optimization methods such as Adam, Adagrad, or RMSprop have been found to generalize poorly compared with stochastic gradient descent (SGD). Researchers (Keskar et al., 2017) therefore proposed a hybrid strategy that starts training with Adam and switches to SGD...
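
The abstract describes switching optimizers partway through training. The following is a minimal PyTorch sketch of that general idea, not the method detailed in the article: the toy model, the fixed switch epoch, and the learning rates are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression setup; model and data are placeholders for illustration.
model = nn.Linear(10, 1)
data = torch.randn(64, 10)
target = torch.randn(64, 1)
loss_fn = nn.MSELoss()

switch_epoch = 10  # assumed hyperparameter: when to hand training over to SGD
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(30):
    if epoch == switch_epoch:
        # Hand the same parameters over to plain SGD with momentum.
        # The SGD learning rate here is an illustrative choice, not a tuned value.
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch:2d}  loss {loss.item():.4f}")
```

In practice, more elaborate variants choose the switch point and the SGD learning rate adaptively rather than fixing them in advance, which is the kind of refinement a hybrid strategy like the one in this article addresses.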


Bibliographic Details
Main Authors: Wang Yijun, Zhou Pengyu, Zhong Wenya
Format: Article
Language: English
Published: EDP Sciences 2018-01-01
Series: MATEC Web of Conferences
Online Access: https://doi.org/10.1051/matecconf/201823203007