AG-SGD: Angle-Based Stochastic Gradient Descent

Bibliographic Details
Main Authors: Chongya Song, Alexander Pons, Kang Yen
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9343305/
Description
Summary: In the field of neural networks, stochastic gradient descent is often employed as an effective method of accelerating convergence. Generating the new gradient from the past gradient is a common approach adopted by many existing optimization algorithms. Because the past gradient is not computed from the most recent stochastic gradient descent state, it can introduce a deviation into the new gradient computation, negatively impacting the rate of convergence. To resolve this problem, we propose an algorithm that quantifies this deviation based on the angle between the past and the current gradients and then uses it to calibrate the two gradients, generating a more accurate new gradient. To demonstrate the broad applicability of the algorithm, the proposed method is implemented in a neural network and a logistic regression classifier, which are evaluated on the MNIST and NSL-KDD datasets, respectively. An in-depth analysis compares our algorithm with nine optimization algorithms across two experiments, demonstrating its advantages in cost and error-rate reduction.
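
The summary does not give the paper's exact update rule, but the general idea of discounting a stale past gradient by the angle it forms with the current gradient can be sketched as follows. This is a minimal illustrative sketch, not the authors' AG-SGD formulation: the function name angle_calibrated_gradient, the blending weight beta, and the linear angle-to-weight mapping are all assumptions introduced here for clarity.

import numpy as np

def angle_calibrated_gradient(g_curr, g_past, beta=0.9, eps=1e-12):
    """Blend the past and current gradients, down-weighting the past gradient
    as the angle between them grows (illustrative rule, not the paper's)."""
    # Cosine of the angle between the two gradient vectors.
    cos_theta = np.dot(g_curr, g_past) / (
        np.linalg.norm(g_curr) * np.linalg.norm(g_past) + eps)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # angle in [0, pi]
    agreement = 1.0 - theta / np.pi                   # 1 = aligned, 0 = opposed
    w = beta * agreement                              # weight given to the past gradient
    return (1.0 - w) * g_curr + w * g_past

# Example: a past gradient that partially disagrees with the current one
# contributes less to the calibrated gradient.
g_past = np.array([1.0, 0.0])
g_curr = np.array([0.6, 0.8])
print(angle_calibrated_gradient(g_curr, g_past))

The precise calibration used by AG-SGD, along with the experimental setup on MNIST and NSL-KDD, is described in the article at the Online Access link above.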
ISSN:2169-3536