Summary: | In neural network training, stochastic gradient descent is often employed as an effective way to accelerate convergence. Many existing optimization algorithms generate the new gradient from past gradients. Because a past gradient was not computed at the most recent state of the stochastic gradient descent, it can introduce a deviation into the new gradient computation and reduce the rate of convergence. To resolve this problem, we propose an algorithm that quantifies this deviation using the angle between the past and current gradients, and then applies it to calibrate the two gradients, generating a more accurate new gradient. To demonstrate the broad applicability of the algorithm, we implement the proposed method in a neural network and in a logistic regression classifier, which are evaluated on the MNIST and NSL-KDD datasets, respectively. An in-depth analysis compares our algorithm with nine optimization algorithms across two experiments, demonstrating the reductions in cost and error rate obtained by adopting the proposed method.
|
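
To make the central idea concrete, the following is a minimal sketch of measuring the angle between a past and a current gradient and using it to weight the past gradient. The summary does not state the calibration rule, so the weighting `0.5 * (1 + cos(theta))` and the function names `gradient_angle` and `calibrated_gradient` are assumptions chosen for illustration only; they are not the proposed algorithm.

```python
import math


def gradient_angle(past_grad, current_grad, eps=1e-12):
    """Return the angle in radians between two gradient vectors."""
    dot = sum(p * c for p, c in zip(past_grad, current_grad))
    norm_p = math.sqrt(sum(p * p for p in past_grad))
    norm_c = math.sqrt(sum(c * c for c in current_grad))
    if norm_p < eps or norm_c < eps:
        # The angle is undefined for a zero vector; treat it as no deviation.
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_c)))
    return math.acos(cos_theta)


def calibrated_gradient(past_grad, current_grad):
    """Combine the two gradients, down-weighting the past one as the angle grows.

    Assumed rule (hypothetical, not taken from the source): the weight is
    1 when the gradients are aligned and 0 when they point in opposite
    directions.
    """
    theta = gradient_angle(past_grad, current_grad)
    weight = 0.5 * (1.0 + math.cos(theta))
    return [c + weight * p for p, c in zip(past_grad, current_grad)]
```

Under this assumed rule, a past gradient that agrees with the current one contributes fully, while one that opposes it is ignored; any smooth, monotone function of the angle could be substituted.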