Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks
This paper presents a novel approach to training deep neural networks: a Mutual Information (MI)-driven, decaying Learning Rate (LR) variant of Stochastic Gradient Descent (SGD). The MI between the network's output and the true outcomes is used to adaptively set the LR for the network,...
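The abstract is truncated here, and the paper's exact decay schedule is only available through the access link below. As an illustration only, the following minimal NumPy sketch captures the general idea: estimate the MI between the network's predictions and the true labels on a batch, then decay the LR when that MI plateaus. The function names, the plateau test, and the decay factor are assumptions made for this sketch, not the paper's published rule.

```python
import numpy as np

def batch_mutual_information(probs, labels, eps=1e-12):
    """Estimate MI (in nats) between the true label and the network's
    predicted class distribution from one batch of softmax outputs."""
    n, k = probs.shape
    joint = np.zeros((k, k))                    # joint[y, c] ~ p(true=y, pred=c)
    for p, y in zip(probs, labels):
        joint[y] += p / n
    p_true = joint.sum(axis=1, keepdims=True)   # marginal of true labels
    p_pred = joint.sum(axis=0, keepdims=True)   # marginal of predictions
    return float(np.sum(joint * np.log(joint / (p_true @ p_pred + eps) + eps)))

def mi_decayed_lr(lr, mi_prev, mi_curr, decay=0.5, tol=1e-3):
    """Hypothetical rule (not from the paper): halve the LR whenever
    the batch MI stops improving by at least tol."""
    return lr * decay if mi_curr - mi_prev < tol else lr

# Toy usage: well-separated predictions yield a high batch MI.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.85, 0.15], [0.1, 0.9]])
labels = np.array([0, 1, 0, 1])
mi = batch_mutual_information(probs, labels)
lr = mi_decayed_lr(lr=0.1, mi_prev=mi, mi_curr=mi)  # MI plateau -> LR decays to 0.05
```

In a training loop, the MI estimate would be recomputed each epoch (or each validation pass) and fed to the decay rule in place of a fixed step schedule; how often to measure MI and how aggressively to decay are design choices the paper itself addresses.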
Main Author: | Shrihari Vasudevan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-05-01 |
Series: | Entropy |
Subjects: | |
Online Access: | https://www.mdpi.com/1099-4300/22/5/560 |
Similar Items
- Damped Newton Stochastic Gradient Descent Method for Neural Networks Training
  by: Jingcheng Zhou, et al.
  Published: (2021-06-01)
- Recent Advances in Stochastic Gradient Descent in Deep Learning
  by: Yingjie Tian, et al.
  Published: (2023-01-01)
- Adaptive Stochastic Conjugate Gradient Optimization for Backpropagation Neural Networks
  by: Ibrahim Abaker Targio Hashem, et al.
  Published: (2024-01-01)
- A Geometric Interpretation of Stochastic Gradient Descent Using Diffusion Metrics
  by: Rita Fioresi, et al.
  Published: (2020-01-01)
- Adaptive Stochastic Gradient Descent Method for Convex and Non-Convex Optimization
  by: Ruijuan Chen, et al.
  Published: (2022-11-01)