An optimized second order stochastic learning algorithm for neural network training
This paper proposes an improved stochastic second order learning algorithm for supervised neural network training. The proposed algorithm, named bounded stochastic diagonal Levenberg-Marquardt (B-SDLM), utilizes both gradient and curvature information to achieve fast convergence while requiring only...
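The abstract describes B-SDLM as a bounded stochastic diagonal Levenberg-Marquardt method that uses curvature information to scale per-parameter learning rates while keeping Hessian estimation cheap. The record itself contains no algorithm details, so the following is only a minimal Python sketch of a generic diagonal Levenberg-Marquardt-style update with a bounded learning rate, based on the classical SDLM formulation; the function name `sdlm_step`, the damping term `mu`, and the bound `eta_max` are illustrative assumptions and not the authors' exact B-SDLM.

```python
import numpy as np

def sdlm_step(w, grad, hess_diag, eta=0.01, mu=1e-2, eta_max=1.0):
    """One bounded, diagonal Levenberg-Marquardt-style update (illustrative sketch).

    w         : parameter vector (np.ndarray)
    grad      : stochastic gradient of the loss w.r.t. w
    hess_diag : estimate of the diagonal of the Hessian (e.g. a Gauss-Newton
                approximation, assumed non-negative)
    eta       : global learning rate
    mu        : damping term that keeps the division well conditioned
    eta_max   : upper bound on the per-parameter learning rate ("bounded")
    """
    # Per-parameter learning rate: high curvature -> small step, low curvature -> larger step.
    per_param_lr = eta / (hess_diag + mu)
    # Bound the rates so noisy, near-zero curvature estimates cannot blow up the step size.
    per_param_lr = np.minimum(per_param_lr, eta_max)
    return w - per_param_lr * grad

# Tiny usage example with made-up numbers.
w = np.array([0.5, -0.3, 1.2])
grad = np.array([0.1, -0.2, 0.05])
hess_diag = np.array([2.0, 0.001, 0.5])   # per-parameter curvature estimates
w_new = sdlm_step(w, grad, hess_diag)
```

In the classical formulation the diagonal Hessian entries are typically estimated with a Gauss-Newton approximation over a small subset of training samples, which is consistent with the abstract's remark about using only 0.05% of the samples for the estimate.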
Main Authors: | Liew, S. S., Khalil-Hani, M., Bakhteri, R. |
---|---|
Format: | Article |
Published: | Elsevier B.V., 2016 |
Subjects: | TK Electrical engineering. Electronics Nuclear engineering |
_version_ | 1796861960471445504 |
---|---|
author | Liew, S. S. Khalil-Hani, M. Bakhteri, R. |
author_facet | Liew, S. S. Khalil-Hani, M. Bakhteri, R. |
author_sort | Liew, S. S. |
collection | ePrints |
description | This paper proposes an improved stochastic second order learning algorithm for supervised neural network training. The proposed algorithm, named bounded stochastic diagonal Levenberg-Marquardt (B-SDLM), utilizes both gradient and curvature information to achieve fast convergence while requiring only minimal computational overhead compared with the stochastic gradient descent (SGD) method. B-SDLM has only a single hyperparameter, in contrast to most other learning algorithms, which suffer from the hyperparameter overfitting problem because they have more hyperparameters to tune. Experiments using multilayer perceptron (MLP) and convolutional neural network (CNN) models show that B-SDLM outperforms other learning algorithms in terms of classification accuracy and computational efficiency (about 5.3% faster than SGD on the mnist-rot-bg-img database). It classifies all testing samples correctly in the face recognition case study based on the AR Purdue database. In addition, experiments on handwritten digit classification case studies show significant improvements of 19.6% on the MNIST database and 17.5% on the mnist-rot-bg-img database in terms of testing misclassification error rates (MCRs). The computationally expensive Hessian calculations are kept to a minimum by using just 0.05% of the training samples in the estimation, or by updating the learning rates once every two training epochs, while maintaining or even lowering the testing MCRs. It is also shown that B-SDLM works well in the mini-batch learning mode, and we achieve a 3.32× performance speedup when deploying the proposed algorithm in a distributed learning environment with a quad-core processor. |
first_indexed | 2024-03-05T20:04:20Z |
format | Article |
id | utm.eprints-72624 |
institution | Universiti Teknologi Malaysia - ePrints |
last_indexed | 2024-03-05T20:04:20Z |
publishDate | 2016 |
publisher | Elsevier B.V. |
record_format | dspace |
spelling | utm.eprints-72624 2017-11-27T04:42:57Z http://eprints.utm.my/72624/ An optimized second order stochastic learning algorithm for neural network training Liew, S. S. Khalil-Hani, M. Bakhteri, R. TK Electrical engineering. Electronics Nuclear engineering This paper proposes an improved stochastic second order learning algorithm for supervised neural network training. The proposed algorithm, named bounded stochastic diagonal Levenberg-Marquardt (B-SDLM), utilizes both gradient and curvature information to achieve fast convergence while requiring only minimal computational overhead compared with the stochastic gradient descent (SGD) method. B-SDLM has only a single hyperparameter, in contrast to most other learning algorithms, which suffer from the hyperparameter overfitting problem because they have more hyperparameters to tune. Experiments using multilayer perceptron (MLP) and convolutional neural network (CNN) models show that B-SDLM outperforms other learning algorithms in terms of classification accuracy and computational efficiency (about 5.3% faster than SGD on the mnist-rot-bg-img database). It classifies all testing samples correctly in the face recognition case study based on the AR Purdue database. In addition, experiments on handwritten digit classification case studies show significant improvements of 19.6% on the MNIST database and 17.5% on the mnist-rot-bg-img database in terms of testing misclassification error rates (MCRs). The computationally expensive Hessian calculations are kept to a minimum by using just 0.05% of the training samples in the estimation, or by updating the learning rates once every two training epochs, while maintaining or even lowering the testing MCRs. It is also shown that B-SDLM works well in the mini-batch learning mode, and we achieve a 3.32× performance speedup when deploying the proposed algorithm in a distributed learning environment with a quad-core processor. Elsevier B.V. 2016 Article PeerReviewed Liew, S. S. and Khalil-Hani, M. and Bakhteri, R. (2016) An optimized second order stochastic learning algorithm for neural network training. Neurocomputing, 186 . pp. 74-89. ISSN 0925-2312 https://www.scopus.com/inward/record.uri?eid=2-s2.0-84954287399&doi=10.1016%2fj.neucom.2015.12.076&partnerID=40&md5=ff2533ba41bd7889b43e8d2f164b4f27 |
spellingShingle | TK Electrical engineering. Electronics Nuclear engineering Liew, S. S. Khalil-Hani, M. Bakhteri, R. An optimized second order stochastic learning algorithm for neural network training |
title | An optimized second order stochastic learning algorithm for neural network training |
title_full | An optimized second order stochastic learning algorithm for neural network training |
title_fullStr | An optimized second order stochastic learning algorithm for neural network training |
title_full_unstemmed | An optimized second order stochastic learning algorithm for neural network training |
title_short | An optimized second order stochastic learning algorithm for neural network training |
title_sort | optimized second order stochastic learning algorithm for neural network training |
topic | TK Electrical engineering. Electronics Nuclear engineering |
work_keys_str_mv | AT liewss anoptimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining AT khalilhanim anoptimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining AT bakhterir anoptimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining AT liewss optimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining AT khalilhanim optimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining AT bakhterir optimizedsecondorderstochasticlearningalgorithmforneuralnetworktraining |