Learning rate selection in stochastic gradient methods based on line search strategies

Finite-sum problems appear as the sample average approximation of a stochastic optimization problem and often arise in machine learning applications with large-scale data sets. A very popular approach to tackling finite-sum problems is the stochastic gradient method. It is well known that a proper strategy to select the hyperparameters of this method (i.e. the set of a priori selected parameters) and, in particular, the learning rate, is needed to guarantee convergence properties and good practical performance. In this paper, we analyse standard and line search based updating rules to set the learning rate sequence, also in relation to the size of the mini batch chosen to compute the current stochastic gradient. An extensive numerical experimentation is carried out to evaluate the effectiveness of the discussed strategies on convex and non-convex finite-sum test problems, highlighting that the line search based methods avoid an expensive initial setting of the hyperparameters. The line search based approaches have also been applied to train a convolutional neural network, providing very promising results.
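As a rough illustration of the kind of rule discussed in the abstract, the sketch below performs one mini-batch stochastic gradient step whose learning rate is chosen by a backtracking (Armijo-type) line search on the sampled loss. The logistic-regression model, the function names (minibatch_loss_grad, sgd_armijo_step) and the constants (lr0, c, shrink) are illustrative assumptions, not the specific algorithms analysed in the paper.

import numpy as np

def minibatch_loss_grad(w, X, y, idx):
    # Logistic-regression loss and gradient on a sampled mini batch;
    # the model is only an illustrative stand-in for the terms of the finite sum.
    Xb, yb = X[idx], y[idx]
    z = Xb @ w
    p = 1.0 / (1.0 + np.exp(-yb * z))              # sigmoid(y * x.w)
    loss = np.mean(np.log1p(np.exp(-yb * z)))
    grad = Xb.T @ (-(1.0 - p) * yb) / len(idx)
    return loss, grad

def sgd_armijo_step(w, X, y, batch_size, lr0=1.0, c=1e-4, shrink=0.5,
                    max_backtracks=20, rng=None):
    # One stochastic gradient iteration: sample a mini batch, then backtrack on
    # the trial learning rate until an Armijo-type sufficient-decrease condition
    # holds on the same mini-batch loss.
    rng = rng if rng is not None else np.random.default_rng()
    idx = rng.choice(len(y), size=batch_size, replace=False)
    f0, g = minibatch_loss_grad(w, X, y, idx)
    lr = lr0
    for _ in range(max_backtracks):
        f_trial, _ = minibatch_loss_grad(w - lr * g, X, y, idx)
        if f_trial <= f0 - c * lr * np.dot(g, g):   # sufficient decrease achieved
            break
        lr *= shrink                                # otherwise shrink the trial step
    return w - lr * g, lr

In a training loop this step would be repeated with a fresh mini batch at every iteration; variance reduced variants and mini-batch size selection rules, also covered by the article's keywords, would typically change how the search direction g is formed rather than the line search itself.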

Bibliographic Details
Main Authors: Giorgia Franchini, Federica Porta, Valeria Ruggiero, Ilaria Trombini, Luca Zanni
Format: Article
Language: English
Published: Taylor & Francis Group, 2023-12-01
Series: Applied Mathematics in Science and Engineering
ISSN: 2769-0911
Affiliations: University of Modena and Reggio Emilia (Franchini, Porta, Zanni); University of Ferrara (Ruggiero, Trombini)
Subjects: stochastic gradient methods; variance reduced methods; learning rate selection; mini batch size selection; convolutional neural networks
Online Access: http://dx.doi.org/10.1080/27690911.2022.2164000