Convergence rates for the stochastic gradient descent method for non-convex objective functions

We prove the convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of not necessarily locally convex nor contracting objective functions. In particular, the analysis relies on a quantitative use of mini-batches to control the loss of iterates to non-attracted regions. The applicability of the results to simple objective functions arising in machine learning is shown.
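The abstract centres on mini-batch stochastic gradient descent for non-convex objectives, where larger mini-batches reduce gradient noise and help keep the iterates near an attracting minimum. The sketch below is a minimal illustration of that mechanism, not the paper's construction: the objective f(x) = E[(x^2 - a)^2] with a ~ N(1, 0.1^2), the step size, and the batch sizes are all assumptions chosen only for the example.

```python
import numpy as np

# Minimal sketch (assumed example, not the paper's setting): mini-batch SGD
# on the non-convex objective f(x) = E[(x^2 - a)^2], a ~ N(1, 0.1^2).
# f has minima near x = +1 and x = -1 and a local maximum at x = 0, so an
# iterate can be pushed toward a non-attracted region by gradient noise;
# larger mini-batches shrink that noise.

rng = np.random.default_rng(0)

def minibatch_grad(x, batch_size):
    """Unbiased mini-batch estimate of f'(x) = E[4 * x * (x^2 - a)]."""
    a = rng.normal(loc=1.0, scale=0.1, size=batch_size)
    return np.mean(4.0 * x * (x**2 - a))

def sgd(x0, step=0.05, batch_size=32, n_steps=2000):
    x = x0
    for _ in range(n_steps):
        x -= step * minibatch_grad(x, batch_size)
    return x

# Starting near the unstable critical point x = 0, larger batch sizes make
# convergence to a true minimum (x close to +/- 1) more reliable for a
# fixed step size.
for bs in (1, 8, 64):
    print(f"batch_size={bs:3d}  final x = {sgd(x0=0.1, batch_size=bs):+.3f}")
```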


Bibliographic Details
Main Authors: Fehrman, B., Gess, B., Jentzen, A.
Format: Journal article
Language: English
Published: Journal of Machine Learning Research, 2020
Institution: University of Oxford