Convergence rates for the stochastic gradient descent method for non-convex objective functions
We prove the convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of not necessarily locally convex nor contracting objective functions. In particular, the analysis relies on a quantitative use of mini-batches to control the loss of iterates to non-attracted regions. The applicability of the results to simple objective functions arising in machine learning is shown.
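The mechanism highlighted in the abstract is variance reduction through mini-batching: averaging a batch of noisy gradient estimates shrinks the noise that could otherwise push iterates out of a minimum's basin of attraction and into non-attracted regions. Below is a minimal illustrative sketch of this effect (not the authors' code; the objective, the additive noise model, and all parameter values are assumptions made for illustration):

```python
import numpy as np

# Simple non-convex objective f(x) = (x^2 - 1)^2 with minima at x = +/-1
# and a local maximum at x = 0 (a non-attracted point).
def grad_f(x):
    return 4.0 * x * (x ** 2 - 1.0)

def minibatch_sgd(x0, steps=2000, batch_size=32, lr=0.05, noise_std=1.0, seed=0):
    """Mini-batch SGD where each gradient sample is the true gradient plus
    i.i.d. Gaussian noise. Averaging a batch of size B divides the noise
    variance by B -- the quantitative mini-batch effect that keeps iterates
    concentrated near an attracting minimum. (Hypothetical setup, not the
    paper's construction.)"""
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(steps):
        noise = rng.normal(0.0, noise_std, size=batch_size)
        g = grad_f(x) + noise.mean()  # averaged stochastic gradient
        x -= lr * g
    return x

# Larger batches concentrate the final iterate more tightly near a
# true minimum (x = +/-1).
for b in (1, 8, 64):
    print(f"batch_size={b:3d}  final x = {minibatch_sgd(0.5, batch_size=b):+.4f}")
```

In this toy model the fluctuation of the iterate around the minimum scales like the per-step noise, which shrinks as 1/sqrt(batch_size); the paper makes the analogous trade-off quantitative for general non-convex objectives.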
Main Authors: | Fehrman, B; Gess, B; Jentzen, A |
---|---|
Format: | Journal article |
Language: | English |
Published: | Journal of Machine Learning Research, 2020 |
Institution: | University of Oxford |
Collection: | OXFORD |
Record ID: | oxford-uuid:a2417726-ba38-4a09-8181-2e31e97587fe |