Carathéodory sampling for stochastic gradient descent
Many problems require optimizing empirical risk functions over large datasets. Gradient descent methods that calculate the full gradient in every descent step do not scale to such datasets. Various flavours of Stochastic Gradient Descent (SGD) replace the expensive summation that computes the full...
Main Authors: | , , |
---|---|
Format: | Internet publication |
Language: | English |
Published: | 2020 |