Musings on Deep Learning: Properties of SGD

[previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we characterize, with a mix of theory and experiments, the generalization properties of Stochastic Gradient Descent in overparametrized deep convolutional networks. We show that Stochastic Gradi...

Bibliographic Details
Main Authors:	Zhang, Chiyuan; Liao, Qianli; Rakhlin, Alexander; Sridharan, Karthik; Miranda, Brando; Golowich, Noah; Poggio, Tomaso
Format:	Technical Report
Language:	en_US
Published:	Center for Brains, Minds and Machines (CBMM) 2017
Online Access:	http://hdl.handle.net/1721.1/107841

Similar Items

Theory of Deep Learning IIb: Optimization Properties of SGD
by: Zhang, Chiyuan, et al.
Published: (2018)

Classical generalization bounds are surprisingly tight for Deep Networks
by: Liao, Qianli, et al.
Published: (2018)

SGD Noise and Implicit Low-Rank Bias in Deep Neural Networks
by: Galanti, Tomer, et al.
Published: (2022)

Theory IIIb: Generalization in Deep Networks
by: Poggio, Tomaso, et al.
Published: (2018)

Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality?
by: Poggio, Tomaso, et al.
Published: (2016)

Object-Oriented Deep Learning
by: Liao, Qianli, et al.
Published: (2017)

Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review
by: Mhaskar, Hrushikesh, et al.
Published: (2017)

Theory II: Landscape of the Empirical Risk in Deep Learning
by: Poggio, Tomaso, et al.
Published: (2017)

SGD and Weight Decay Provably Induce a Low-Rank Bias in Deep Neural Networks
by: Galanti, Tomer, et al.
Published: (2023)

Theory of Deep Learning III: explaining the non-overfitting puzzle
by: Poggio, Tomaso, et al.
Published: (2018)

Loss landscape: SGD can have a better view than GD
by: Poggio, Tomaso, et al.
Published: (2020)

Implicit dynamic regularization in deep networks
by: Poggio, Tomaso, et al.
Published: (2020)

Learning Real and Boolean Functions: When Is Deep Better Than Shallow
by: Mhaskar, Hrushikesh, et al.
Published: (2016)

The Janus effects of SGD vs GD: high noise and low rank
by: Xu, Mengjia, et al.
Published: (2023)

Representations That Learn vs. Learning Representations
by: Liao, Qianli, et al.
Published: (2018)

Theoretical issues in deep networks
by: Poggio, Tomaso, et al.
Published: (2021)

Theoretical Issues in Deep Networks
by: Poggio, Tomaso, et al.
Published: (2019)

Complexity control by gradient descent in deep networks
by: Poggio, Tomaso, et al.
Published: (2021)

Human-like Learning: A Research Proposal
by: Liao, Qianli, et al.
Published: (2017)

Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex
by: Liao, Qianli, et al.
Published: (2016)

3D Object-Oriented Learning: An End-to-end Transformation-Disentangled 3D Representation
by: Liao, Qianli, et al.
Published: (2018)

Complexity control by gradient descent in deep networks
by: Poggio, Tomaso A, et al.
Published: (2022)

Hierarchically Local Tasks and Deep Convolutional Networks
by: Deza, Arturo, et al.
Published: (2020)

Size-independent sample complexity of neural networks
by: Golowich, Noah, et al.
Published: (2021)

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
by: Liao, Qianli, et al.
Published: (2016)

Automatic billing counterfeit detection for SGD money
by: Arun Ramchandani.
Published: (2010)

A Deep Representation for Invariance And Music Classification
by: Zhang, Chiyuan, et al.
Published: (2015)

When Is Handcrafting Not a Curse?
by: Liao, Qianli, et al.
Published: (2018)

Exact Equivariance, Disentanglement and Invariance of Transformations
by: Liao, Qianli, et al.
Published: (2018)

Biologically-plausible learning algorithms can scale to large datasets
by: Xiao, Will, et al.
Published: (2019)

Biologically-Plausible Learning Algorithms Can Scale to Large Datasets
by: Xiao, Will, et al.
Published: (2018)

Unsupervised learning of clutter-resistant visual representations from natural videos
by: Liao, Qianli, et al.
Published: (2015)

Random shuffling beats SGD after finite epochs
by: HaoChen, Jeff, et al.
Published: (2021)

Learning invariant representations and applications to face verification
by: Liao, Qianli, et al.
Published: (2014)

Towards more biologically plausible deep learning and visual processing
by: Liao, Qianli
Published: (2017)

A deep representation for invariance and music classification
by: Zhang, Chiyuan, et al.
Published: (2016)

The friendly muse : ADM student wellbeing
by: Low, Eudora Yu Lin
Published: (2017)

SGML functions for AthenaMuse 2
by: Gentry, James C. (James Carl)
Published: (2007)

Learning An Invariant Speech Representation
by: Evangelopoulos, Georgios, et al.
Published: (2015)

Spatial IQ Test for AI
by: Hilton, Erwin, et al.
Published: (2018)