Is SGD a Bayesian sampler? Well, almost
Deep neural networks (DNNs) generalise remarkably well in the overparameterised regime, suggesting a strong inductive bias towards functions with low generalisation error. We empirically investigate this bias by calculating, for a range of architectures and datasets, the probability $P_{\mathrm{SGD}}(f \mid S)$ that a...
Main Authors: | , , , |
---|---|
Format: | Journal article |
Language: | English |
Published: | Journal of Machine Learning Research, 2021 |