Implicit dynamic regularization in deep networks
The square loss has been observed to perform well in classification tasks, at least as well as cross-entropy. However, a theoretical justification has been lacking. Here we develop a theoretical analysis for the square loss that also complements the existing asymptotic analysis for the exponential loss.
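The abstract's empirical claim — that minimizing the square loss on classification targets can match cross-entropy — can be illustrated with a minimal sketch. This toy example is an assumption of mine, not the report's experimental setup: it trains the same linear classifier on two separable Gaussian blobs, once with the square loss applied directly to the raw outputs against one-hot targets (as in the square-loss setting) and once with softmax cross-entropy, then compares training accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated 2-D Gaussian blobs, labels 0/1, one-hot targets.
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.repeat([0, 1], 50)
Y = np.eye(2)[y]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(loss, lr=0.1, steps=500):
    """Gradient descent on a bias-free linear model with the chosen loss."""
    W = np.zeros((2, 2))
    for _ in range(steps):
        Z = X @ W
        if loss == "square":
            # 0.5 * ||Z - Y||^2 on raw outputs: gradient w.r.t. Z is Z - Y.
            dZ = Z - Y
        else:
            # Softmax cross-entropy: gradient w.r.t. Z is softmax(Z) - Y.
            dZ = softmax(Z) - Y
        W -= lr * X.T @ dZ / len(X)
    return W

for loss in ("square", "cross-entropy"):
    W = train(loss)
    acc = ((X @ W).argmax(axis=1) == y).mean()
    print(f"{loss}: train accuracy {acc:.2f}")
```

On data this separable both losses reach essentially the same training accuracy; the report's contribution is a theoretical account of why the square loss behaves well, which this sketch does not attempt to reproduce.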
Main Authors: Poggio, Tomaso; Liao, Qianli
Format: Technical Report
Published: Center for Brains, Minds and Machines (CBMM), 2020
Online Access: https://hdl.handle.net/1721.1/126653
Similar Items
- Theoretical Issues in Deep Networks, by: Poggio, Tomaso, et al. Published: (2019)
- Theoretical issues in deep networks, by: Poggio, Tomaso, et al. Published: (2021)
- SGD Noise and Implicit Low-Rank Bias in Deep Neural Networks, by: Galanti, Tomer, et al. Published: (2022)
- Complexity control by gradient descent in deep networks, by: Poggio, Tomaso, et al. Published: (2021)
- Complexity control by gradient descent in deep networks, by: Tomaso Poggio, et al. Published: (2020-02-01)