Implicit dynamic regularization in deep networks
Square loss has been observed to perform well in classification tasks, at least as well as cross-entropy. However, a theoretical justification is lacking. Here we develop a theoretical analysis for the square loss that also complements the existing asymptotic analysis for the exponential loss.
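To make the comparison in the abstract concrete, the sketch below evaluates both losses on a single multi-class example. The 3-class logits and one-hot target are illustrative values chosen here, not from the report; square loss compares raw outputs to the one-hot encoding, while cross-entropy first passes outputs through a softmax.

```python
import numpy as np

# Illustrative network output and one-hot label (hypothetical values).
logits = np.array([2.0, 0.5, -1.0])
target = np.array([1.0, 0.0, 0.0])  # one-hot label for class 0

# Square loss: mean squared difference between raw outputs and one-hot target.
square_loss = np.mean((logits - target) ** 2)

# Cross-entropy: map outputs to probabilities via softmax, then take
# the negative log-probability of the true class.
probs = np.exp(logits) / np.sum(np.exp(logits))
cross_entropy = -np.sum(target * np.log(probs))

print(square_loss, cross_entropy)  # → 0.75 and roughly 0.24
```

Both losses decrease as the output for the true class comes to dominate, which is why the square loss can serve as a drop-in classification objective.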
Main Authors: ,
Format: Technical Report
Published: Center for Brains, Minds and Machines (CBMM), 2020
Online Access: https://hdl.handle.net/1721.1/126653