Classical generalization bounds are surprisingly tight for Deep Networks

Bibliographic Details
Main Authors: Liao, Qianli; Miranda, Brando; Hidary, Jack; Poggio, Tomaso
Format: Technical Report
Language: en_US
Published: Center for Brains, Minds and Machines (CBMM), 2018
Online Access: http://hdl.handle.net/1721.1/116911
Description
Summary: Deep networks are usually trained and tested in a regime in which the training classification error is not a good predictor of the test error. Thus the consensus has been that generalization, defined as convergence of the empirical to the expected error, does not hold for deep networks. Here we show that, when normalized appropriately after training, deep networks trained on exponential-type losses show a good linear dependence of test loss on training loss. The observation, motivated by a previous theoretical analysis of overparameterization and overfitting, not only demonstrates the validity of classical generalization bounds for deep learning but also suggests that they are tight. In addition, we show that the bound on the classification error given by the normalized cross-entropy loss is empirically rather tight on the data sets we studied.
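The summary refers to comparing the training and test loss of a network after an appropriate post-training normalization. As a minimal sketch (not the authors' code), one plausible reading for bias-free ReLU networks, which are positively homogeneous in their weights, is to rescale each layer to unit Frobenius norm and then evaluate the cross-entropy of the rescaled network on both sets; the PyTorch code below assumes that interpretation, and `train_loader`/`test_loader` are hypothetical data loaders.

```python
# Sketch: normalized cross-entropy of a bias-free ReLU network (assumed
# layer-wise Frobenius normalization; not the report's reference code).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_relu_net(model: nn.Sequential) -> nn.Sequential:
    """Return a copy whose linear layers are rescaled to unit Frobenius norm.

    For a bias-free ReLU network this only rescales the outputs by the product
    of the original layer norms (positive homogeneity), so the classifier's
    decisions are unchanged while the loss becomes scale-independent.
    """
    normalized = copy.deepcopy(model)
    with torch.no_grad():
        for layer in normalized:
            if isinstance(layer, nn.Linear):
                layer.weight.div_(layer.weight.norm())  # Frobenius norm
    return normalized

def normalized_cross_entropy(model, loader, device="cpu"):
    """Average cross-entropy of the normalized network over a data loader."""
    net = normalize_relu_net(model).to(device).eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = net(x.flatten(1))
            total += F.cross_entropy(logits, y, reduction="sum").item()
            count += y.numel()
    return total / count

# Usage (hypothetical loaders): compare normalized train vs. test loss
# for a small bias-free MLP trained on an exponential-type loss.
model = nn.Sequential(nn.Linear(784, 512, bias=False), nn.ReLU(),
                      nn.Linear(512, 10, bias=False))
# train_loss = normalized_cross_entropy(model, train_loader)
# test_loss  = normalized_cross_entropy(model, test_loader)
```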