Classical generalization bounds are surprisingly tight for Deep Networks
Deep networks are usually trained and tested in a regime in which the training classification error is not a good predictor of the test error. Thus the consensus has been that generalization, defined as convergence of the empirical to the expected error, does not hold for deep networks. Here we show that, when normalized appropriately after training, deep networks trained on exponential-type losses show a good linear dependence of test loss on training loss. This observation, motivated by a previous theoretical analysis of overparameterization and overfitting, not only demonstrates the validity of classical generalization bounds for deep learning but also suggests that they are tight. In addition, we show that the bound on the classification error given by the normalized cross-entropy loss is empirically rather tight on the data sets we studied.
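As an illustration of the kind of measurement the abstract describes, the sketch below rescales a trained network's weights layer by layer (dividing each weight matrix by its Frobenius norm is one plausible choice; the memo's exact normalization may differ) and then compares the cross-entropy loss of the normalized network on training versus test data. The function names, the normalization scheme, and the data loaders are illustrative assumptions, not taken from the report.

```python
# Minimal sketch, assuming a trained (bias-free, ReLU) classifier and PyTorch-style
# DataLoaders; the per-layer Frobenius normalization below is an illustrative choice,
# not necessarily the normalization used in the memo.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_network(model: nn.Module) -> nn.Module:
    """Return a copy of `model` with each linear/conv weight divided by its Frobenius norm."""
    normed = copy.deepcopy(model)
    with torch.no_grad():
        for module in normed.modules():
            if isinstance(module, (nn.Linear, nn.Conv2d)):
                module.weight.div_(module.weight.norm() + 1e-12)
    return normed

def mean_cross_entropy(model: nn.Module, loader) -> float:
    """Average cross-entropy loss of `model` over a loader of (inputs, labels) batches."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for x, y in loader:
            total += F.cross_entropy(model(x), y, reduction="sum").item()
            count += y.numel()
    return total / count

# Hypothetical usage (trained_model, train_loader, test_loader are assumed to exist):
# normed = normalize_network(trained_model)
# train_loss = mean_cross_entropy(normed, train_loader)
# test_loss = mean_cross_entropy(normed, test_loader)  # compare against train_loss
```

Plotting `test_loss` against `train_loss` for many networks trained this way is the kind of comparison under which the abstract reports an approximately linear dependence.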
Main Authors: | Liao, Qianli; Miranda, Brando; Hidary, Jack; Poggio, Tomaso |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), 2018 |
Online Access: | http://hdl.handle.net/1721.1/116911 |
---|---|
author | Liao, Qianli; Miranda, Brando; Hidary, Jack; Poggio, Tomaso |
collection | MIT |
description | Deep networks are usually trained and tested in a regime in which the training classification error is not a good predictor of the test error. Thus the consensus has been that generalization, defined as convergence of the empirical to the expected error, does not hold for deep networks. Here we show that, when normalized appropriately after training, deep networks trained on exponential-type losses show a good linear dependence of test loss on training loss. This observation, motivated by a previous theoretical analysis of overparameterization and overfitting, not only demonstrates the validity of classical generalization bounds for deep learning but also suggests that they are tight. In addition, we show that the bound on the classification error given by the normalized cross-entropy loss is empirically rather tight on the data sets we studied. |
format | Technical Report |
id | mit-1721.1/116911 |
institution | Massachusetts Institute of Technology |
language | en_US |
publishDate | 2018 |
publisher | Center for Brains, Minds and Machines (CBMM) |
series | CBMM Memo Series;091 |
note | This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. |
title | Classical generalization bounds are surprisingly tight for Deep Networks |
url | http://hdl.handle.net/1721.1/116911 |