Benign overfitting and noisy features
Modern machine learning often operates in the regime where the number of parameters is much higher than the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This \textit{benign overfitting} phenomenon has recently been characterized using so-called \textit{double descent} curves, where the risk undergoes another descent (in addition to the classical U-shaped learning curve when the number of parameters is small) as the number of parameters increases beyond a certain threshold. In this paper, we examine the conditions under which \textit{benign overfitting} occurs in random feature (RF) models, i.e. in a two-layer neural network with fixed first-layer weights. We adopt a new view of random features and show that \textit{benign overfitting} arises due to the noise residing in such features (noise which may already be present in the data and propagate to the features, or which may be added by the user to the features directly), which plays an important implicit regularization role in the phenomenon.
Main Authors: | Li, Z; Su, W; Sejdinovic, D |
---|---|
Format: | Journal article |
Language: | English |
Published: | Taylor and Francis, 2022 |
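The setting the abstract describes can be made concrete with a small sketch. The following is a minimal illustration, not code from the paper: ridgeless (minimum-norm) regression on random ReLU features, with a hypothetical `feature_noise` knob standing in for noise "added by the user to the features directly". Sweeping the number of features p past the sample size n traces a double-descent-shaped risk curve, and the added feature noise behaves like an implicit ridge penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: n training points in d dimensions, noisy linear target.
n, d, n_test = 100, 20, 1000
X = rng.standard_normal((n, d))
X_test = rng.standard_normal((n_test, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)
y_test = X_test @ w_star


def rf_risk(p, feature_noise=0.0):
    """Test risk of min-norm least squares on p random ReLU features.

    `feature_noise` (a hypothetical knob, not the paper's notation) is the
    standard deviation of Gaussian noise added to the training features only.
    """
    W = rng.standard_normal((d, p)) / np.sqrt(d)   # fixed first-layer weights
    Phi = np.maximum(X @ W, 0.0)                   # random ReLU features
    Phi = Phi + feature_noise * rng.standard_normal(Phi.shape)
    Phi_test = np.maximum(X_test @ W, 0.0)         # clean features at test time
    theta = np.linalg.pinv(Phi) @ y                # minimum-norm interpolant
    return float(np.mean((Phi_test @ theta - y_test) ** 2))


# Sweeping p past n = 100 traces a double-descent-shaped risk curve;
# feature noise regularizes implicitly, taming the peak near p = n.
for p in (10, 50, 100, 200, 1000):
    print(f"p={p:4d}  clean={rf_risk(p):.3f}  noisy={rf_risk(p, 0.5):.3f}")
```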
collection | OXFORD |
---|---|
id | oxford-uuid:cdf8ecbc-d8a1-433b-9edd-d37043f00abd |
institution | University of Oxford |