Generalisation and expressiveness for over-parameterised neural networks

Over-parameterised modern neural networks owe their success to two fundamental properties: expressive power and generalisation capability. The former refers to the model's ability to fit a wide variety of data sets, while the latter enables the network to extrapolate patterns from training examples and apply them to previously unseen data. This thesis addresses several challenges related to these two key properties.

The fact that over-parameterised networks can fit any data set is not always indicative of their practical expressiveness. The first part of this thesis examines how input information can be lost as it propagates through a deep architecture, and proposes, as an easily implementable remedy, the introduction of suitable scaling factors and residual connections (a toy sketch of the idea follows below).
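To make that remedy concrete, here is a minimal NumPy sketch; it is our illustration, not the thesis's actual construction, and the depth, width, scaling factor alpha, and weight gain are all assumed for the demo. It propagates a pair of nearby inputs through a deep tanh network and compares their cosine similarity at the output: in the plain network the inputs are driven apart, so their proximity is no longer recoverable, while a scaled residual stream keeps the input signal alive.

```python
import numpy as np

def forward(x, depth=50, width=256, residual=False, alpha=0.1, seed=0):
    """Propagate x through a deep tanh MLP (toy illustration only).

    With residual=True each layer adds a scaled update to a residual
    stream, h <- h + alpha * tanh(W h), instead of overwriting the
    signal. Depth, width, alpha and the gain are illustrative choices.
    """
    rng = np.random.default_rng(seed)  # same weights on every call
    h = x
    for _ in range(depth):
        # Gain sqrt(2) puts the plain network in the regime where
        # nearby inputs decorrelate as they propagate.
        W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))
        if residual:
            h = h + alpha * np.tanh(W @ h)  # scaled residual update
        else:
            h = np.tanh(W @ h)              # plain layer: overwrite h
    return h

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

width = 256
rng = np.random.default_rng(1)
x1 = rng.normal(size=width)
x2 = x1 + 0.2 * rng.normal(size=width)  # a nearby input, cosine ~ 0.98

print("inputs:      ", cosine(x1, x2))
# Plain deep net: the two nearby inputs are driven apart, so their
# proximity (input information) is washed out at the output.
print("plain net:   ", cosine(forward(x1), forward(x2)))
# Scaled residual net: the residual stream keeps the inputs close.
print("residual net:", cosine(forward(x1, residual=True),
                              forward(x2, residual=True)))
```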

The second part of this thesis focuses on generalisation. Why modern neural networks generalise well to new data without overfitting, despite being over-parameterised, is an open question currently receiving considerable attention in the research community. We explore this subject from information-theoretic and PAC-Bayesian viewpoints, proposing novel learning algorithms and generalisation bounds.
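For orientation, both viewpoints come with classical generalisation bounds from the prior literature. The following are the standard McAllester/Maurer PAC-Bayesian bound and the Xu-Raginsky information-theoretic bound, quoted here as background rather than as the thesis's novel results.

```latex
% Background bounds (classical results, not the thesis's contributions).
% S is an i.i.d. sample of size n, L the population risk, \hat{L}_S the
% empirical risk, both taking values in [0,1].

% PAC-Bayesian (McAllester/Maurer): for a prior P over hypotheses fixed
% before seeing S, with probability at least 1 - \delta over the draw
% of S, simultaneously for every posterior Q,
\mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \;\le\;
  \mathbb{E}_{h \sim Q}\!\left[\hat{L}_S(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{n}/\delta\right)}{2n}}

% Information-theoretic (Xu--Raginsky): if the loss is
% \sigma-sub-Gaussian, the expected generalisation gap of a (possibly
% randomised) learner W = A(S) satisfies
\left| \, \mathbb{E}\!\left[L(W) - \hat{L}_S(W)\right] \right|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}\, I(W; S)}{n}}
```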

Bibliographic Details
Main Author: Clerico, E
Other Authors: Deligiannidis, G
Format: Thesis
Language: English
Published: 2023
Subjects: Machine learning; Statistical learning theory
Contributors: Doucet, A; Hayou, S; He, B; Shidani, A; Guedj, B; Farghly, T; Rousseau, J
Institution: University of Oxford
Record ID: oxford-uuid:89c0873b-0621-4960-bba4-25a69a585e65