Generalisation and expressiveness for over-parameterised neural networks

Over-parameterised modern neural networks owe their success to two fundamental properties: expressive power and generalisation capability. The former refers to the model's ability to fit a wide variety of data sets, while the latter enables the network to extrapolate patterns from training examples and apply them to previously unseen data. This thesis addresses several challenges related to these two key properties.

The fact that over-parameterised networks can fit any data set is not always indicative of their practical expressiveness. The first part of this thesis examines how input information can be lost as it propagates through a deep architecture, and proposes, as an easily implementable remedy, the introduction of suitable scaling factors and residual connections (a toy sketch of the idea follows below).
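To make that remedy concrete, here is a minimal NumPy sketch; it is our illustration, not the thesis's actual construction, and the depth, width, scaling factor alpha, and weight gain are all assumed for the demo. It propagates a pair of nearby inputs through a deep tanh network and compares their cosine similarity at the output: in the plain network the inputs are driven apart, so their proximity is no longer recoverable, while a scaled residual stream keeps the input signal alive.

```python
import numpy as np

def forward(x, depth=50, width=256, residual=False, alpha=0.1, seed=0):
    """Propagate x through a deep tanh MLP (toy illustration only).

    With residual=True each layer adds a scaled update to a residual
    stream, h <- h + alpha * tanh(W h), instead of overwriting the
    signal. Depth, width, alpha and the gain are illustrative choices.
    """
    rng = np.random.default_rng(seed)  # same weights on every call
    h = x
    for _ in range(depth):
        # Gain sqrt(2) puts the plain network in the regime where
        # nearby inputs decorrelate as they propagate.
        W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))
        if residual:
            h = h + alpha * np.tanh(W @ h)  # scaled residual update
        else:
            h = np.tanh(W @ h)              # plain layer: overwrite h
    return h

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

width = 256
rng = np.random.default_rng(1)
x1 = rng.normal(size=width)
x2 = x1 + 0.2 * rng.normal(size=width)  # a nearby input, cosine ~ 0.98

print("inputs:      ", cosine(x1, x2))
# Plain deep net: the two nearby inputs are driven apart, so their
# proximity (input information) is washed out at the output.
print("plain net:   ", cosine(forward(x1), forward(x2)))
# Scaled residual net: the residual stream keeps the inputs close.
print("residual net:", cosine(forward(x1, residual=True),
                              forward(x2, residual=True)))
```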

The second part of this thesis focuses on generalisation. Why modern neural networks generalise well to new data without overfitting, despite being over-parameterised, is an open question currently receiving considerable attention in the research community. We explore this subject from information-theoretic and PAC-Bayesian viewpoints, proposing novel learning algorithms and generalisation bounds.
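For orientation, both viewpoints come with classical generalisation bounds from the prior literature. The following are the standard McAllester/Maurer PAC-Bayesian bound and the Xu-Raginsky information-theoretic bound, quoted here as background rather than as the thesis's novel results.

```latex
% Background bounds (classical results, not the thesis's contributions).
% S is an i.i.d. sample of size n, L the population risk, \hat{L}_S the
% empirical risk, both taking values in [0,1].

% PAC-Bayesian (McAllester/Maurer): for a prior P over hypotheses fixed
% before seeing S, with probability at least 1 - \delta over the draw
% of S, simultaneously for every posterior Q,
\mathbb{E}_{h \sim Q}\!\left[L(h)\right]
  \;\le\;
  \mathbb{E}_{h \sim Q}\!\left[\hat{L}_S(h)\right]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{n}/\delta\right)}{2n}}

% Information-theoretic (Xu--Raginsky): if the loss is
% \sigma-sub-Gaussian, the expected generalisation gap of a (possibly
% randomised) learner W = A(S) satisfies
\left| \, \mathbb{E}\!\left[L(W) - \hat{L}_S(W)\right] \right|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}\, I(W; S)}{n}}
```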

Bibliographic Details
Main Author: Clerico, E
Other Authors: Deligiannidis, G
Format: Thesis
Language: English
Published: 2023
Subjects: Machine learning; Statistical learning theory
Contributors: Doucet, A; Hayou, S; He, B; Shidani, A; Guedj, B; Farghly, T; Rousseau, J
Institution: University of Oxford
Record ID: oxford-uuid:89c0873b-0621-4960-bba4-25a69a585e65