Towards More Generalizable Neural Networks via Modularity


Bibliographic Details
Main Author: Boopathy, Akhilan
Other Authors: Fiete, Ila
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/144929
collection MIT
description Artificial neural networks have become highly effective at performing specific, challenging tasks by leveraging a large amount of training data. However, they are unable to generalize to diverse, unseen domains without requiring significant retraining. This thesis quantifies the generalization difficulty of a task as the amount of information content in the inductive biases required to solve a task, and demonstrates that generalization difficulty relies crucially on the number of dimensions of generalization. Inspired by the modularity of biological learning systems, this thesis then demonstrates theoretically and empirically that modularity promotes generalization by providing a powerful inductive bias. Finally, the thesis proposes a new challenging spatial navigation benchmark that requires a broad degree of generalization from a small amount of training data. This benchmark is presented as a test of the generalization capability of learning algorithms; based on the results of this thesis, modularity is expected to promote generalization on this benchmark.
first_indexed 2024-09-23T10:21:16Z
format Thesis
id mit-1721.1/144929
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T10:21:16Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/144929 2022-08-30T03:49:23Z Towards More Generalizable Neural Networks via Modularity Boopathy, Akhilan Fiete, Ila Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science S.M. 2022-08-29T16:21:38Z 2022-08-29T16:21:38Z 2022-05 2022-06-21T19:25:43.757Z Thesis https://hdl.handle.net/1721.1/144929 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology