Modeling Intelligence via Graph Neural Networks

Bibliographic Details
Main Author: Xu, Keyulu
Other Authors: Jegelka, Stefanie
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139331
Description

Artificial intelligence can be more powerful than human intelligence. Many problems that are challenging from a human perspective involve finding statistical patterns in complex, structured objects, such as drug molecules or the global financial system. Advances in deep learning have shown that the key to solving such tasks is learning a good representation. Given representations of the world, the second aspect of intelligence is reasoning. Learning to reason means learning to implement a correct reasoning process, both within and outside the training distribution.

In this thesis, we address the fundamental problem of modeling intelligence that can learn to represent and reason about the world. We study both questions through the lens of graph neural networks, a class of neural networks that operate on graphs. First, many objects in the world can be abstracted as graphs, and their representations learned with graph neural networks. Second, graph neural networks can exploit the algorithmic structure of reasoning processes to improve generalization.

This thesis consists of four parts, each studying one aspect of the theoretical landscape of learning: representation power, generalization, extrapolation, and optimization. In Part I, we characterize the expressive power of graph neural networks for representing graphs and build maximally powerful graph neural networks. In Part II, we analyze generalization and show implications for what reasoning a neural network can sample-efficiently learn; our analysis takes into account the training algorithm, the network structure, and the task structure. In Part III, we study how neural networks extrapolate and under what conditions they learn the correct reasoning outside the training distribution. In Part IV, we prove global convergence rates and develop normalization methods that accelerate the training of graph neural networks.

Our techniques and insights go beyond graph neural networks, and extend broadly to deep learning models.
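The neighborhood-aggregation scheme that underlies graph neural networks can be sketched in a few lines. This is an illustrative toy example, not code from the thesis; the function name `message_passing_step` and the doubling `update` are hypothetical. Each node combines its own feature vector with the sum of its neighbors' features and applies an update function; stacking such rounds yields the representations the abstract refers to.

```python
def message_passing_step(adj, features, update):
    """One round of sum aggregation: each node sums its own and its
    neighbors' feature vectors, then applies an update function."""
    n = len(features)
    new_features = []
    for v in range(n):
        agg = list(features[v])  # start from the node's own features
        for u in range(n):
            if adj[v][u]:  # u is a neighbor of v
                agg = [a + b for a, b in zip(agg, features[u])]
        new_features.append(update(agg))
    return new_features

# Toy graph: a path 0 - 1 - 2 with scalar node features.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0], [2.0], [3.0]]
out = message_passing_step(adj, feats, update=lambda x: [2 * x[0]])
# Node 1 aggregates 2 + 1 + 3 = 6, and the update doubles it to 12.
```

Sum aggregation is one of several common choices (mean and max are others); it is the injective variant that Part I's expressiveness analysis concerns.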
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: Ph.D.
Date Issued: 2021-06
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)