Molecular Graph Representation Learning and Generation for Drug Discovery

Bibliographic Details
Main Author:	Chen, Benson
Other Authors:	Barzilay, Regina
Format:	Thesis
Published:	Massachusetts Institute of Technology 2022
Online Access:	https://hdl.handle.net/1721.1/143362

_version_	1826211928038440960
author	Chen, Benson
author2	Barzilay, Regina
author_facet	Barzilay, Regina Chen, Benson
author_sort	Chen, Benson
collection	MIT
description	Machine learning methods have become pervasive in drug discovery, enabling more powerful and efficient models. Before deep models, modeling molecules was largely driven by expert knowledge, but these hand-engineered rules prove insufficient to represent the complexities of the molecular landscape. Deep learning models are powerful because they learn the important statistical features of the problem, but only with the correct inductive biases. We tackle this problem in the context of two molecular tasks: representation and generation. The canonical success of deep learning is rooted in its ability to map the input domain into a meaningful representation space. This is especially pertinent for molecular problems, where the “right” relations between molecules are nuanced and complex. The first part of this thesis focuses on molecular representation, in particular property and reaction prediction. Here, we explore a transformer-style architecture for molecular representation, providing new tools to apply these models to graph-structured objects. Moving away from the traditional graph neural network paradigm, we demonstrate the efficacy of prototype networks for molecular representation, which allow us to reason over learned property prototypes of molecules. Lastly, we examine molecular representations in the context of improving reaction predictions. The second part of this thesis focuses on molecular generation, which is crucial in drug discovery as a means to propose promising drug candidates. Here, we develop a new method for multi-property molecule generation by first learning a distributional vocabulary over molecular fragments. Then, using this vocabulary, we survey efficient exploration methods over the chemical space.
first_indexed	2024-09-23T15:13:37Z
format	Thesis
id	mit-1721.1/143362
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T15:13:37Z
publishDate	2022
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/143362 2022-06-16T03:10:22Z Chen, Benson Barzilay, Regina Jaakkola, Tommi Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2022-06-15T13:15:20Z 2022-06-15T13:15:20Z 2022-02 2022-03-04T20:47:40.273Z Thesis https://hdl.handle.net/1721.1/143362 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
title	Molecular Graph Representation Learning and Generation for Drug Discovery
url	https://hdl.handle.net/1721.1/143362

Similar Items