Analyzing Learned Molecular Representations for Property Prediction

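© 2019 American Chemical Society. Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial data sets spanning a wide variety of chemical end points. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary data sets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.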

Bibliographic Details
Main Authors:	Yang, Kevin, Swanson, Kyle, Jin, Wengong, Coley, Connor, Eiden, Philipp, Gao, Hua, Guzman-Perez, Angel, Hopper, Timothy, Kelley, Brian, Mathea, Miriam, Palmer, Andrew, Settels, Volker, Jaakkola, Tommi, Jensen, Klavs, Barzilay, Regina
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	English
Published:	American Chemical Society (ACS) 2021
Online Access:	https://hdl.handle.net/1721.1/134630

_version_	1826210659340124160
collection	MIT
description	© 2019 American Chemical Society. Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial data sets spanning a wide variety of chemical end points. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary data sets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.
first_indexed	2024-09-23T14:53:08Z
format	Article
id	mit-1721.1/134630
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T14:53:08Z
publishDate	2021
publisher	American Chemical Society (ACS)
record_format	dspace
spelling	mit-1721.1/134630 2023-09-19T18:37:29Z 2021-10-27T20:05:52Z 2021-10-27T20:05:52Z 2019 2019-08-22T13:08:28Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/134630 en 10.1021/acs.jcim.9b00237 Journal of Chemical Information and Modeling Creative Commons Attribution-NonCommercial-NoDerivs License http://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf American Chemical Society (ACS) ACS
title	Analyzing Learned Molecular Representations for Property Prediction
url	https://hdl.handle.net/1721.1/134630

Similar Items