Prediction of Organic Reaction Outcomes Using Machine Learning

Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory,...

Bibliographic Details
Main Authors:	Coley, Connor W.; Barzilay, Regina; Jaakkola, Tommi S.; Green, William H.; Jensen, Klavs F.
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	American Chemical Society (ACS) 2017
Online Access:	http://hdl.handle.net/1721.1/110706 https://orcid.org/0000-0002-8271-8723 https://orcid.org/0000-0002-2921-8201 https://orcid.org/0000-0002-2199-0379 https://orcid.org/0000-0001-7192-580X

author	Coley, Connor W.; Barzilay, Regina; Jaakkola, Tommi S.; Green, William H.; Jensen, Klavs F.
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
collection	MIT
description	Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory, despite initially seeming viable. The true measure of success for any synthesis program is whether the predicted outcome matches what is observed experimentally. We report a model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks. Using 15 000 experimental reaction records from granted United States patents, a model is trained to select the major (recorded) product by ranking a self-generated list of candidates where one candidate is known to be the major product. Candidate reactions are represented using a unique edit-based representation that emphasizes the fundamental transformation from reactants to products, rather than the constituent molecules’ overall structures. In a 5-fold cross-validation, the trained model assigns the major product rank 1 in 71.8% of cases, rank ≤3 in 86.7% of cases, and rank ≤5 in 90.8% of cases.
format	Article
id	mit-1721.1/110706
institution	Massachusetts Institute of Technology
language	en_US
publishDate	2017
publisher	American Chemical Society (ACS)
record_format	dspace
spelling	mit-1721.1/110706 | 2022-09-27T16:39:17Z | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | Massachusetts Institute of Technology. Department of Chemical Engineering | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | United States. Defense Advanced Research Projects Agency (ARO W911NF-16-2-0023) | National Science Foundation (U.S.) (1122374) | 2017-07-14T18:41:19Z | 2017-07-14T18:41:19Z | 2017-04 | 2017-02 | Article | http://purl.org/eprint/type/JournalArticle | 2374-7943 | 2374-7951 | http://hdl.handle.net/1721.1/110706 | Coley, Connor W.; Barzilay, Regina; Jaakkola, Tommi S. et al. “Prediction of Organic Reaction Outcomes Using Machine Learning.” ACS Central Science 3, 5 (April 2017): 434–443 © 2017 American Chemical Society | https://orcid.org/0000-0002-8271-8723 | https://orcid.org/0000-0002-2921-8201 | https://orcid.org/0000-0002-2199-0379 | https://orcid.org/0000-0001-7192-580X | en_US | http://dx.doi.org/10.1021/acscentsci.7b00064 | ACS Central Science | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | application/pdf | American Chemical Society (ACS) | ACS
title	Prediction of Organic Reaction Outcomes Using Machine Learning
