Using Machine Learning To Predict Suitable Conditions for Organic Reactions

© Copyright 2018 American Chemical Society. Reaction condition recommendation is an essential element for the realization of computer-assisted synthetic planning. Accurate suggestions of reaction conditions are required for experimental validation and can have a significant effect on the success or...

Bibliographic Details
Main Authors: Gao, Hanyu, Struble, Thomas J, Coley, Connor W, Wang, Yuran, Green, William H, Jensen, Klavs F
Other Authors: Massachusetts Institute of Technology. Department of Chemical Engineering
Format: Article
Language: English
Published: American Chemical Society (ACS) 2021
Online Access: https://hdl.handle.net/1721.1/135864
_version_ 1826205116622962688
author Gao, Hanyu
Struble, Thomas J
Coley, Connor W
Wang, Yuran
Green, William H
Jensen, Klavs F
author2 Massachusetts Institute of Technology. Department of Chemical Engineering
author_sort Gao, Hanyu
collection MIT
description © Copyright 2018 American Chemical Society. Reaction condition recommendation is an essential element for the realization of computer-assisted synthetic planning. Accurate suggestions of reaction conditions are required for experimental validation and can have a significant effect on the success or failure of an attempted transformation. However, de novo condition recommendation remains a challenging and under-explored problem and relies heavily on chemists' knowledge and experience. In this work, we develop a neural-network model to predict the chemical context (catalyst(s), solvent(s), reagent(s)), as well as the temperature most suitable for any particular organic reaction. Trained on ∼10 million examples from Reaxys, the model is able to propose conditions where a close match to the recorded catalyst, solvent, and reagent is found within the top-10 predictions 69.6% of the time, with top-10 accuracies for individual species reaching 80-90%. Temperature is accurately predicted within ±20 °C from the recorded temperature in 60-70% of test cases, with higher accuracy for cases with correct chemical context predictions. The utility of the model is illustrated through several examples spanning a range of common reaction classes. We also demonstrate that the model implicitly learns a continuous numerical embedding of solvent and reagent species that captures their functional similarity.
first_indexed 2024-09-23T13:07:19Z
format Article
id mit-1721.1/135864
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T13:07:19Z
publishDate 2021
publisher American Chemical Society (ACS)
record_format dspace
spelling mit-1721.1/135864 2023-03-15T20:17:21Z 2021-10-27T20:29:42Z 2021-10-27T20:29:42Z 2018 2019-08-19T17:44:53Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/135864 en 10.1021/ACSCENTSCI.8B00357 ACS Central Science Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf American Chemical Society (ACS) ACS
title Using Machine Learning To Predict Suitable Conditions for Organic Reactions
url https://hdl.handle.net/1721.1/135864
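
note The description above reports two kinds of figures: top-10 accuracy for the chemical context, and the fraction of test cases whose predicted temperature falls within ±20 °C of the recorded value. The following minimal Python sketch shows how such metrics might be computed. It is not the authors' evaluation code. The function names and the toy data are hypothetical, and exact string equality stands in for the "close match" criterion, which the description does not define.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   recorded: Sequence[str],
                   k: int = 10) -> float:
    """Fraction of cases whose recorded species appears in the top-k predictions.

    Assumption: a "match" here is exact string equality, a simplification
    of the close-match criterion mentioned in the abstract.
    """
    hits = sum(1 for preds, truth in zip(ranked_predictions, recorded)
               if truth in preds[:k])
    return hits / len(recorded)


def fraction_within_tolerance(predicted: Sequence[float],
                              recorded: Sequence[float],
                              tol: float = 20.0) -> float:
    """Fraction of cases where |predicted - recorded| <= tol (degrees Celsius)."""
    hits = sum(1 for p, r in zip(predicted, recorded) if abs(p - r) <= tol)
    return hits / len(recorded)


# Made-up toy data, for illustration only (not taken from Reaxys or the paper).
ranked_solvents = [
    ["THF", "DCM", "toluene"],
    ["DMF", "water"],
    ["ethanol", "methanol"],
]
recorded_solvents = ["DCM", "acetonitrile", "ethanol"]

predicted_temps = [25.0, 80.0, 0.0, 110.0]
recorded_temps = [20.0, 120.0, -15.0, 100.0]

top2 = top_k_accuracy(ranked_solvents, recorded_solvents, k=2)
temp_ok = fraction_within_tolerance(predicted_temps, recorded_temps, tol=20.0)
print(f"top-2 accuracy: {top2:.3f}")
print(f"temperature within ±20 °C: {temp_ok:.2f}")
```

With k=10 and tol=20.0 the same two functions correspond to the quantities named in the description, under the exact-match assumption stated above.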