Multitask prediction of site selectivity in aromatic C-H functionalization reactions

Aromatic C–H functionalization reactions are an important part of the synthetic chemistry toolbox. Accurate prediction of site selectivity can be crucial for prioritizing target compounds and synthetic routes in both drug discovery and process chemistry. However, selectivity may be highly dependent...

Bibliographic Details
Main Authors: Struble, Thomas J, Coley, Connor Wilson, Jensen, Klavs F
Other Authors: Massachusetts Institute of Technology. Department of Chemical Engineering
Format: Article
Published: 2020
Online Access: https://hdl.handle.net/1721.1/125612
collection MIT
description Aromatic C–H functionalization reactions are an important part of the synthetic chemistry toolbox. Accurate prediction of site selectivity can be crucial for prioritizing target compounds and synthetic routes in both drug discovery and process chemistry. However, selectivity may be highly dependent on subtle electronic and steric features of the substrate. We report a generalizable approach to prediction of site selectivity that is accomplished using a graph-convolutional neural network for the multitask prediction of 123 C–H functionalization tasks. In an 80/10/10 training/validation/testing pseudo-time split of about 58 000 aromatic C–H functionalization reactions from the Reaxys database, the model achieves a mean reciprocal rank of 92%. Once trained, inference requires approximately 200 ms per compound to provide quantitative likelihood scores for each task. This approach and model allow a chemist to quickly determine which C–H functionalization reactions – if any – might proceed with high selectivity.
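The abstract reports performance as a mean reciprocal rank (MRR). The sketch below uses the standard definition of that metric: for each reaction, candidate sites are ranked by predicted score, and the reciprocal of the rank of the experimentally observed site is averaged over all reactions. The function name, the input layout, and the tie-handling rule are assumptions made for illustration; the record does not describe the authors' exact evaluation code.

```python
from typing import Sequence


def mean_reciprocal_rank(
    score_lists: Sequence[Sequence[float]],
    true_sites: Sequence[int],
) -> float:
    """Average of 1/rank of the true site over all examples.

    score_lists[i][j] is the predicted score of candidate site j in example i.
    true_sites[i] is the index of the observed site in example i.
    Assumption: ties are broken in favour of the true site.
    """
    if len(score_lists) != len(true_sites) or not true_sites:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for scores, true_idx in zip(score_lists, true_sites):
        # Rank 1 means no other site scored strictly higher than the true site.
        rank = 1 + sum(1 for s in scores if s > scores[true_idx])
        total += 1.0 / rank
    return total / len(true_sites)


# Hypothetical scores for three substrates with three candidate sites each.
example_scores = [
    [0.1, 0.7, 0.2],  # true site 1 is ranked first
    [0.6, 0.3, 0.1],  # true site 1 is ranked second
    [0.2, 0.5, 0.3],  # true site 2 is ranked second
]
example_true = [1, 1, 2]
example_mrr = mean_reciprocal_rank(example_scores, example_true)
```

With these made-up inputs the ranks are 1, 2, and 2, so the result is (1 + 1/2 + 1/2) / 3 = 2/3. An MRR near 1 means the observed site is usually ranked first.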
first_indexed 2024-09-23T16:44:16Z
format Article
id mit-1721.1/125612
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T16:44:16Z
publishDate 2020
record_format dspace
spelling mit-1721.1/125612 (2022-09-29T21:09:33Z)
Dates: 2020-06-02T16:49:18Z, 2020-04, 2020-02
Type: Article (http://purl.org/eprint/type/JournalArticle)
ISSN: 2058-9883
Citation: Struble, Thomas J., Connor Wilson Coley, and Klavs F. Jensen, "Multitask prediction of site selectivity in aromatic C-H functionalization reactions." Reaction Chemistry & Engineering 5 (Apr. 2020): no. 896 doi 10.1039/D0RE00071J ©2020 Author(s)
DOI: 10.1039/D0RE00071J
Journal: Reaction Chemistry & Engineering
License: Creative Commons Attribution Noncommercial 3.0 unported license, https://creativecommons.org/licenses/by-nc/3.0/
File format: application/pdf
Publisher: Royal Society of Chemistry (RSC)