Summary: | Reaction prediction is a fundamental problem in chemistry. Previous work has mostly targeted product prediction alone and has not elucidated the mechanisms or elementary steps through which a reaction proceeds. Here, we attempt to predict chemical mechanisms via deep reinforcement learning.
We first define a new type of graph-based molecular representation that better keeps track of electron flow and generalizes to non-traditional bonding, such as 3-center-4-electron bonds. We then define a molecular environment as a Markov Decision Process (MDP) that codifies the allowed mechanistic steps and evaluates them using a thermodynamic energy oracle as the reward function. To solve this environment, we build a graph neural network-based policy and value network, where the policy network is first pre-trained on an open database of elementary radical reactions (RMechDB). We then use proximal policy optimization (PPO) to fine-tune the model and predict reasonable reaction mechanisms for two case studies: a radical oxidation of the terpene limonene, and a radical cyclization cascade in the synthesis of hirsutene.
|
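The MDP framing described in the summary can be illustrated with a minimal toy sketch. It is not the authors' environment: the class `ToyMechanismEnv`, the function `greedy_rollout`, the species labels, and the energy values are all hypothetical, states are plain string labels rather than molecular graphs, and a greedy rollout stands in for the learned PPO policy. The only idea it tries to capture is that each allowed elementary step is rewarded by the energy drop reported by an oracle.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = str  # hypothetical stand-in for a molecular graph


@dataclass
class ToyMechanismEnv:
    """Toy MDP: states are species labels, actions are allowed elementary steps."""
    steps: Dict[State, List[State]]   # allowed mechanistic steps from each state
    energy: Callable[[State], float]  # stand-in for a thermodynamic energy oracle
    state: State                      # current species

    def actions(self) -> List[State]:
        # Successor states reachable in one allowed step (empty if terminal).
        return self.steps.get(self.state, [])

    def step(self, next_state: State) -> Tuple[State, float, bool]:
        if next_state not in self.actions():
            raise ValueError(f"step {self.state} -> {next_state} is not allowed")
        # Reward is the energy released by the step (positive if downhill).
        reward = self.energy(self.state) - self.energy(next_state)
        self.state = next_state
        done = not self.actions()
        return self.state, reward, done


def greedy_rollout(env: ToyMechanismEnv) -> Tuple[List[State], float]:
    """Follow the most exothermic allowed step until no steps remain.

    This is a placeholder for a trained policy, not PPO.
    """
    path, total = [env.state], 0.0
    while env.actions():
        best = max(env.actions(),
                   key=lambda s: env.energy(env.state) - env.energy(s))
        _, reward, _ = env.step(best)
        path.append(best)
        total += reward
    return path, total


# Made-up energies (arbitrary units) and an acyclic step graph.
ENERGIES = {"A": 0.0, "B": -5.0, "C": 2.0, "D": -12.0}
STEPS = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
```

Calling `greedy_rollout(ToyMechanismEnv(STEPS, ENERGIES.__getitem__, "A"))` walks from `A` to a terminal species and returns the visited path with its cumulative reward. A learned policy would replace the greedy choice, which can miss pathways that require an initially uphill step.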