Abstract: | <p>Interest in de novo molecular design methods has grown in recent years, driven by new developments in artificial intelligence, and deep learning in particular. Molecule Deep Q-Networks (MolDQN) is a de novo design algorithm developed by Zhou et al. in 2019 that combines reinforcement learning with simple chemical actions: adding or removing atoms or bonds. Because these modifications are small, MolDQN performs a “local” search of chemical space. However, MolDQN’s atom- and bond-based actions often produce chemically infeasible or synthetically inaccessible molecules. To address this shortcoming, I developed two new reinforcement learning-based de novo methods, described in Chapter 2: Reaction Deep Q-Networks (RxnDQN) and Reaction-Molecule Deep Q-Networks (RxnMolDQN). These methods introduce new action spaces based on available reagents and on the most widely used, high-yield reactions in medicinal chemistry. Because reaction-based actions make larger chemical modifications, these action spaces enable a “global” search of chemical space.</p>
<p>In contrast to the atom- and bond-based action space of MolDQN, the action space of RxnDQN incorporates the top ten reactions used in medicinal chemistry, as reported by Roughley et al. in 2011. I hypothesized that this would yield more synthetically accessible molecules. I also tested whether hybridizing the two action spaces in RxnMolDQN would enhance the exploration of chemical space by allowing both “local” and “global” modifications.</p>
<p>In Chapter 3, I explored novel structure-based variants in which the reward function is modified to account for protein-ligand interactions in a protein binding site of interest. This was achieved by incorporating protein-ligand docking into the DQN methods (MolDQN, RxnDQN, and RxnMolDQN), giving my new non-constrained structure-based de novo methods (the nc3D-DQN methods): non-constrained three-dimensional Molecule Deep Q-Networks (nc3D-MolDQN), non-constrained three-dimensional Reaction Deep Q-Networks (nc3D-RxnDQN), and non-constrained three-dimensional Reaction-Molecule Deep Q-Networks (nc3D-RxnMolDQN).</p>
<p>In Chapter 4, I further extended these non-constrained structure-based de novo methods by incorporating constrained docking, giving the c3D-DQN methods (c3D-MolDQN, c3D-RxnDQN, and c3D-RxnMolDQN), where “c” denotes constrained. These methods allow de novo covalent ligand design and open up the potential for fragment-based de novo design.</p>
<p>The 2D-DQN methods (i.e., MolDQN, RxnDQN, and RxnMolDQN) were tested using 100 clinically approved drugs from DrugBank. The Malhotra set is a collection of 297 substructurally related ligand pairs, each pair solved in complex with the same protein partner; in this thesis it was used to design the larger ligand from the smaller ligand. The nc3D-DQN methods were tested using 51 ligand pairs from the Malhotra set, and the c3D-DQN methods were tested using 17 covalent ligands for SARS-CoV-2 main protease (M<sup>pro</sup>) from Fragalysis, a cloud-based platform that has reported X-ray crystal structures of SARS-CoV-2 M<sup>pro</sup>. I also introduced a new pharmacophore-based reward function using normalized SuCOS (2019) in the nc3D-DQN methods, which was tested using 45 ligand pairs from the Malhotra set.</p>
|