Structure-based ligand discovery: elaborating fragment hits in silico
<p>The high attrition rates of drug discovery have been a key motivation for the development of computational methods to design better, safer, more promising molecules. Hit-to-lead development is often driven by the subjective decisions of medicinal chemists. There is growing interest in devel...
Main Author: | Leung, SH |
---|---|
Other Authors: | Morris, G |
Format: | Thesis |
Language: | English |
Published: | 2019 |
_version_ | 1826281120702922752 |
---|---|
author | Leung, SH |
author2 | Morris, G |
author_facet | Morris, G Leung, SH |
author_sort | Leung, SH |
collection | OXFORD |
description | <p>The high attrition rates of drug discovery have been a key motivation for the development of computational methods to design better, safer, more promising molecules. Hit-to-lead development is often driven by the subjective decisions of medicinal chemists. There is growing interest in developing more objective tools to explore the vastness of chemical space. Fragment-based drug discovery offers the advantage of a high coverage of chemical space through the identification of fragment hits, which are typically smaller and more weakly binding than drug-like molecules. Fragment elaboration aims to increase the potency of these hits and potentially to meet other objectives. However, there is currently no agreed best method for elaborating fragment hits using all of the structural information acquired.</p>
<p>This thesis describes the development of computational methods to tackle this problem. I first propose a workflow and describe how I applied it to a prospective study involving nudix hydrolase NUDT7, a target of interest to the Structural Genomics Consortium. The workflow involves reaction enumeration, protein-ligand docking and selection of candidates that show a conserved binding pose in their docking results. In the prospective study, I developed four hypotheses based on potential protein-ligand interactions to further select the candidates. As a result, 105 amides were prioritised and synthesised using semi-automated synthesis. I then soaked 78 crude reactions into crystals of the target protein, which resulted in six protein-ligand crystal structures, five of which were novel. During application of this workflow, I used the RMSD of the common substructures to measure the conservation of binding mode; however, I proposed that this may not be the most appropriate measure for comparisons between fragments and their elaborated counterparts. This motivated the development of SuCOS.</p>
<p>SuCOS is an open-source combined shape and chemical feature overlap score. Through three studies, I compared the use of SuCOS to RMSD and protein-ligand interaction fingerprints (PLIFs) and explored the strengths and weaknesses of each, using a dataset of X-ray crystal structures of paired elaborated larger and smaller molecules bound to the same protein. My redocking and cross-docking studies showed that SuCOS had notable advantages over RMSD and PLIF similarity. I also showed that reranking with SuCOS performed better than the native AutoDock Vina score at differentiating actives from decoy ligands using the DUD-E dataset. As SuCOS is measured between one reference and one query molecule, I investigated the use of two group fusion methods – cumulative and max – when there are multiple reference structures, which is often the case after a fragment screening campaign. However, there was no group fusion method that consistently performed best for the four target datasets I validated on.</p>
<p>Finally, I investigated the use of Bayesian optimisation for ligand-based and structure-based virtual screening. Optimisation was performed over discrete chemical space, so the method essentially prioritises which molecules to make next from an input set. I investigated the influence of different molecular representations and different kernels on the performance of Bayesian optimisation. For the two ligand-based experiments, Morgan fingerprints with the Tanimoto kernel showed the best performance. For the structure-based Bayesian optimisation experiments, I investigated two structure-based representations: vectorised RDKit pharmacophoric feature maps and PLIFs. However, the results showed that there was no clear advantage to using either structure-based representation over 2D fingerprints such as Morgan fingerprints.</p> |
first_indexed | 2024-03-07T00:24:01Z |
format | Thesis |
id | oxford-uuid:7d8137ad-8331-4733-88e2-1ee1b55480fb |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T00:24:01Z |
publishDate | 2019 |
record_format | dspace |
spelling | oxford-uuid:7d8137ad-8331-4733-88e2-1ee1b55480fb 2022-03-26T21:04:03Z Structure-based ligand discovery: elaborating fragment hits in silico Thesis http://purl.org/coar/resource_type/c_db06 uuid:7d8137ad-8331-4733-88e2-1ee1b55480fb English Hyrax Deposit 2019 Leung, SH Morris, G Brennan, P |
spellingShingle | Leung, SH Structure-based ligand discovery: elaborating fragment hits in silico |
title | Structure-based ligand discovery: elaborating fragment hits in silico |
title_full | Structure-based ligand discovery: elaborating fragment hits in silico |
title_fullStr | Structure-based ligand discovery: elaborating fragment hits in silico |
title_full_unstemmed | Structure-based ligand discovery: elaborating fragment hits in silico |
title_short | Structure-based ligand discovery: elaborating fragment hits in silico |
title_sort | structure based ligand discovery elaborating fragment hits in silico |
work_keys_str_mv | AT leungsh structurebasedliganddiscoveryelaboratingfragmenthitsinsilico |
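
The abstract mentions scoring a query molecule against several reference structures using two group fusion methods, cumulative and max, and using a Tanimoto kernel on fingerprints. The sketch below illustrates the general idea only and is not the thesis code. It assumes that "max" keeps the best pairwise score and that "cumulative" sums the pairwise scores; the record does not define either. A Tanimoto similarity on sets of on-bit indices stands in for the pairwise score, and SuCOS itself is not implemented here. All function names are hypothetical.

```python
from typing import Callable, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def fuse_max(scores: Sequence[float]) -> float:
    """Assumed 'max' fusion: keep the best score over all references."""
    return max(scores)


def fuse_cumulative(scores: Sequence[float]) -> float:
    """Assumed 'cumulative' fusion: add up the scores over all references."""
    return sum(scores)


def score_against_references(
    query: Set[int],
    references: Sequence[Set[int]],
    pairwise: Callable[[Set[int], Set[int]], float],
    fuse: Callable[[Sequence[float]], float],
) -> float:
    """Score one query against many references, then fuse into one number."""
    scores = [pairwise(ref, query) for ref in references]
    return fuse(scores)


# Toy fingerprints (hypothetical data, for illustration only).
references = [{1, 2, 3}, {3, 4}]
query = {2, 3, 4}

max_score = score_against_references(query, references, tanimoto, fuse_max)
cumulative_score = score_against_references(
    query, references, tanimoto, fuse_cumulative
)
```

Any pairwise score with the same one-reference, one-query shape could be passed in place of `tanimoto`. Note that `fuse_max` raises `ValueError` on an empty reference list, so callers would need at least one reference structure.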