Computational approaches to molecular fragment elaboration


Bibliographic Details
Main Author: Hadfield, T
Other Authors: Deane, C
Format: Thesis
Language: English
Published: 2022
_version_ 1826310245741232128
author Hadfield, T
author2 Deane, C
author_facet Deane, C
Hadfield, T
author_sort Hadfield, T
collection OXFORD
description <p>Recent advances in machine learning have led to considerable interest in its application to drug discovery, both in the development of predictive algorithms which can rapidly assess whether a given compound is likely to bind to a target protein, and in generative algorithms that propose novel high-affinity binders <i>in silico</i>.</p> <p>In this thesis, we propose a deep generative model for fragment elaboration for use in fragment-to-lead campaigns. We consider the problem of generating elaborations which are highly similar to a ground truth elaboration, first via reinforcement learning, where we propose a novel curriculum-based reward function, and also via the imposition of physically meaningful pharmacophoric constraints. We also describe the development of a simple-to-use web application that allows users with no prior programming experience to generate molecules without installing any software.</p> <p>Next, we describe a novel method for extracting useful and interpretable information from a target protein and providing it to a deep generative model. By computing a fragment hotspot map of the protein, which describes regions of the binding pocket that are likely to make a disproportionate contribution to binding affinity, we derive a set of pharmacophoric constraints which can be passed to our deep generative model, helping it generate functional groups which are likely to interact with the protein.</p> <p>Finally, we consider the problem of virtual screening. In addition to accurately identifying high-affinity binders, it is desirable that virtual screening models can identify the functional groups most responsible for binding. Assessing the ability of models to accurately attribute binding to specific functional groups is challenging because it is difficult to specify a ground truth against which model attributions can be compared. To address this, we propose a novel synthetic data generation process with a deterministic binding rule. First, we show that a recently proposed deep learning-based model is better able to identify important functional groups than a fingerprint-based model. Second, we demonstrate that training models on datasets which exhibit ligand-specific bias degrades the ability of the model to identify important functional groups, underscoring the importance of curating high-quality datasets for virtual screening.</p>
first_indexed 2024-03-07T07:49:05Z
format Thesis
id oxford-uuid:d75f615e-a2d5-4ad3-bed1-5b3eb09abef4
institution University of Oxford
language English
last_indexed 2024-03-07T07:49:05Z
publishDate 2022
record_format dspace
spelling oxford-uuid:d75f615e-a2d5-4ad3-bed1-5b3eb09abef4 2023-06-27T11:41:05Z Computational approaches to molecular fragment elaboration Thesis http://purl.org/coar/resource_type/c_db06 uuid:d75f615e-a2d5-4ad3-bed1-5b3eb09abef4 English Hyrax Deposit 2022 Hadfield, T Deane, C Morris, G