Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction


Bibliographic Details
Main Authors: Wee, Junjie, Xia, Kelin
Other Authors: School of Physical and Mathematical Sciences
Format: Journal Article
Language: English
Published: 2023
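Subjects: Science::Mathematics; Forman Ricci Curvature; Molecular Featurization; Machine Learning; Drug Design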
Online Access: https://hdl.handle.net/10356/168978
author Wee, Junjie
Xia, Kelin
author2 School of Physical and Mathematical Sciences
collection NTU
description Artificial intelligence (AI) techniques have been gradually applied across the entire drug design process, from target discovery, lead discovery, lead optimization and preclinical development to the final three phases of clinical trials. Currently, one of the central challenges for AI-based drug design is molecular featurization, that is, identifying or designing appropriate molecular descriptors or fingerprints. Efficient and transferable molecular descriptors are key to the success of all AI-based drug design models. Here we propose, for the first time, Forman persistent Ricci curvature (FPRC)-based molecular featurization and feature engineering. Molecular structures and interactions are modeled as simplicial complexes, which generalize graphs to higher dimensions. Further, a multiscale representation is achieved through a filtration process, during which a series of nested simplicial complexes at different scales is generated. Forman Ricci curvatures (FRCs) are calculated on this series of simplicial complexes, and the persistence and variation of the FRCs during the filtration process are defined as FPRC. Moreover, persistent attributes, which are FPRC-based functions and properties, are employed as molecular descriptors and combined with machine learning models, in particular gradient boosting tree (GBT). Our FPRC-GBT models are extensively trained and tested on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results are better than those from machine learning models with traditional molecular descriptors.
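The filtration-plus-curvature idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain distance-based filtration on a toy point set, keeps only vertices and edges (no higher-dimensional simplices), and uses the unweighted graph form of the Forman Ricci curvature of an edge, F(u, v) = 4 - deg(u) - deg(v). The function names `rips_edges`, `forman_edge_curvature` and `curvature_profile` are hypothetical, and the mean curvature per scale merely stands in for a persistent attribute.

```python
from itertools import combinations
from math import dist


def rips_edges(points, radius):
    """Edges present at one filtration value: all pairs within `radius`."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) <= radius]


def forman_edge_curvature(edges):
    """Unweighted, graph-only Forman Ricci curvature of each edge.

    Simplifying assumption: no triangles or higher simplices are counted,
    so F(u, v) = 4 - deg(u) - deg(v).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


def curvature_profile(points, radii):
    """Track (radius, edge count, mean edge curvature) along the filtration."""
    profile = []
    for r in radii:
        values = list(forman_edge_curvature(rips_edges(points, r)).values())
        mean = sum(values) / len(values) if values else 0.0
        profile.append((r, len(values), mean))
    return profile


# Toy input: the four corners of a unit square (not molecular data).
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
profile = curvature_profile(points, [0.5, 1.0, 1.5])
for row in profile:
    print(row)
```

In this toy run the graph grows from no edges, to the four sides of the square, to the complete graph on four vertices, and the mean edge curvature changes with it. A vector of such per-scale statistics is the kind of feature that could then be passed to a gradient boosting model.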
first_indexed 2024-10-01T06:23:50Z
format Journal Article
id ntu-10356/168978
institution Nanyang Technological University
language English
last_indexed 2024-10-01T06:23:50Z
publishDate 2023
record_format dspace
spelling ntu-10356/168978 2023-06-27T01:48:27Z
Funding: Ministry of Education (MOE) Nanyang Technological University Startup (supported in part through Grant M4081842.110); Singapore Ministry of Education Academic Research fund (Tier 1 RG109/19 and Tier 2 MOE2018-T2-1-033).
Citation: Wee, J. & Xia, K. (2021). Forman persistent Ricci curvature (FPRC)-based machine learning models for protein-ligand binding affinity prediction. Briefings in Bioinformatics, 22(6), bbab136. https://dx.doi.org/10.1093/bib/bbab136
ISSN: 1467-5463
DOI: 10.1093/bib/bbab136
Other identifiers: 2-s2.0-85111173404; 10.21979/N9/ZTA5MN
Rights: © 2021 The Author(s). Published by Oxford University Press. All rights reserved
topic Science::Mathematics
Forman Ricci Curvature
Molecular Featurization
Machine Learning
Drug Design
url https://hdl.handle.net/10356/168978