Machine Learning Scoring Functions for Drug Discovery from Experimental and Computer-Generated Protein–Ligand Structures: Towards Per-Target Scoring Functions

Bibliographic Details
Main Authors: Francesco Pellicani, Diego Dal Ben, Andrea Perali, Sebastiano Pilati
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Molecules
Online Access: https://www.mdpi.com/1420-3049/28/4/1661
collection DOAJ
description In recent years, machine learning has been proposed as a promising strategy for building accurate scoring functions for computational docking aimed at computationally driven drug discovery. However, recent studies suggest that over-optimistic results have been reported because of correlations in the experimental databases used for training and testing. Here, we investigate the performance of an artificial neural network in binding affinity prediction, comparing results obtained using experimental protein–ligand structures with those obtained using larger sets of computer-generated structures created with commercial software. Interestingly, similar performance is obtained on both databases. We find a noticeable drop in performance when moving from random horizontal tests to vertical tests performed on target proteins not included in the training data. The possibility of training the network on relatively easily created computer-generated databases leads us to explore per-target scoring functions, trained and tested ad hoc on complexes that include only one target protein. Encouraging results are obtained, depending on the type of protein being addressed.
institution Directory of Open Access Journals
issn 1420-3049
citation Molecules, vol. 28, no. 4, article 1661 (2023-02-01), MDPI AG
doi 10.3390/molecules28041661
author_affiliation Francesco Pellicani: Physics Division, School of Science and Technology, University of Camerino, I-62032 Camerino, MC, Italy
Diego Dal Ben: Medicinal Chemistry Unit, School of Pharmacy, University of Camerino, I-62032 Camerino, MC, Italy
Andrea Perali: Physics Unit, School of Pharmacy, University of Camerino, I-62032 Camerino, MC, Italy
Sebastiano Pilati: Physics Division, School of Science and Technology, University of Camerino, I-62032 Camerino, MC, Italy
topic molecular docking
scoring functions
machine learning