MetaClass, a Comprehensive Classification System for Predicting the Occurrence of Metabolic Reactions Based on the MetaQSAR Database

(1) Background: Machine learning algorithms are finding fruitful applications in predicting the ADME profile of new molecules, with a particular focus on metabolism predictions. However, the development of comprehensive metabolism predictors is hampered by the lack of highly accurate metabolic resou...

Bibliographic Details
Main Authors: Angelica Mazzolari, Alice Scaccabarozzi, Giulio Vistoli, Alessandro Pedretti
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Molecules
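Subjects: drug metabolism; MetaQSAR; metabolic reactions; metabolism prediction; classification algorithms; random forest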
Online Access: https://www.mdpi.com/1420-3049/26/19/5857
collection DOAJ
description (1) Background: Machine learning algorithms are finding fruitful applications in predicting the ADME profile of new molecules, with a particular focus on metabolism predictions. However, the development of comprehensive metabolism predictors is hampered by the lack of highly accurate metabolic resources. Hence, we recently proposed a manually curated metabolic database (MetaQSAR), the level of accuracy of which is well suited to the development of predictive models. (2) Methods: MetaQSAR was used to extract datasets to predict the metabolic reactions subdivided into major classes, classes and subclasses. The collected datasets comprised a total of 3788 first-generation metabolic reactions. Predictive models were developed by using standard random forest algorithms and sets of physicochemical, stereo-electronic and constitutional descriptors. (3) Results: The developed models showed satisfactory performance, especially for hydrolyses and conjugations, while redox reactions were predicted with greater difficulty, which was reasonable as they depend on many complex features that are not properly encoded by the included descriptors. (4) Conclusions: The generated models allowed a precise comparison of the propensity of each metabolic reaction to be predicted and the factors affecting their predictability were discussed in detail. Overall, the study led to the development of a freely downloadable global predictor, MetaClass, which correctly predicts 80% of the reported reactions, as assessed by an explorative validation analysis on an external dataset, with an overall MCC = 0.44.
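The abstract reports both a share of correctly predicted reactions (80%) and a Matthews correlation coefficient (MCC = 0.44). The sketch below shows how the two figures can coexist on an imbalanced binary task. The confusion-matrix counts are hypothetical, chosen only for illustration and not taken from the paper, and reading "80%" as overall accuracy is an assumption.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is set to 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): 50 positives, 150 negatives.
tp, tn, fp, fn = 30, 130, 20, 20

acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc_value:.2f}, MCC = {mcc_value:.2f}")
```

Because MCC uses all four cells of the confusion matrix, it is pulled down by errors on the minority class even when overall accuracy looks high, which is one reason it is often reported alongside accuracy for imbalanced datasets.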
first_indexed 2024-03-10T06:55:16Z
id doaj.art-b3421beac01a4bab97c779d574da7f0f
institution Directory Open Access Journal
issn 1420-3049
last_indexed 2024-03-10T06:55:16Z
doi 10.3390/molecules26195857
volume 26
issue 19
article_number 5857
affiliation Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli 25, I-20133 Milano, Italy (all four authors)
topic drug metabolism
MetaQSAR
metabolic reactions
metabolism prediction
classification algorithms
random forest