Bio-activity prediction of drug candidate compounds targeting SARS-Cov-2 using machine learning approaches

The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 due to its critical role in viral replication, various high-quality protein crystal structures, and as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability,...

Bibliographic Details
Main Authors: Faisal Bin Ashraf, Sanjida Akter, Sumona Hoque Mumu, Muhammad Usama Islam, Jasim Uddin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10479925/?tool=EBI
collection DOAJ
description The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 because of its critical role in viral replication, the availability of various high-quality protein crystal structures, and its use as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability, and ADMETox properties. The ChEMBL and PubChem databases contain experimental data from screening small molecules against SARS-CoV-2 3CLpro, which expands the opportunity to learn patterns from these data and design a computational model that can predict the potency of a drug compound against the coronavirus before in-vitro and in-vivo testing. In this study, we evaluated 27 machine learning classifiers using several descriptors. We also developed a neural network model that can correctly identify bioactive and inactive chemicals with 91% accuracy on ChEMBL data and 93% accuracy on combined ChEMBL and PubChem data. The F1-scores for inactive and active compounds were 93% and 94%, respectively. We applied SHAP (SHapley Additive exPlanations) to an XGB classifier to find the important fingerprints among the PaDEL descriptors for this task. The results indicated that the PaDEL descriptors were effective in predicting bioactivity, that the proposed neural network design was efficient, and that the SHAP-based explanation correctly identified the important fingerprints. In addition, we validated the effectiveness of our proposed model using a large dataset encompassing over 100,000 molecules. This research employed various molecular descriptors to discover the optimal one for this task. More in-vitro and in-vivo research is required to evaluate the effectiveness of these possible medications against SARS-CoV-2.
first_indexed 2024-03-12T01:56:39Z
format Article
id doaj.art-e8e4f3d26872430cb94532b8c24c6324
institution Directory of Open Access Journals
issn 1932-6203
language English
last_indexed 2024-03-12T01:56:39Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE