Bio-activity prediction of drug candidate compounds targeting SARS-Cov-2 using machine learning approaches
The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 because of its critical role in viral replication, the availability of various high-quality protein crystal structures, and its use as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability,...
Main Authors: | Faisal Bin Ashraf, Sanjida Akter, Sumona Hoque Mumu, Muhammad Usama Islam, Jasim Uddin |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2023-01-01 |
Series: | PLoS ONE |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10479925/?tool=EBI |
_version_ | 1797690169747308544 |
---|---|
author | Faisal Bin Ashraf, Sanjida Akter, Sumona Hoque Mumu, Muhammad Usama Islam, Jasim Uddin |
author_sort | Faisal Bin Ashraf |
collection | DOAJ |
description | The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 because of its critical role in viral replication, the availability of various high-quality protein crystal structures, and its use as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability, and ADMETox properties. The ChEMBL and PubChem databases contain experimental data from screening small molecules against SARS-CoV-2 3CLpro, which expands the opportunity to learn the pattern and design a computational model that can predict the potency of any drug compound against coronavirus before in-vitro and in-vivo testing. In this study, we evaluated 27 machine learning classifiers using several descriptors. We also developed a neural network model that can correctly identify bioactive and inactive chemicals with 91% accuracy on ChEMBL data and 93% accuracy on combined ChEMBL and PubChem data. The F1-score for inactive and active compounds was 93% and 94%, respectively. We applied SHAP (SHapley Additive exPlanations) to an XGB classifier to find the important fingerprints among the PaDEL descriptors for this task. The results indicated that the PaDEL descriptors were effective in predicting bioactivity, that the proposed neural network design was efficient, and that the SHAP-based explanation correctly identified the important fingerprints. In addition, we validated the effectiveness of our proposed model using a large dataset encompassing over 100,000 molecules. This research employed various molecular descriptors to discover the optimal one for this task. To evaluate the effectiveness of these possible medications against SARS-CoV-2, more in-vitro and in-vivo research is required. |
first_indexed | 2024-03-12T01:56:39Z |
format | Article |
id | doaj.art-e8e4f3d26872430cb94532b8c24c6324 |
institution | Directory of Open Access Journals |
issn | 1932-6203 |
language | English |
last_indexed | 2024-03-12T01:56:39Z |
publishDate | 2023-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | Bio-activity prediction of drug candidate compounds targeting SARS-Cov-2 using machine learning approaches |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10479925/?tool=EBI |
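
The description above reports overall accuracy alongside separate F1-scores for the inactive and active classes. As a minimal sketch of how such per-class figures are computed for a binary bioactivity classifier, the following uses only the Python standard library. The helper name `per_class_f1` and the toy labels are assumptions made for illustration; they are not the authors' code or data, and the numbers do not reproduce the reported results.

```python
from typing import Sequence


def per_class_f1(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Return the F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # correctly predicted as `label`
    fp = sum(1 for t, p in pairs if t != label and p == label)  # wrongly predicted as `label`
    fn = sum(1 for t, p in pairs if t == label and p != label)  # missed members of `label`
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (NOT data from the study): 1 = active, 0 = inactive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

f1_active = per_class_f1(y_true, y_pred, label=1)
f1_inactive = per_class_f1(y_true, y_pred, label=0)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"accuracy={accuracy:.3f} F1(active)={f1_active:.3f} F1(inactive)={f1_inactive:.3f}")
```

Reporting F1 per class, rather than accuracy alone, shows whether a model performs comparably on active and inactive compounds when the two classes are not equally represented.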