Building a ML-based QSAR model for predicting the bioactivity of therapeutically active drug class with imidazole scaffold

Human immunodeficiency virus, a retrovirus, causes AIDS, a chronic immune system disease. HIV interferes with the ability of our body to combat disease and infection by weakening our immune system. An essential enzyme necessary for HIV replication is reverse transcriptase (RT). RT inhibitors (RTIs)...


Bibliographic Details
Main Authors: Komal Singh, Irina Ghosh, Venkatesan Jayaprakash, Sudeepan Jayapalan
Format: Article
Language: English
Published: Elsevier 2024-08-01
Series: European Journal of Medicinal Chemistry Reports
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2772417424000207
collection DOAJ
description Human immunodeficiency virus (HIV), a retrovirus, causes AIDS, a chronic disease of the immune system. By weakening the immune system, HIV interferes with the body's ability to combat disease and infection. Reverse transcriptase (RT) is an enzyme essential for HIV replication. RT inhibitors (RTIs) are a class of antiretroviral drugs that target HIV's RT enzyme, blocking its ability to convert viral RNA into DNA. Imidazole has been found to inhibit the RT-1 enzyme: it attaches to the enzyme's active site and prevents it from performing its usual activity. As a result, viral replication is inhibited, which can eventually help slow the course of HIV and other retroviral diseases. Computational tools allow researchers to simulate and analyze a drug's behaviour in a virtual environment, providing valuable insights into its pharmacological properties, efficacy, and safety. QSAR modelling uses machine learning methods to build predictive models from datasets of chemical compounds and their accompanying biological activities. Here, a comparative analysis of the performance of four different algorithms on the imidazole scaffold is reported. Support Vector Regression (SVR), Random Forest Regression (RFR), Decision Tree Regression (DTR) and Hist Gradient Boosting Regression (HGBR) gave promising results, with R2 values of 0.905, 0.993, 0.688 and 0.921, respectively, for the training set and 0.843, 0.977, 0.567 and 0.880 for the test set. The best-performing RFR model was validated by applying the developed RFR code to randomly selected compounds, and it showed an error of only about 0.151%. The R2 values indicate that the RFR and HGBR models fit the variables better than the other models, making them potential models for predicting the activity of novel antiviral compounds.
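The two figures of merit quoted in the description, R2 and a percentage error, can be illustrated with a minimal sketch. It assumes the conventional definitions (R2 as one minus the ratio of residual to total sum of squares, and percentage error as the absolute relative deviation of a prediction); the record does not state which formulas the authors used. The function names and the sample activity values are hypothetical. Only the R2 table is taken from the description.

```python
# Sketch only: conventional definitions, not necessarily the authors' code.

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def percent_error(observed, predicted):
    """Absolute relative error of a single prediction, in percent."""
    if observed == 0:
        raise ValueError("percentage error is undefined for an observed value of 0")
    return abs(predicted - observed) / abs(observed) * 100.0


# R2 values as reported in the description: model -> (train, test).
REPORTED_R2 = {
    "SVR": (0.905, 0.843),
    "RFR": (0.993, 0.977),
    "DTR": (0.688, 0.567),
    "HGBR": (0.921, 0.880),
}


def rank_by_test_r2(scores):
    """Return (model, train, test, train-test gap) rows, best test R2 first."""
    rows = [
        (name, train, test, round(train - test, 3))
        for name, (train, test) in scores.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, train, test, gap in rank_by_test_r2(REPORTED_R2):
        print(f"{name:5s} train={train:.3f} test={test:.3f} gap={gap:.3f}")

    # Hypothetical observed and predicted activities, for illustration only.
    observed = [6.0, 6.5, 7.0, 7.5]
    predicted = [6.1, 6.4, 7.1, 7.4]
    print(f"R2 = {r2_score(observed, predicted):.3f}")
    print(f"error = {percent_error(observed[0], predicted[0]):.2f}%")
```

Sorting by test-set R2 rather than training-set R2 puts the emphasis on generalisation, and the train-test gap gives a rough indication of overfitting. On the reported figures, RFR and HGBR rank first and second, in line with the conclusion of the description.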
first_indexed 2024-04-24T18:47:50Z
id doaj.art-c8febfac4b6e488d91e003d8c9ea495c
institution Directory Open Access Journal
issn 2772-4174
last_indexed 2024-04-24T18:47:50Z
affiliation Komal Singh, Irina Ghosh, Venkatesan Jayaprakash: Department of Pharmaceutical Sciences and Technology, Birla Institute of Technology, Mesra, Ranchi, India
Sudeepan Jayapalan (corresponding author): Department of Chemical Engineering, Birla Institute of Technology, Mesra, Ranchi, India
volume 11
article_number 100148
topic Reverse transcriptase
Imidazole
QSAR modelling
Machine learning
R2 value