iDHU-Ensem: Identification of dihydrouridine sites through ensemble learning models

Bibliographic Details
Main Authors: Muhammad Taseer Suleman, Fahad Alturise, Tamim Alkhalifah, Yaser Daanial Khan
Format: Article
Language: English
Published: SAGE Publishing 2023-03-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076231165963
collection DOAJ
description Background: Dihydrouridine (D) is one of the most significant uridine modifications and occurs prominently in eukaryotes. This modification contributes to the folding and conformational flexibility of transfer RNA (tRNA). Objective: The modification also triggers lung cancer in humans. D sites have been identified through conventional laboratory methods; however, these are costly and time-consuming. The availability of RNA sequences enables the identification of D sites through computationally intelligent models. However, the most challenging part is turning these biological sequences into distinct vectors. Methods: This research proposes novel feature extraction mechanisms and identifies D sites in tRNA sequences using ensemble models. The ensemble models were evaluated using k-fold cross-validation and independent testing. Results: The stacking ensemble model outperformed the other ensemble models, achieving 0.98 accuracy, 0.98 specificity, 0.97 sensitivity, and a Matthews correlation coefficient of 0.92. The proposed model, iDHU-Ensem, was also compared with pre-existing predictors on an independent test, where its accuracy scores were higher than those of the available predictors. Conclusion: This research contributes to improved D site identification through computationally intelligent methods. A web-based server, iDHU-Ensem, is available to researchers at https://taseersuleman-idhu-ensem-idhu-ensem.streamlit.app/ .
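The four scores reported in the description (accuracy, specificity, sensitivity, and Matthews correlation coefficient) are all derived from the counts of a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary-classification scores from confusion-matrix counts.

    tp/fn: D sites predicted as D / as non-D.
    tn/fp: non-D sites predicted as non-D / as D.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Sensitivity (recall): fraction of true D sites that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-D sites correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when any marginal is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in binary_metrics(tp=40, tn=45, fp=5, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts in both numerator and denominator, so it stays informative when positive and negative sites are imbalanced.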
first_indexed 2024-04-09T20:35:28Z
id doaj.art-379482fb27fa4a7eb97c01bdd875b088
institution Directory Open Access Journal
issn 2055-2076
last_indexed 2024-04-09T20:35:28Z
affiliation Muhammad Taseer Suleman: Department of Computer Science, School of Systems and Technology, , Lahore, Pakistan
Fahad Alturise: Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Qassim, Saudi Arabia
Tamim Alkhalifah: Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Qassim, Saudi Arabia
Yaser Daanial Khan: Department of Computer Science, School of Systems and Technology, , Lahore, Pakistan
volume 9