ISGm1A: Integration of Sequence Features and Genomic Features to Improve the Prediction of Human m<sup>1</sup>A RNA Methylation Sites

Bibliographic Details
Main Authors: Lian Liu, Xiujuan Lei, Jia Meng, Zhen Wei
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Epitranscriptome; m¹A; site prediction; sequence features; genomic features
Online Access: https://ieeexplore.ieee.org/document/9079809/
_version_ 1818619430544867328
author Lian Liu
Xiujuan Lei
Jia Meng
Zhen Wei
author_sort Lian Liu
collection DOAJ
description As a new epitranscriptomic modification, N1-methyladenosine (m<sup>1</sup>A) plays an important role in regulating gene expression. Although several computational methods have been proposed to predict m<sup>1</sup>A modification sites, all of them rely on machine learning models built on nucleotide sequence features, and they miss the layer of information carried by transcript topology and RNA secondary structure. To improve the prediction of m<sup>1</sup>A RNA methylation, we propose a computational framework, ISGm1A, which stands for integration of sequence features and genomic features to improve the prediction of human m<sup>1</sup>A RNA methylation sites. Based on the random forest algorithm, ISGm1A combines conventional sequence features with 75 genomic characteristics to improve the prediction of m<sup>1</sup>A sites in humans. In five-fold cross-validation and on an independent test set, ISGm1A outperforms other prediction algorithms (AUC = 0.903 and 0.909, respectively). In addition, an analysis of feature importance shows that the genomic features play a more important role in site prediction than the sequence features. Furthermore, with ISGm1A, we generated a high-accuracy map of m<sup>1</sup>A by predicting all adenine sites in the transcriptome. The data and results of the study are freely accessible at: https://github.com/lianliu09/m1a_prediction.git.
first_indexed 2024-12-16T17:37:22Z
format Article
id doaj.art-a135bd2ac0064f318e0aa442faadf090
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-16T17:37:22Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-a135bd2ac0064f318e0aa442faadf090
2022-12-21T22:22:42Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
Volume 8, pages 81971-81977
10.1109/ACCESS.2020.2991070
9079809
ISGm1A: Integration of Sequence Features and Genomic Features to Improve the Prediction of Human m<sup>1</sup>A RNA Methylation Sites
Lian Liu (0) https://orcid.org/0000-0001-5778-5230
Xiujuan Lei (1) https://orcid.org/0000-0002-9901-1732
Jia Meng (2)
Zhen Wei (3)
(0) School of Computer Science, Shaanxi Normal University, Xi’an, China
(1) School of Computer Science, Shaanxi Normal University, Xi’an, China
(2) Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, China
(3) Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou, China
https://ieeexplore.ieee.org/document/9079809/
Epitranscriptome
m¹A
site prediction
sequence features
genomic features
title ISGm1A: Integration of Sequence Features and Genomic Features to Improve the Prediction of Human m<sup>1</sup>A RNA Methylation Sites
topic Epitranscriptome
m¹A
site prediction
sequence features
genomic features
url https://ieeexplore.ieee.org/document/9079809/
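
The description above says that ISGm1A combines sequence features with genomic features and is evaluated by AUC under five-fold cross-validation. The following is a minimal sketch of those two ideas only, not the authors' implementation: the one-hot encoding, the window length, the example genomic values, and all function names (`one_hot_window`, `integrate_features`, `auc`, `k_fold_indices`) are assumptions made for illustration, and the random forest classifier itself is omitted.

```python
from typing import List, Sequence

NUCLEOTIDES = "ACGU"


def one_hot_window(window: str) -> List[int]:
    """One-hot encode an RNA window; unknown bases (e.g. N) become all zeros."""
    encoded: List[int] = []
    for base in window.upper().replace("T", "U"):
        encoded.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return encoded


def integrate_features(window: str, genomic: Sequence[float]) -> List[float]:
    """Concatenate sequence-derived and genomic features into one vector."""
    # Hypothetical layout: sequence features first, genomic features last.
    return [float(v) for v in one_hot_window(window)] + [float(g) for g in genomic]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 5) -> List[List[int]]:
    """Split sample indices into k folds of near-equal size (no shuffling)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


if __name__ == "__main__":
    # Made-up example: a 3-nt window centred on an adenosine plus two genomic values.
    print(integrate_features("GAC", [0.5, 1.0]))
    print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
    print(k_fold_indices(10, k=5))
```

In a full pipeline, each fold from `k_fold_indices` would serve once as the held-out set while a classifier is trained on the remaining folds, and `auc` would be computed on the held-out predictions.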