An Improved Method for Identification of Pre-miRNA in Drosophila

Identification of microRNAs is important in studies of regulation of gene expression in many biological processes. In this study, we developed an improved method for identification of microRNAs in Drosophila. We used the iLearn, PyFeat, and Pse-in-One methods to extract the features and then used M...

Bibliographic Details
Main Authors: Tieying Yu, Min Chen, Chunde Wang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9036972/
_version_ 1818735840106381312
author Tieying Yu
Min Chen
Chunde Wang
collection DOAJ
description Identification of microRNAs is important in studies of the regulation of gene expression in many biological processes. In this study, we developed an improved method for identifying microRNAs in Drosophila. We used the iLearn, PyFeat, and Pse-in-One methods to extract features, then used Max-Relevance-Max-Distance (MRMD2.0) and t-Distributed Stochastic Neighbour Embedding (t-SNE) to reduce the dimensionality of the features, and finally used the random forest classifier in Weka to identify miRNAs. With this method, we found that the discriminative features for identifying pre-miRNAs were, in Drosophila melanogaster, the occurrences of G_GUG and C_AGU when the value of the feature vector was greater than 2, and, in Drosophila pseudoobscura, the 4-tuple nucleotide composition and the occurrence of 4-length neighbouring nucleic acids when the value of the feature vector was less than 0.02. These vectors covered all compositional information or the frequency of bases. The classification accuracy was 95.7% and 93.6%, the precision was 95.8% and 93.6%, and the recall was 95.7% and 93.6% in Drosophila melanogaster and Drosophila pseudoobscura, respectively, all higher than the values reported in previous studies.
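The "4-tuple nucleotide composition" named in the description can be illustrated with a minimal Python sketch. It assumes the feature is the relative frequency of each of the 256 possible 4-mers over the alphabet A, C, G, U, counted with a sliding window of step 1. The function name `kmer_composition` and the normalization are illustrative assumptions, not the exact definitions used by iLearn, PyFeat, or Pse-in-One.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style input is converted below


def kmer_composition(seq, k=4):
    """Return the relative frequency of every k-mer in seq (hypothetical helper).

    Windows containing characters outside ACGU (for example N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    # One entry per possible k-mer, so every sequence yields the same 4**k features.
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid window: return an all-zero vector.
        return {kmer: 0.0 for kmer in counts}
    return {kmer: n / total for kmer, n in counts.items()}


features = kmer_composition("GGUGAGUG")
print(len(features), features["GGUG"])
```

A fixed-length vector of this kind could then be passed to a feature-selection or dimensionality-reduction step before classification, as the description outlines.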
first_indexed 2024-12-18T00:27:39Z
format Article
id doaj.art-fb6f8856e1e842d39095ba123a758f85
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-18T00:27:39Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-fb6f8856e1e842d39095ba123a758f85; 2022-12-21T21:27:12Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 52173; 52180; 10.1109/ACCESS.2020.2980897; 9036972; An Improved Method for Identification of Pre-miRNA in <italic>Drosophila</italic>; Tieying Yu (0), https://orcid.org/0000-0001-9626-6628; Min Chen (1); Chunde Wang (2); Chinese Academy of Sciences, Yantai Institute of Coastal Zone Research, Yantai, China (0, 1, 2); https://ieeexplore.ieee.org/document/9036972/; microRNA; iLearn; PyFeat; Pse-in-One; MRMD2.0; t-SNE
title An Improved Method for Identification of Pre-miRNA in <italic>Drosophila</italic>
topic microRNA
iLearn
PyFeat
Pse-in-One
MRMD2.0
t-SNE
url https://ieeexplore.ieee.org/document/9036972/