An Improved Method for Identification of Pre-miRNA in Drosophila
Identification of microRNAs is important in studies of the regulation of gene expression in many biological processes. In this study, we developed an improved method for identification of microRNAs in Drosophila. We used the iLearn, PyFeat, and Pse-in-One methods to extract the features and then used M...
Main Authors: | Tieying Yu, Min Chen, Chunde Wang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | microRNA, iLearn, PyFeat, Pse-in-One, MRMD2.0, t-SNE |
Online Access: | https://ieeexplore.ieee.org/document/9036972/ |
_version_ | 1818735840106381312 |
---|---|
author | Tieying Yu Min Chen Chunde Wang |
author_facet | Tieying Yu Min Chen Chunde Wang |
author_sort | Tieying Yu |
collection | DOAJ |
description | Identification of microRNAs is important in studies of the regulation of gene expression in many biological processes. In this study, we developed an improved method for identification of microRNAs in Drosophila. We used the iLearn, PyFeat, and Pse-in-One methods to extract features, then used Max-Relevance-Max-Distance (MRMD2.0) and t-Distributed Stochastic Neighbour Embedding (t-SNE) to reduce the dimensionality of the features, and the random forest classifier in Weka to identify miRNAs. With this method, we found that the discriminative features for identification of pre-miRNAs were, in Drosophila melanogaster, the occurrences of G_GUG and C_AGU when the value of the feature vector was greater than 2, and in Drosophila pseudoobscura, the 4-tuple nucleotide composition and the occurrence of 4-length neighbouring nucleic acids when the value of the feature vector was less than 0.02. These vectors covered all compositional information or the frequency of bases. The classification accuracy was 95.7% and 93.6%, the precision was 95.8% and 93.6%, and the recall was 95.7% and 93.6% in Drosophila melanogaster and Drosophila pseudoobscura, respectively, which are higher than those reported in previous studies. |
first_indexed | 2024-12-18T00:27:39Z |
format | Article |
id | doaj.art-fb6f8856e1e842d39095ba123a758f85 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-18T00:27:39Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-fb6f8856e1e842d39095ba123a758f85; 2022-12-21T21:27:12Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 52173; 52180; 10.1109/ACCESS.2020.2980897; 9036972; An Improved Method for Identification of Pre-miRNA in Drosophila; Tieying Yu (0), https://orcid.org/0000-0001-9626-6628; Min Chen (1); Chunde Wang (2); Chinese Academy of Sciences, Yantai Institute of Coastal Zone Research, Yantai, China; Chinese Academy of Sciences, Yantai Institute of Coastal Zone Research, Yantai, China; Chinese Academy of Sciences, Yantai Institute of Coastal Zone Research, Yantai, China; Identification of microRNAs is important in studies of the regulation of gene expression in many biological processes. In this study, we developed an improved method for identification of microRNAs in Drosophila. We used the iLearn, PyFeat, and Pse-in-One methods to extract features, then used Max-Relevance-Max-Distance (MRMD2.0) and t-Distributed Stochastic Neighbour Embedding (t-SNE) to reduce the dimensionality of the features, and the random forest classifier in Weka to identify miRNAs. With this method, we found that the discriminative features for identification of pre-miRNAs were, in Drosophila melanogaster, the occurrences of G_GUG and C_AGU when the value of the feature vector was greater than 2, and in Drosophila pseudoobscura, the 4-tuple nucleotide composition and the occurrence of 4-length neighbouring nucleic acids when the value of the feature vector was less than 0.02. These vectors covered all compositional information or the frequency of bases. The classification accuracy was 95.7% and 93.6%, the precision was 95.8% and 93.6%, and the recall was 95.7% and 93.6% in Drosophila melanogaster and Drosophila pseudoobscura, respectively, which are higher than those reported in previous studies.; https://ieeexplore.ieee.org/document/9036972/; microRNA; iLearn; PyFeat; Pse-in-One; MRMD2.0; t-SNE |
spellingShingle | Tieying Yu Min Chen Chunde Wang An Improved Method for Identification of Pre-miRNA in Drosophila IEEE Access microRNA iLearn PyFeat Pse-in-One MRMD2.0 t-SNE |
title | An Improved Method for Identification of Pre-miRNA in Drosophila |
title_sort | improved method for identification of pre mirna in italic drosophila italic |
topic | microRNA iLearn PyFeat Pse-in-One MRMD2.0 t-SNE |
url | https://ieeexplore.ieee.org/document/9036972/ |
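
The description above names two kinds of sequence features: the 4-tuple nucleotide composition and the occurrences of gapped patterns such as G_GUG and C_AGU. The following is a minimal Python sketch of how such features could be computed. It is not the authors' code and does not call iLearn, PyFeat, or Pse-in-One. The function names `kmer_composition` and `count_gapped`, the example sequence, and the reading of `_` as "any single base" are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA input is converted by mapping T to U


def kmer_composition(seq, k=4):
    """Return the relative frequency of every k-mer over ACGU in seq.

    With k=4 this gives a 256-dimensional "4-tuple nucleotide composition".
    Windows containing characters outside ACGU are skipped.
    """
    seq = seq.upper().replace("T", "U")
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(w for w in windows if set(w) <= set(ALPHABET))
    total = sum(counts.values())
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product(ALPHABET, repeat=k)
    }


def count_gapped(seq, pattern):
    """Count windows matching pattern, where '_' matches any single base.

    Assumption: 'G_GUG' is read as G, any base, then GUG.
    """
    seq = seq.upper().replace("T", "U")
    n = len(pattern)
    return sum(
        all(p == "_" or p == c for p, c in zip(pattern, seq[i:i + n]))
        for i in range(len(seq) - n + 1)
    )


# Hypothetical toy sequence, not taken from the study's data.
example = "GAGUGCCAGU"
composition = kmer_composition(example, k=4)
g_gug = count_gapped(example, "G_GUG")
c_agu = count_gapped(example, "C_AGU")
```

Feature vectors built this way could then be passed to a dimensionality reduction step and a random forest classifier, as the description outlines. The thresholds it reports (greater than 2, less than 0.02) refer to the authors' own feature values and are not reproduced by this sketch.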