Prediction of MicroRNA Precursors Using Parsimonious Feature Sets
MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance o...
Main Authors: | Petra Stepanowsky, Eric Levy, Jihoon Kim, Xiaoqian Jiang, Lucila Ohno-Machado |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2014-01-01 |
Series: | Cancer Informatics |
Online Access: | https://doi.org/10.4137/CIN.S13877 |
_version_ | 1818684123839987712 |
---|---|
author | Petra Stepanowsky Eric Levy Jihoon Kim Xiaoqian Jiang Lucila Ohno-Machado |
author_sort | Petra Stepanowsky |
collection | DOAJ |
description | MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance of different models to predict novel miRNAs varies with the features chosen as predictors. However, no study has systematically compared published feature sets. We constructed a comprehensive feature set using the minimum free energy of the secondary structure of precursor miRNAs, a set of nucleotide-structure triplets, and additional extracted sequence and structure characteristics. We then compared the predictive value of our comprehensive feature set to those from three previously published studies, using logistic regression and random forest classifiers. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer. |
first_indexed | 2024-12-17T10:45:38Z |
format | Article |
id | doaj.art-c1052aefecd84fdaa6fd305010abb499 |
institution | Directory Open Access Journal |
issn | 1176-9351 |
language | English |
last_indexed | 2024-12-17T10:45:38Z |
publishDate | 2014-01-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Cancer Informatics |
spelling | doaj.art-c1052aefecd84fdaa6fd305010abb499; 2022-12-21T21:52:08Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2014-01-01; 13; s1; 10.4137/CIN.S13877; Prediction of MicroRNA Precursors Using Parsimonious Feature Sets; Petra Stepanowsky (Bioinformatics Research Group, University of Applied Sciences, Upper Austria, Hagenberg, Austria); Eric Levy, Jihoon Kim, Xiaoqian Jiang, Lucila Ohno-Machado (Division of Biomedical Informatics, University of California San Diego, La Jolla, CA, USA); https://doi.org/10.4137/CIN.S13877 |
title | Prediction of MicroRNA Precursors Using Parsimonious Feature Sets |
url | https://doi.org/10.4137/CIN.S13877 |