Systematic curation of miRBase annotation using integrated small RNA high-throughput sequencing data for C. elegans and Drosophila

Bibliographic Details
Main Authors: Xiangfeng Wang, Shirley Liu
Format: Article
Language: English
Published: Frontiers Media S.A. 2011-05-01
Series: Frontiers in Genetics
Subjects: microRNA, Deep sequencing, miRBase curation
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00025/full
author Xiangfeng Wang
Shirley Liu
author_facet Xiangfeng Wang
Shirley Liu
author_sort Xiangfeng Wang
collection DOAJ
description MicroRNAs (miRNAs) are a class of 20 to 23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs by the miRBase database has largely relied on computational approaches. As a result, many miRBase entries lack experimental validation, and discrepancies between miRBase annotation and actual miRNA sequences are often observed. In this study, we integrated the small RNA sequencing (smRNA-seq) datasets in Caenorhabditis elegans and Drosophila melanogaster and devised an analytical pipeline, coupled with detailed manual inspection, to curate miRNA annotation in miRBase systematically. Our analysis reveals that 19 (17.0%) and 51 (31.3%) of the miRNA entries with detectable smRNA-seq reads have mature sequence discrepancies in C. elegans and D. melanogaster, respectively. These discrepancies frequently occur either for conserved miRNA families whose mature sequences were predicted from their homologous counterparts in other species, or for miRNAs whose precursor miRNA (pre-miRNA) hairpins produce an abundance of multiple miRNA isoforms or variants. Our analysis shows that while Drosophila pre-miRNAs, on average, produce less than 60% accurate mature miRNA reads in addition to their 5’ and 3’ variant isoforms, the precision of miRNA processing in C. elegans is much higher, at over 90%. Based on the revised miRNA sequences, we analyzed expression patterns of the more conserved (MC) and less conserved (LC) miRNAs and found that, whereas MC miRNAs are often co-expressed at multiple developmental stages, LC miRNAs tend to be expressed specifically at fewer stages.
first_indexed 2024-12-10T07:11:59Z
format Article
id doaj.art-41dd387dc65d43e29f83b18ccb2c63e6
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-10T07:11:59Z
publishDate 2011-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-41dd387dc65d43e29f83b18ccb2c63e6 | 2022-12-22T01:58:02Z | eng | Frontiers Media S.A. | Frontiers in Genetics | 1664-8021 | 2011-05-01 | 2 | 10.3389/fgene.2011.00025 | 11107 | Systematic curation of miRBase annotation using integrated small RNA high-throughput sequencing data for C. elegans and Drosophila | Xiangfeng Wang (Dana-Farber Cancer Institute; University of Arizona); Shirley Liu (Dana-Farber Cancer Institute) | http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00025/full | microRNA; Deep sequencing; miRBase curation
title Systematic curation of miRBase annotation using integrated small RNA high-throughput sequencing data for C. elegans and Drosophila
topic microRNA
Deep sequencing
miRBase curation
url http://journal.frontiersin.org/Journal/10.3389/fgene.2011.00025/full