Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs
To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study,...
Main Authors: | Damien Ulveling, Marcel E Dinger, Claire Francastel, Florent Hubé |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2014-09-01 |
Series: | Frontiers in Genetics |
Subjects: | database, mRNA, ncRNA, exon, pseudogene, intron |
Online Access: | http://journal.frontiersin.org/Journal/10.3389/fgene.2014.00316/full |
_version_ | 1811211521015939072 |
---|---|
author | Damien Ulveling, Marcel E Dinger, Claire Francastel, Florent Hubé |
collection | DOAJ |
description | To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study, we first established an extremely selective workflow to define a highly refined database of lncRNAs which was used for comparison with mRNAs. Then using this highly selective collection of lncRNAs, we found the CG dinucleotide frequencies were clearly distinct. In addition, we showed that the bias in CG dinucleotide frequency was conserved in human and mouse genomes. We propose that this sequence feature will serve as a useful classifier in transcript classification pipelines. We also suggest that our refined database of ‘bona fide’ lncRNAs will be valuable for the discovery of other sequence characteristics distinct to lncRNAs. |
first_indexed | 2024-04-12T05:15:28Z |
format | Article |
id | doaj.art-fca375387c184f06952888cbc77bdb02 |
institution | Directory Open Access Journal |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-12T05:15:28Z |
publishDate | 2014-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-fca375387c184f06952888cbc77bdb02; 2022-12-22T03:46:39Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2014-09-01; 5; 10.3389/fgene.2014.00316; 107374; Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs; Damien Ulveling (UMR7216 Epigenetics and Cell Fate), Marcel E Dinger (The University of Queensland Diamantina Institute), Claire Francastel (UMR7216 Epigenetics and Cell Fate), Florent Hubé (UMR7216 Epigenetics and Cell Fate); http://journal.frontiersin.org/Journal/10.3389/fgene.2014.00316/full; database, mRNA, ncRNA, exon, pseudogene, intron |
title | Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs |
topic | database, mRNA, ncRNA, exon, pseudogene, intron |
url | http://journal.frontiersin.org/Journal/10.3389/fgene.2014.00316/full |
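
The description above names CG dinucleotide frequency as the feature that separates lncRNAs from mRNAs. As a rough illustration of what that quantity is, the following Python sketch counts overlapping dinucleotides in a single sequence and reports the share that are CG. It is not the authors' workflow: the function names `dinucleotide_frequencies` and `cg_frequency` are hypothetical, the handling of ambiguous bases is an assumption, and no classification threshold is implied.

```python
from collections import Counter


def dinucleotide_frequencies(seq: str) -> dict[str, float]:
    """Return the relative frequency of each overlapping dinucleotide.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Normalise case and treat RNA (U) and DNA (T) sequences alike.
    seq = seq.upper().replace("U", "T")
    # Slide a window of width 2 along the sequence (overlapping pairs).
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: pairs containing ambiguous bases (e.g. N) are skipped.
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    total = len(pairs)
    if total == 0:
        return {}
    counts = Counter(pairs)
    return {pair: count / total for pair, count in counts.items()}


def cg_frequency(seq: str) -> float:
    """Return the fraction of dinucleotides in seq that are CG."""
    return dinucleotide_frequencies(seq).get("CG", 0.0)
```

For example, `cg_frequency("ACGCGT")` considers the five pairs AC, CG, GC, CG, GT and returns 0.4. Comparing the distribution of this value across a set of lncRNAs and a set of mRNAs would be one way to explore the bias the record describes.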