Structure-based whole-genome realignment reveals many novel noncoding RNAs

Recent genome-wide computational screens that search for conservation of RNA secondary structure in whole-genome alignments (WGAs) have predicted thousands of structural noncoding RNAs (ncRNAs). The sensitivity of such approaches, however, is limited, due to their reliance on sequence-based whole-ge...


Bibliographic Details
Main Authors: Will, Sebastian; Berger, Bonnie; Yu, Michael
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Cold Spring Harbor Laboratory Press 2013
Links: http://hdl.handle.net/1721.1/82919
https://orcid.org/0000-0002-2724-7228
author Will, Sebastian
Berger, Bonnie
Yu, Michael
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Will, Sebastian
Berger, Bonnie
Yu, Michael
author_sort Will, Sebastian
collection MIT
description Recent genome-wide computational screens that search for conservation of RNA secondary structure in whole-genome alignments (WGAs) have predicted thousands of structural noncoding RNAs (ncRNAs). The sensitivity of such approaches, however, is limited, due to their reliance on sequence-based whole-genome aligners, which regularly misalign structural ncRNAs. This suggests that many more structural ncRNAs may remain undetected. Structure-based alignment, which could increase the sensitivity, has been prohibitive for genome-wide screens due to its extreme computational costs. Breaking this barrier, we present the pipeline REAPR (RE-Alignment for Prediction of structural ncRNA), which efficiently realigns whole genomes based on RNA sequence and structure, thus allowing us to boost the performance of de novo ncRNA predictors, such as RNAz. Key to the pipeline's efficiency is the development of a novel banding technique for multiple RNA alignment. REAPR significantly outperforms the widely used predictors RNAz and EvoFold in genome-wide screens; in direct comparison to the most recent RNAz screen on D. melanogaster, REAPR predicts twice as many high-confidence ncRNA candidates. Moreover, modENCODE RNA-seq experiments confirm a substantial number of its predictions as transcripts. REAPR's advancement of de novo structural characterization of ncRNAs complements the identification of transcripts from rapidly accumulating RNA-seq data.
first_indexed 2024-09-23T12:40:00Z
format Article
id mit-1721.1/82919
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T12:40:00Z
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format dspace
spelling mit-1721.1/82919
2022-09-28T09:18:19Z
Structure-based whole-genome realignment reveals many novel noncoding RNAs
Will, Sebastian
Berger, Bonnie
Yu, Michael
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Mathematics
National Institutes of Health (U.S.) (Grant RO1GM081871)
2013-12-13T15:29:41Z
2013-01
2012-05
Article
http://purl.org/eprint/type/JournalArticle
1088-9051
http://hdl.handle.net/1721.1/82919
Will, S., M. Yu, and B. Berger. “Structure-based whole-genome realignment reveals many novel noncoding RNAs.” Genome Research 23, no. 6 (June 1, 2013): 1018-1027.
© 2013, Published by Cold Spring Harbor Laboratory Press
https://orcid.org/0000-0002-2724-7228
en_US
http://dx.doi.org/10.1101/gr.137091.111
Genome Research
http://creativecommons.org/licenses/by-nc/3.0/
application/pdf
Cold Spring Harbor Laboratory Press
title Structure-based whole-genome realignment reveals many novel noncoding RNAs
title_sort structure based whole genome realignment reveals many novel noncoding rnas
url http://hdl.handle.net/1721.1/82919
https://orcid.org/0000-0002-2724-7228