Evaluation of different reference based annotation strategies using RNA-Seq - a case study in Drosophila pseudoobscura.

RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist and no standard strategy to obtain a reliable annotation is yet established. We tested different combinations of the most commonly used refe...

Bibliographic Details
Main Authors: Nicola Palmieri, Viola Nolte, Anton Suvorov, Carolin Kosiol, Christian Schlötterer
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3463616?pdf=render
collection DOAJ
description RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist and no standard strategy to obtain a reliable annotation is yet established. We tested different combinations of the most commonly used reference-based alignment tools (TopHat, GSNAP) in combination with two frequently used reference-based assemblers (Cufflinks, Scripture) and evaluated the potential of RNA-Seq to improve the annotation of Drosophila pseudoobscura. While GSNAP maps a higher proportion of reads, TopHat resulted in a more accurate annotation when used in combination with Cufflinks. Scripture had the lowest sensitivity. Interestingly, after subsampling to the same coverage for GSNAP and TopHat, we find that both mappers have similar performance, implying that the advantage of TopHat is mainly an artifact of the lower coverage. Overall, we observed a low concordance among the different approaches tested both at junction and isoform levels. Using data from both sexes of two adult strains of D. pseudoobscura we detected alternative splicing for about 30% of the FlyBase multiple-exon genes. Moreover, we extended the boundaries for 6523 genes (about 40%). We annotated 669 new genes, 45% of them with splicing evidence. Most of the new genes are located on unassembled contigs, reflecting their incomplete annotation. Finally, we identified 99 additional new genes that are not represented in the current genome contigs of D. pseudoobscura, probably due to location in genomic regions that are difficult to assemble (e.g. heterochromatic regions).
id doaj.art-ab1985be76744b88b4de56808322fc64
institution Directory of Open Access Journals
issn 1932-6203
spelling doaj.art-ab1985be76744b88b4de56808322fc64 | 2022-12-21T19:04:41Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2012-01-01 | 7(10): e46415 | 10.1371/journal.pone.0046415