Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies

Bibliographic Details
Main Authors: Swanson, Lucas; Robertson, Gordon; Mungall, Karen L.; Butterfield, Yaron S.; Chiu, Readman; Corbett, Richard D.; Docking, T. R.; Hogge, Donna; Jackman, Shaun D.; Moore, Richard A.; Mungall, Andrew J.; Nip, Ka Ming; Parker, Jeremy D. K.; Qian, Jenny Q.; Raymond, Anthony; Sung, Sandy; Tam, Angela; Thiessen, Nina; Varhol, Richard; Wang, Sherry; Yorukoglu, Deniz; Zhao, YongJun; Hoodless, Pamela A.; Sahinalp, S. C.; Karsan, Aly; Birol, Inanc
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: BioMed Central Ltd. 2013
Online Access: http://hdl.handle.net/1721.1/81361
https://orcid.org/0000-0003-2315-0768
Description
Summary: Background: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. Results: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. Conclusions: Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.