Summary: | The identification of gene fusions promises to play an important role in personalized cancer treatment decisions. Many rare gene fusion events have been identified in fresh frozen solid tumors from common cancers using next-generation sequencing technology. However, the ability to detect transcripts from gene fusions in RNA isolated from formalin-fixed paraffin-embedded (FFPE) tumor tissues, which exist in very large sample repositories for which disease outcome is known, is still limited by the low complexity of FFPE libraries and the lack of appropriate bioinformatics methods. We sought to develop a bioinformatics method, named gFuse, to detect fusion transcripts in FFPE tumor tissues. gFuse uses an integrated, cohort-based strategy to examine single-end 50 base pair (bp) reads from FFPE RNA-Sequencing (RNA-Seq) datasets; we applied it to two breast cancer cohorts of 136 and 76 patients. In total, 118 fusion events were detected transcriptome-wide at base-pair resolution across the 212 samples. We selected 77 candidate fusions based on their biological relevance to cancer and supported 61% of these using TaqMan assays. Direct sequencing of 19 of the fusion sequences identified by TaqMan confirmed all 19. Three unique fused gene pairs were recurrent across the 212 patients, with 6, 3, and 2 individuals harboring these fusions, respectively. We show here that a high frequency of fusion transcripts detected at the whole-transcriptome level correlates with poor outcome (P<0.0005) in human breast cancer patients. This study demonstrates the ability to detect fusion transcripts as biomarkers in archival FFPE tissues, and the potential prognostic value of the fusion transcripts detected.
|