Full-length isoform concatenation sequencing to resolve cancer transcriptome complexity

Abstract. Background: Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing has utility in providing a cost-effective approach for evaluating isoform expres...

Bibliographic Details
Main Authors:	Saranga Wijeratne, Maria E. Hernandez Gonzalez, Kelli Roach, Katherine E. Miller, Kathleen M. Schieffer, James R. Fitch, Jeffrey Leonard, Peter White, Benjamin J. Kelly, Catherine E. Cottrell, Elaine R. Mardis, Richard K. Wilson, Anthony R. Miller
Format:	Article
Language:	English
Published:	BMC 2024-01-01
Series:	BMC Genomics
Subjects:	Long-read RNA sequencing Concatenation Isoform discovery Tumor transcriptome
Online Access:	https://doi.org/10.1186/s12864-024-10021-x

collection	DOAJ
description	Abstract. Background: Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing has utility in providing a cost-effective approach for evaluating isoform expression, although short-read assembly displays limitations in the accurate inference of full-length transcripts. Long-read RNA sequencing (Iso-Seq), using the Pacific Biosciences (PacBio) platform, can overcome such limitations by providing full-length isoform sequence resolution which requires no read assembly and represents native expressed transcripts. A constraint of the Iso-Seq protocol is due to fewer reads output per instrument run, which, as an example, can consequently affect the detection of lowly expressed transcripts. To address these deficiencies, we developed a concatenation workflow, PacBio Full-Length Isoform Concatemer Sequencing (PB_FLIC-Seq), designed to increase the number of unique, sequenced PacBio long-reads thereby improving overall detection of unique isoforms. In addition, we anticipate that the increase in read depth will help improve the detection of moderate to low-level expressed isoforms. Results: In sequencing a commercial reference (Spike-In RNA Variants; SIRV) with known isoform complexity we demonstrated a 3.4-fold increase in read output per run and improved SIRV recall when using the PB_FLIC-Seq method compared to the same samples processed with the Iso-Seq protocol. We applied this protocol to a translational cancer case, also demonstrating the utility of the PB_FLIC-Seq method for identifying differential full-length isoform expression in a pediatric diffuse midline glioma compared to its adjacent non-malignant tissue. Our data analysis revealed increased expression of extracellular matrix (ECM) genes within the tumor sample, including an isoform of the Secreted Protein Acidic and Cysteine Rich (SPARC) gene that was expressed 11,676-fold higher than in the adjacent non-malignant tissue. Finally, by using the PB_FLIC-Seq method, we detected several cancer-specific novel isoforms. Conclusion: This work describes a concatenation-based methodology for increasing the number of sequenced full-length isoform reads on the PacBio platform, yielding improved discovery of expressed isoforms. We applied this workflow to profile the transcriptome of a pediatric diffuse midline glioma and adjacent non-malignant tissue. Our findings of cancer-specific novel isoform expression further highlight the importance of long-read sequencing for characterization of complex tumor transcriptomes.
first_indexed	2024-03-07T15:18:57Z
format	Article
id	doaj.art-64402f5c949841d693df43c099c66531
institution	Directory Open Access Journal
issn	1471-2164
language	English
last_indexed	2024-03-07T15:18:57Z
publishDate	2024-01-01
publisher	BMC
record_format	Article
series	BMC Genomics
spelling	doaj.art-64402f5c949841d693df43c099c66531 2024-03-05T17:46:19Z eng BMC BMC Genomics 1471-2164 2024-01-01 25 1 1 19 10.1186/s12864-024-10021-x Full-length isoform concatenation sequencing to resolve cancer transcriptome complexity
	Authors: Saranga Wijeratne [0], Maria E. Hernandez Gonzalez [1], Kelli Roach [2], Katherine E. Miller [3], Kathleen M. Schieffer [4], James R. Fitch [5], Jeffrey Leonard [6], Peter White [7], Benjamin J. Kelly [8], Catherine E. Cottrell [9], Elaine R. Mardis [10], Richard K. Wilson [11], Anthony R. Miller [12]
	Affiliations [0-5, 7-12]: The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children’s Hospital
	Affiliation [6]: Department of Neurosurgery, Nationwide Children’s Hospital
	https://doi.org/10.1186/s12864-024-10021-x
	Long-read RNA sequencing; Concatenation; Isoform discovery; Tumor transcriptome
title	Full-length isoform concatenation sequencing to resolve cancer transcriptome complexity
topic	Long-read RNA sequencing Concatenation Isoform discovery Tumor transcriptome
url	https://doi.org/10.1186/s12864-024-10021-x

Similar Items