Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads

Abstract Background There is growing interest in retained introns in a variety of disease contexts including cancer and aging. Many software tools have been developed to detect retained introns from short RNA-seq reads, but reliable detection is complicated by overlapping genes and transcripts as we...

Bibliographic Details
Main Authors: Julianne K. David, Sean K. Maden, Mary A. Wood, Reid F. Thompson, Abhinav Nellore
Format: Article
Language: English
Published: BMC 2022-11-01
Series: Genome Biology
Subjects: RNA-seq, Splicing, Intron retention
Online Access: https://doi.org/10.1186/s13059-022-02789-6
author Julianne K. David
Sean K. Maden
Mary A. Wood
Reid F. Thompson
Abhinav Nellore
collection DOAJ
description Abstract. Background: There is growing interest in retained introns in a variety of disease contexts including cancer and aging. Many software tools have been developed to detect retained introns from short RNA-seq reads, but reliable detection is complicated by overlapping genes and transcripts as well as the presence of unprocessed or partially processed RNAs. Results: We compared introns detected by 8 tools using short RNA-seq reads with introns observed in long RNA-seq reads from the same biological specimens. We found significant disagreement among tools (Fleiss’ κ = 0.113) such that 47.7% of all detected intron retentions were not called by more than one tool. We also observed poor performance of all tools, with none achieving an F1-score greater than 0.26, and qualitatively different behaviors between general-purpose alternative splicing detection tools and tools confined to retained intron detection. Conclusions: Short-read tools detect intron retention with poor recall and precision, calling into question the completeness and validity of a large percentage of putatively retained introns called by commonly used methods.
first_indexed 2024-04-11T08:03:09Z
format Article
id doaj.art-ea62db3678a941dabe74f0c088a5f2e6
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-04-11T08:03:09Z
publishDate 2022-11-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-ea62db3678a941dabe74f0c088a5f2e6 | 2022-12-22T04:35:39Z | eng | BMC | Genome Biology | 1474-760X | 2022-11-01 | 231122 | 10.1186/s13059-022-02789-6 | Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads | Julianne K. David (0); Sean K. Maden (1); Mary A. Wood (2); Reid F. Thompson (3); Abhinav Nellore (4) | Computational Biology Program, Oregon Health & Science University (affiliation listed for each of the five authors) | https://doi.org/10.1186/s13059-022-02789-6 | RNA-seq; Splicing; Intron retention
title Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads
topic RNA-seq
Splicing
Intron retention
url https://doi.org/10.1186/s13059-022-02789-6