Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads
Abstract. Background: There is growing interest in retained introns in a variety of disease contexts including cancer and aging. Many software tools have been developed to detect retained introns from short RNA-seq reads, but reliable detection is complicated by overlapping genes and transcripts as we...
Main Authors: | Julianne K. David, Sean K. Maden, Mary A. Wood, Reid F. Thompson, Abhinav Nellore |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2022-11-01 |
Series: | Genome Biology |
Subjects: | RNA-seq, Splicing, Intron retention |
Online Access: | https://doi.org/10.1186/s13059-022-02789-6 |
_version_ | 1828098542783168512 |
---|---|
author | Julianne K. David; Sean K. Maden; Mary A. Wood; Reid F. Thompson; Abhinav Nellore |
collection | DOAJ |
description | Abstract. Background: There is growing interest in retained introns in a variety of disease contexts including cancer and aging. Many software tools have been developed to detect retained introns from short RNA-seq reads, but reliable detection is complicated by overlapping genes and transcripts as well as the presence of unprocessed or partially processed RNAs. Results: We compared introns detected by 8 tools using short RNA-seq reads with introns observed in long RNA-seq reads from the same biological specimens. We found significant disagreement among tools (Fleiss’ $\kappa = 0.113$) such that 47.7% of all detected intron retentions were not called by more than one tool. We also observed poor performance of all tools, with none achieving an F1-score greater than 0.26, and qualitatively different behaviors between general-purpose alternative splicing detection tools and tools confined to retained intron detection. Conclusions: Short-read tools detect intron retention with poor recall and precision, calling into question the completeness and validity of a large percentage of putatively retained introns called by commonly used methods. |
first_indexed | 2024-04-11T08:03:09Z |
format | Article |
id | doaj.art-ea62db3678a941dabe74f0c088a5f2e6 |
institution | Directory Open Access Journal |
issn | 1474-760X |
language | English |
last_indexed | 2024-04-11T08:03:09Z |
publishDate | 2022-11-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj.art-ea62db3678a941dabe74f0c088a5f2e6; 2022-12-22T04:35:39Z; eng; BMC; Genome Biology; 1474-760X; 2022-11-01; 231122; 10.1186/s13059-022-02789-6; Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads; Julianne K. David (0), Sean K. Maden (1), Mary A. Wood (2), Reid F. Thompson (3), Abhinav Nellore (4); affiliation for authors 0-4: Computational Biology Program, Oregon Health & Science University; https://doi.org/10.1186/s13059-022-02789-6; RNA-seq; Splicing; Intron retention |
title | Retained introns in long RNA-seq reads are not reliably detected in sample-matched short reads |
topic | RNA-seq; Splicing; Intron retention |
url | https://doi.org/10.1186/s13059-022-02789-6 |
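
The abstract in this record summarizes tool performance with two statistics: an F1-score per tool (measured against introns observed in long reads) and Fleiss’ κ for agreement among tools. The following is a minimal Python sketch of how such statistics could be computed. It is not the authors' pipeline: the `(chromosome, start, end)` intron key, the function names, and the binary retained/not-retained encoding are all assumptions made for illustration.

```python
import math
from typing import Sequence, Set, Tuple

# Hypothetical intron identifier: (chromosome, start, end).
Intron = Tuple[str, int, int]


def precision_recall_f1(called: Set[Intron], truth: Set[Intron]) -> Tuple[float, float, float]:
    """Score one tool's retained-intron calls against a reference set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def fleiss_kappa_binary(calls: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for binary calls.

    calls[i][t] is 1 if tool t called intron i retained, else 0.
    Every row must have the same number of tools (at least 2).
    """
    n_items = len(calls)
    n_raters = len(calls[0])
    total_pos = 0
    agreement_sum = 0.0
    for row in calls:
        pos = sum(row)
        neg = n_raters - pos
        total_pos += pos
        # Fraction of tool pairs that agree on this intron.
        agreement_sum += (pos * pos + neg * neg - n_raters) / (n_raters * (n_raters - 1))
    observed = agreement_sum / n_items
    p_pos = total_pos / (n_items * n_raters)
    expected = p_pos ** 2 + (1 - p_pos) ** 2
    if expected == 1.0:
        # All tools gave the same call for every intron; kappa is undefined.
        return math.nan
    return (observed - expected) / (1 - expected)
```

Representing calls as sets makes the true-positive count a simple intersection; a real comparison would also need to settle how intron coordinates are matched between short-read tools and long-read alignments, which this sketch does not address.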