Length biases in single-cell RNA sequencing of pre-mRNA

Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data....

Bibliographic Details
Main Authors: Gennady Gorin, Lior Pachter
Format: Article
Language: English
Published: Elsevier 2023-03-01
Series: Biophysical Reports
Online Access: http://www.sciencedirect.com/science/article/pii/S2667074722000544
author Gennady Gorin
Lior Pachter
collection DOAJ
description Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data. To explain this trend, we propose a length-based model of capture bias, which may produce false-positive observations. We solve this model and use it to find concordant parameter trends as well as systematic, mechanistically interpretable technical and biological differences in paired data sets.
first_indexed 2024-04-10T23:33:16Z
format Article
id doaj.art-c80b90a16f274bfa9f0f58977cb70063
institution Directory of Open Access Journals
issn 2667-0747
language English
last_indexed 2024-04-10T23:33:16Z
publishDate 2023-03-01
publisher Elsevier
record_format Article
series Biophysical Reports
spelling doaj.art-c80b90a16f274bfa9f0f58977cb70063
2023-01-12T04:19:57Z
eng
Elsevier
Biophysical Reports
2667-0747
2023-03-01
3 1 100097
Length biases in single-cell RNA sequencing of pre-mRNA
Gennady Gorin (0), Lior Pachter (1)
0: Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California
1: Division of Biology and Biological Engineering, Pasadena, California; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California; Corresponding author
Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data. To explain this trend, we propose a length-based model of capture bias, which may produce false-positive observations. We solve this model and use it to find concordant parameter trends as well as systematic, mechanistically interpretable technical and biological differences in paired data sets.
http://www.sciencedirect.com/science/article/pii/S2667074722000544
title Length biases in single-cell RNA sequencing of pre-mRNA
url http://www.sciencedirect.com/science/article/pii/S2667074722000544