SampleQC: robust multivariate, multi-cell type, multi-sample quality control for single-cell data

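Abstract Quality control (QC) is a critical component of single-cell RNA-seq (scRNA-seq) processing pipelines. Current approaches to QC implicitly assume that datasets are comprised of one cell type, potentially resulting in biased exclusion of rare cell types. We introduce SampleQC, which robustly fits a Gaussian mixture model across multiple samples, improves sensitivity, and reduces bias compared to current approaches. We show via simulations that SampleQC is less susceptible to exclusion of rarer cell types. We also demonstrate SampleQC on a complex real dataset (867k cells over 172 samples). SampleQC is general, is implemented in R, and could be applied to other data types.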

Bibliographic Details
Main Authors: Will Macnair, Mark Robinson
Format: Article
Language: English
Published: BMC 2023-02-01
Series: Genome Biology
Subjects: Single cell; Single-cell RNA-seq; Quality control
Online Access: https://doi.org/10.1186/s13059-023-02859-3
author Will Macnair
Mark Robinson
author_facet Will Macnair
Mark Robinson
author_sort Will Macnair
collection DOAJ
description Abstract Quality control (QC) is a critical component of single-cell RNA-seq (scRNA-seq) processing pipelines. Current approaches to QC implicitly assume that datasets are comprised of one cell type, potentially resulting in biased exclusion of rare cell types. We introduce SampleQC, which robustly fits a Gaussian mixture model across multiple samples, improves sensitivity, and reduces bias compared to current approaches. We show via simulations that SampleQC is less susceptible to exclusion of rarer cell types. We also demonstrate SampleQC on a complex real dataset (867k cells over 172 samples). SampleQC is general, is implemented in R, and could be applied to other data types.
first_indexed 2024-04-10T15:44:20Z
format Article
id doaj.art-8b268e80edcc45ecae31ce50cb52d616
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-04-10T15:44:20Z
publishDate 2023-02-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-8b268e80edcc45ecae31ce50cb52d616 | 2023-02-12T12:13:43Z | eng | BMC | Genome Biology | 1474-760X | 2023-02-01 | 241122 | 10.1186/s13059-023-02859-3 | SampleQC: robust multivariate, multi-cell type, multi-sample quality control for single-cell data | Will Macnair 0 | Mark Robinson 1 | 0: Department of Molecular Life Sciences, University of Zürich | 1: Department of Molecular Life Sciences, University of Zürich | Abstract Quality control (QC) is a critical component of single-cell RNA-seq (scRNA-seq) processing pipelines. Current approaches to QC implicitly assume that datasets are comprised of one cell type, potentially resulting in biased exclusion of rare cell types. We introduce SampleQC, which robustly fits a Gaussian mixture model across multiple samples, improves sensitivity, and reduces bias compared to current approaches. We show via simulations that SampleQC is less susceptible to exclusion of rarer cell types. We also demonstrate SampleQC on a complex real dataset (867k cells over 172 samples). SampleQC is general, is implemented in R, and could be applied to other data types. | https://doi.org/10.1186/s13059-023-02859-3 | Single cell | Single-cell RNA-seq | Quality control
title SampleQC: robust multivariate, multi-cell type, multi-sample quality control for single-cell data
topic Single cell
Single-cell RNA-seq
Quality control
url https://doi.org/10.1186/s13059-023-02859-3