ZINBMM: a general mixture model for simultaneous clustering and gene selection using single-cell transcriptomic data

Abstract Clustering is a critical component of single-cell RNA sequencing (scRNA-seq) data analysis and can help reveal cell types and infer cell lineages. Despite considerable successes, there are few methods tailored to investigating cluster-specific genes contributing to cell heterogeneity, which...


Bibliographic Details
Main Authors: Yang Li, Mingcong Wu, Shuangge Ma, Mengyun Wu
Format: Article
Language: English
Published: BMC 2023-09-01
Series: Genome Biology
Subjects: Clustering analysis, Gene selection, ScRNA-seq data
Online Access: https://doi.org/10.1186/s13059-023-03046-0
collection DOAJ
description Abstract Clustering is a critical component of single-cell RNA sequencing (scRNA-seq) data analysis and can help reveal cell types and infer cell lineages. Despite considerable successes, there are few methods tailored to investigating cluster-specific genes contributing to cell heterogeneity, which can promote biological understanding of cell heterogeneity. In this study, we propose a zero-inflated negative binomial mixture model (ZINBMM) that simultaneously achieves effective scRNA-seq data clustering and gene selection. ZINBMM conducts a systemic analysis on raw counts, accommodating both batch effects and dropout events. Simulations and the analysis of five scRNA-seq datasets demonstrate the practical applicability of ZINBMM.
first_indexed 2024-03-10T17:44:12Z
id doaj.art-ba7f164345e546fba06cafda0d292351
institution Directory of Open Access Journals
issn 1474-760X
last_indexed 2024-03-10T17:44:12Z
spelling doaj.art-ba7f164345e546fba06cafda0d292351
Record timestamp: 2023-11-20T09:35:17Z
Language: eng
Publisher: BMC
Journal: Genome Biology, ISSN 1474-760X
Published: 2023-09-01, volume 24, issue 1, article 128
DOI: 10.1186/s13059-023-03046-0
Title: ZINBMM: a general mixture model for simultaneous clustering and gene selection using single-cell transcriptomic data
Authors and affiliations:
Yang Li, Center for Applied Statistics and School of Statistics, Renmin University of China
Mingcong Wu, Center for Applied Statistics and School of Statistics, Renmin University of China
Shuangge Ma, Department of Biostatistics, Yale University
Mengyun Wu, School of Statistics and Management, Shanghai University of Finance and Economics
Keywords: Clustering analysis; Gene selection; ScRNA-seq data
topic Clustering analysis
Gene selection
ScRNA-seq data