ProgClust: A progressive clustering method to identify cell populations

Identifying different types of cells in scRNA-seq data is a critical task in single-cell data analysis. In this paper, we propose a method called ProgClust for the decomposition of cell populations and detection of rare cells. ProgClust represents the single-cell data with clustering trees where a p...


Bibliographic Details
Main Authors: Han Li, Ying Wang, Yongxuan Lai, Feng Zeng, Fan Yang
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-04-01
Series: Frontiers in Genetics
Subjects: ScRNA-seq; single-cell clustering; ensemble clustering; rare cell; unbalanced data
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2023.1183099/full
collection DOAJ
description Identifying different types of cells in scRNA-seq data is a critical task in single-cell data analysis. In this paper, we propose a method called ProgClust for the decomposition of cell populations and detection of rare cells. ProgClust represents the single-cell data with clustering trees where a progressive searching method is designed to select cell population-specific genes and cluster cells. The obtained trees reveal the structure of both abundant cell populations and rare cell populations. Additionally, it can automatically determine the number of clusters. Experimental results show that ProgClust outperforms the baseline method and is capable of accurately identifying both common and rare cells. Moreover, when applied to real unlabeled data, it reveals potential cell subpopulations which provides clues for further exploration. In summary, ProgClust shows potential in identifying subpopulations of complex single-cell data.
first_indexed 2024-04-09T19:17:19Z
format Article
id doaj.art-e02ba0763a5649f997c721c624067099
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-04-09T19:17:19Z
publishDate 2023-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-e02ba0763a5649f997c721c624067099 2023-04-06T05:02:29Z eng
Frontiers Media S.A., Frontiers in Genetics, ISSN 1664-8021, 2023-04-01, volume 14, DOI 10.3389/fgene.2023.1183099, article 1183099
ProgClust: A progressive clustering method to identify cell populations
Han Li: Department of Automation, Xiamen University, Xiamen, China
Ying Wang: Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Xiamen Key Lab Big Data Intelligent Anal and Decis, Xiamen, China
Yongxuan Lai: School of Informatics, Xiamen University, Xiamen, China
Feng Zeng: Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Xiamen Key Lab Big Data Intelligent Anal and Decis, Xiamen, China
Fan Yang: Department of Automation, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China; Xiamen Key Lab Big Data Intelligent Anal and Decis, Xiamen, China
https://www.frontiersin.org/articles/10.3389/fgene.2023.1183099/full
ScRNA-seq; single-cell clustering; ensemble clustering; rare cell; unbalanced data
title ProgClust: A progressive clustering method to identify cell populations
topic ScRNA-seq
single-cell clustering
ensemble clustering
rare cell
unbalanced data
url https://www.frontiersin.org/articles/10.3389/fgene.2023.1183099/full