Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering

Although many scRNA-seq analysis algorithms have been developed, their cell-clustering performance cannot be quantified because the “true” clusters are unknown. Using the transcriptomic heterogeneity of cell clusters as a reference, a “true” mRNA number matrix of individual cells was defined as ground truth. Based on the...

Bibliographic Details
Main Authors: Yunhe Liu, Aoshen Wu, Xueqing Peng, Xiaona Liu, Gang Liu, Lei Liu
Format: Article
Language: English
Published: MDPI AG 2021-07-01
Series: Life
Subjects: single cell, bioinformatics, simulation, clustering, cell type annotation
Online Access: https://www.mdpi.com/2075-1729/11/7/716
author Yunhe Liu
Aoshen Wu
Xueqing Peng
Xiaona Liu
Gang Liu
Lei Liu
author_sort Yunhe Liu
collection DOAJ
description Although many scRNA-seq analysis algorithms have been developed, their cell-clustering performance cannot be quantified because the “true” clusters are unknown. Using the transcriptomic heterogeneity of cell clusters as a reference, a “true” mRNA number matrix of individual cells was defined as ground truth. Based on this matrix and the actual data generation procedure, a simulation program (SSCRNA) for raw data was developed. The consistency between simulated and real data was then evaluated, and the impact of sequencing depth and analysis algorithms on clustering accuracy was quantified. The simulated data were highly consistent with the actual data. Among the normalization algorithms, the Gaussian normalization method was the recommended choice. Among the clustering algorithms, K-means clustering was more stable than K-means plus Louvain clustering. In conclusion, the scRNA simulation algorithm reproduces the actual data generation process, reveals the impact of parameters on classification, compares normalization and clustering algorithms, and provides novel insight into scRNA analyses.
first_indexed 2024-03-10T09:34:30Z
format Article
id doaj.art-999755b613dd49509e4ea656c1ee6ee4
institution Directory Open Access Journal
issn 2075-1729
language English
last_indexed 2024-03-10T09:34:30Z
publishDate 2021-07-01
publisher MDPI AG
record_format Article
series Life
spelling doaj.art-999755b613dd49509e4ea656c1ee6ee4; 2023-11-22T04:13:38Z; eng; MDPI AG; Life; 2075-1729; 2021-07-01; 11; 7; 716; 10.3390/life11070716; Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering; Yunhe Liu (0), Aoshen Wu (1), Xueqing Peng (2), Xiaona Liu (3), Gang Liu (4), Lei Liu (5); affiliations 0-5: Institute of Biomedical Sciences, Fudan University, Shanghai 200000, China; https://www.mdpi.com/2075-1729/11/7/716; single cell; bioinformatics; simulation; clustering; cell type annotation
title Single-Cell Transcriptome Profiling Simulation Reveals the Impact of Sequencing Parameters and Algorithms on Clustering
topic single cell
bioinformatics
simulation
clustering
cell type annotation
url https://www.mdpi.com/2075-1729/11/7/716