A computational method to aid the design and analysis of single cell RNA-seq experiments for cell type identification

Abstract Background The advent of single cell RNA sequencing (scRNA-seq) enabled researchers to study transcriptomic activity within individual cells and identify inherent cell types in the sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there...


Bibliographic Details
Main Authors: Douglas Abrams, Parveen Kumar, R. Krishna Murthy Karuturi, Joshy George
Format: Article
Language: English
Published: BMC 2019-06-01
Series: BMC Bioinformatics
Subjects: Single cell RNA-seq; Cell-type identification; Clustering; Experimental design; Analysis design
Online Access: http://link.springer.com/article/10.1186/s12859-019-2817-2
collection DOAJ
description Abstract
Background: The advent of single cell RNA sequencing (scRNA-seq) has enabled researchers to study transcriptomic activity within individual cells and to identify the inherent cell types in a sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there are no published studies or analytical packages available to guide experimental design or to devise a suitable analysis procedure for cell type identification.
Results: We have developed an empirical methodology to address this important gap in single cell experimental design and analysis, and implemented it in an easy-to-use tool called SCEED (Single Cell Empirical Experimental Design and analysis). With SCEED, a user can choose among a variety of combinations of analysis tools, evaluate the performance of analytical procedures and choose the best one, and estimate the sample size (number of cells to be profiled) required for a given analytical procedure at varying levels of cell type rarity and other experimental parameters. Using SCEED, we examined 3 single cell algorithms on 48 simulated single cell datasets generated with varying numbers of cell types and their proportions, numbers of genes expressed per cell, numbers of marker genes and their fold changes, and numbers of single cells successfully profiled in the experiment.
Conclusions: Based on our study, we found that when marker genes are expressed at a fold change of 4 or more, either the Seurat or the SIMLR algorithm can be used to analyze a single cell dataset for any number of single cells isolated (a minimum of 1000 single cells was tested). However, when marker genes are expected to reach a fold change of only up to 2, the choice of single cell algorithm depends on the number of single cells isolated and the rarity of the cell types to be identified. In conclusion, our work allows the assessment of various single cell methods and also aids in the design of single cell experiments.
id doaj.art-079dcdad99db49d695804b5edb78ac7f
institution Directory Open Access Journal
issn 1471-2105
spelling doaj.art-079dcdad99db49d695804b5edb78ac7f
2022-12-21T19:51:18Z
eng
BMC
BMC Bioinformatics, ISSN 1471-2105
2019-06-01, volume 20, issue S11, pages 1-16
10.1186/s12859-019-2817-2
Douglas Abrams (Colby College)
Parveen Kumar (The Jackson Laboratory for Genomic Medicine)
R. Krishna Murthy Karuturi (The Jackson Laboratory for Genomic Medicine)
Joshy George (The Jackson Laboratory for Genomic Medicine)
topic Single cell RNA-seq
Cell-type identification
Clustering
Experimental design
Analysis design