Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST

Abstract: Spatially resolved transcriptomics involves a set of emerging technologies that enable the transcriptomic profiling of tissues with the physical location of expressions. Although a variety of methods have been developed for data integration, most of them are for single-cell RNA-seq datasets...

Bibliographic Details
Main Authors: Wei Liu, Xu Liao, Ziye Luo, Yi Yang, Mai Chan Lau, Yuling Jiao, Xingjie Shi, Weiwei Zhai, Hongkai Ji, Joe Yeong, Jin Liu
Format: Article
Language: English
Published: Nature Portfolio 2023-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-023-35947-w
author Wei Liu
Xu Liao
Ziye Luo
Yi Yang
Mai Chan Lau
Yuling Jiao
Xingjie Shi
Weiwei Zhai
Hongkai Ji
Joe Yeong
Jin Liu
collection DOAJ
description Abstract: Spatially resolved transcriptomics involves a set of emerging technologies that enable the transcriptomic profiling of tissues with the physical location of expressions. Although a variety of methods have been developed for data integration, most of them are for single-cell RNA-seq datasets without consideration of spatial information. Thus, methods that can integrate spatial transcriptomics data from multiple tissue slides, possibly from multiple individuals, are needed. Here, we present PRECAST, a data integration method for multiple spatial transcriptomics datasets with complex batch effects and/or biological effects between slides. PRECAST unifies spatial factor analysis simultaneously with spatial clustering and embedding alignment, while requiring only partially shared cell/domain clusters across datasets. Using both simulated and four real datasets, we show improved cell/domain detection with outstanding visualization, and the estimated aligned embeddings and cell/domain labels facilitate many downstream analyses. We demonstrate that PRECAST is computationally scalable and applicable to spatial transcriptomics datasets from different platforms.
first_indexed 2024-03-10T17:29:53Z
format Article
id doaj.art-9abd87ac205140bb9a3046b91896c166
institution Directory Open Access Journal
issn 2041-1723
language English
last_indexed 2024-03-10T17:29:53Z
publishDate 2023-01-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj.art-9abd87ac205140bb9a3046b91896c166
2023-11-20T10:04:41Z
eng
Nature Portfolio
Nature Communications
2041-1723
2023-01-01
14 1 1 18
10.1038/s41467-023-35947-w
Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST
Wei Liu (0): Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School
Xu Liao (1): Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School
Ziye Luo (2): Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School
Yi Yang (3): Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School
Mai Chan Lau (4): Institute of Molecular and Cell Biology (IMCB), Agency of Science, Technology and Research (A*STAR)
Yuling Jiao (5): School of Mathematics and Statistics, Wuhan University
Xingjie Shi (6): Academy of Statistics and Interdisciplinary Sciences, East China Normal University
Weiwei Zhai (7): Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences
Hongkai Ji (8): Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
Joe Yeong (9): Institute of Molecular and Cell Biology (IMCB), Agency of Science, Technology and Research (A*STAR)
Jin Liu (10): Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School
https://doi.org/10.1038/s41467-023-35947-w
title Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST
url https://doi.org/10.1038/s41467-023-35947-w