Not every estimate counts – evaluation of cell composition estimation approaches in brain bulk tissue data

Abstract Background Variation in cell composition can dramatically impact analyses in bulk tissue samples. A commonly employed approach to mitigate this issue is to adjust statistical models using estimates of cell abundance derived directly from omics data. While an arsenal of estimation methods ex...


Bibliographic Details
Main Authors: Lilah Toker, Gonzalo S. Nido, Charalampos Tzoulis
Format: Article
Language: English
Published: BMC 2023-06-01
Series: Genome Medicine
Subjects: Cell composition, Neurodegeneration, Omics, Deconvolution, Brain, Bulk tissue
Online Access: https://doi.org/10.1186/s13073-023-01195-2
_version_ 1827928565911388160
author Lilah Toker
Gonzalo S. Nido
Charalampos Tzoulis
collection DOAJ
description Abstract. Background: Variation in cell composition can dramatically impact analyses in bulk tissue samples. A commonly employed approach to mitigate this issue is to adjust statistical models using estimates of cell abundance derived directly from omics data. While an arsenal of estimation methods exists, the applicability of these methods to brain tissue data, and whether cell estimates can sufficiently account for confounding cellular composition, have not been adequately assessed. Methods: We assessed the correspondence between different estimation methods based on transcriptomic (RNA sequencing, RNA-seq) and epigenomic (DNA methylation and histone acetylation) data from brain tissue samples of 49 individuals. We further evaluated the impact of different estimation approaches on the analysis of H3K27 acetylation chromatin immunoprecipitation sequencing (ChIP-seq) data from the entorhinal cortex of individuals with Alzheimer’s disease and controls. Results: We show that even closely adjacent tissue samples from the same Brodmann area vary greatly in their cell composition. Comparison across estimation methods indicates that while different methods applied to the same data produce highly similar outcomes, there is a surprisingly low concordance between estimates based on different omics data modalities. Alarmingly, we show that cell type estimates may not always sufficiently account for confounding variation in cell composition. Conclusions: Our work indicates that cell composition estimation or direct quantification in one tissue sample should not be used as a proxy for the cellular composition of another tissue sample from the same brain region of an individual, even if the samples are directly adjacent. The highly similar outcomes observed among vastly different estimation methods highlight the need for brain benchmark datasets and better validation approaches. Finally, unless validated through complementary experiments, the interpretation of analysis outcomes based on data confounded by cell composition should be done with great caution, and ideally avoided altogether.
first_indexed 2024-03-13T06:09:29Z
format Article
id doaj.art-87c277d2929745d1b8453c534ea2d91b
institution Directory Open Access Journal
issn 1756-994X
language English
last_indexed 2024-03-13T06:09:29Z
publishDate 2023-06-01
publisher BMC
record_format Article
series Genome Medicine
spelling doaj.art-87c277d2929745d1b8453c534ea2d91b
2023-06-11T11:21:19Z
eng
BMC
Genome Medicine
1756-994X
2023-06-01
Volume 15, issue 1, pages 1-14
10.1186/s13073-023-01195-2
Not every estimate counts – evaluation of cell composition estimation approaches in brain bulk tissue data
Lilah Toker (affiliation 0)
Gonzalo S. Nido (affiliation 1)
Charalampos Tzoulis (affiliation 2)
Affiliation 0: Neuro-SysMed Center of Excellence, Department of Neurology, Department of Clinical Medicine, Haukeland University Hospital, University of Bergen
Affiliation 1: Neuro-SysMed Center of Excellence, Department of Neurology, Department of Clinical Medicine, Haukeland University Hospital, University of Bergen
Affiliation 2: Neuro-SysMed Center of Excellence, Department of Neurology, Department of Clinical Medicine, Haukeland University Hospital, University of Bergen
https://doi.org/10.1186/s13073-023-01195-2
Cell composition
Neurodegeneration
Omics
Deconvolution
Brain
Bulk tissue
title Not every estimate counts – evaluation of cell composition estimation approaches in brain bulk tissue data
topic Cell composition
Neurodegeneration
Omics
Deconvolution
Brain
Bulk tissue
url https://doi.org/10.1186/s13073-023-01195-2