Comparison of the effectiveness of different normalization methods for metagenomic cross-study phenotype prediction under heterogeneity

Abstract: The human microbiome, comprising microorganisms residing within and on the human body, plays a crucial role in various physiological processes and has been linked to numerous diseases. To analyze microbiome data, it is essential to account for inherent heterogeneity and variability across s...

Bibliographic Details
Main Authors:	Beibei Wang, Fengzhu Sun, Yihui Luan
Format:	Article
Language:	English
Published:	Nature Portfolio 2024-03-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-024-57670-2

author	Beibei Wang Fengzhu Sun Yihui Luan
collection	DOAJ
description	Abstract: The human microbiome, comprising microorganisms residing within and on the human body, plays a crucial role in various physiological processes and has been linked to numerous diseases. To analyze microbiome data, it is essential to account for inherent heterogeneity and variability across samples. Normalization methods have been proposed to mitigate these variations and enhance comparability. However, the performance of these methods in predicting binary phenotypes remains understudied. This study systematically evaluates different normalization methods in microbiome data analysis and their impact on disease prediction. Our findings highlight the strengths and limitations of scaling, compositional data analysis, transformation, and batch correction methods. Scaling methods like TMM show consistent performance, while compositional data analysis methods exhibit mixed results. Transformation methods, such as Blom and NPN, demonstrate promise in capturing complex associations. Batch correction methods, including BMC and Limma, consistently outperform other approaches. However, the influence of normalization methods is constrained by population effects, disease effects, and batch effects. These results provide insights for selecting appropriate normalization approaches in microbiome research, improving predictive models, and advancing personalized medicine. Future research should explore larger and more diverse datasets and develop tailored normalization strategies for microbiome data analysis.
first_indexed	2024-04-24T16:18:21Z
format	Article
id	doaj.art-18a0be5393cb4320a19c650232edd13b
institution	Directory Open Access Journal
issn	2045-2322
language	English
last_indexed	2024-04-24T16:18:21Z
publishDate	2024-03-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj.art-18a0be5393cb4320a19c650232edd13b; 2024-03-31T11:20:00Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-03-01; volume 14, issue 1, pages 1-16; 10.1038/s41598-024-57670-2; Comparison of the effectiveness of different normalization methods for metagenomic cross-study phenotype prediction under heterogeneity; Beibei Wang (0), Fengzhu Sun (1), Yihui Luan (2); (0) Frontier Science Center for Nonlinear Expectations, Ministry of Education; (1) Quantitative and Computational Biology Department, University of Southern California; (2) Frontier Science Center for Nonlinear Expectations, Ministry of Education; https://doi.org/10.1038/s41598-024-57670-2
title	Comparison of the effectiveness of different normalization methods for metagenomic cross-study phenotype prediction under heterogeneity
url	https://doi.org/10.1038/s41598-024-57670-2

Similar Items