A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
Abstract. Background: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade ad...
Main Authors: | Lu Yang, Jun Chen |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-08-01 |
Series: | Microbiome |
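Subjects: | Microbiome; Metagenomics; Statistical methods; Differential abundance analysis; False discovery rate; Compositional effects |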
Online Access: | https://doi.org/10.1186/s40168-022-01320-0 |
_version_ | 1811340418885877760 |
---|---|
author | Lu Yang; Jun Chen |
author_facet | Lu Yang; Jun Chen |
author_sort | Lu Yang |
collection | DOAJ |
description | Abstract. Background: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing the special characteristics of microbiome data such as zero inflation and compositional effects. Disturbingly, different DAA tools could sometimes produce quite discordant results, opening up the possibility of cherry-picking the tool in favor of one’s own hypothesis. To recommend the best DAA tool or practice to the field, a comprehensive evaluation, which covers as many biologically relevant scenarios as possible, is critically needed. Results: We performed by far the most comprehensive evaluation of existing DAA tools using real data-based simulations. We found that DAA methods explicitly addressing compositional effects, such as ANCOM-BC, ALDEx2, metagenomeSeq (fitFeatureModel), and DACOMP, did have improved performance in false-positive control. But they are still not optimal: type 1 error inflation or low statistical power has been observed in many settings. The recent LDM method generally had the best power, but its false-positive control in the presence of strong compositional effects was not satisfactory. Overall, none of the evaluated methods is simultaneously robust, powerful, and flexible, which makes the selection of the best DAA tool difficult. To meet the analysis needs, we designed an optimized procedure, ZicoSeq, drawing on the strengths of the existing DAA methods. We show that ZicoSeq generally controlled for false positives across settings, and its power was among the highest. Application of DAA methods to a large collection of real datasets revealed a pattern similar to that observed in the simulation studies. Conclusions: Based on the benchmarking study, we conclude that none of the existing DAA methods evaluated can be applied blindly to any real microbiome dataset. The applicability of an existing DAA method depends on specific settings, which are usually unknown a priori. To circumvent the difficulty of selecting the best DAA tool in practice, we designed ZicoSeq, which addresses the major challenges in DAA and remedies the drawbacks of existing DAA methods. ZicoSeq can be applied to microbiome datasets from diverse settings and is a useful DAA tool for robust microbiome biomarker discovery. Video Abstract |
first_indexed | 2024-04-13T18:41:37Z |
format | Article |
id | doaj.art-4c245bb82c744a438f7b3ddcf9ae8a7c |
institution | Directory of Open Access Journals |
issn | 2049-2618 |
language | English |
last_indexed | 2024-04-13T18:41:37Z |
publishDate | 2022-08-01 |
publisher | BMC |
record_format | Article |
series | Microbiome |
spelling | doaj.art-4c245bb82c744a438f7b3ddcf9ae8a7c; 2022-12-22T02:34:42Z; eng; BMC; Microbiome; 2049-2618; 2022-08-01; 101123; 10.1186/s40168-022-01320-0; A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions; Lu Yang (0); Jun Chen (1); Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic; Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic; Abstract. Background: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing the special characteristics of microbiome data such as zero inflation and compositional effects. Disturbingly, different DAA tools could sometimes produce quite discordant results, opening up the possibility of cherry-picking the tool in favor of one’s own hypothesis. To recommend the best DAA tool or practice to the field, a comprehensive evaluation, which covers as many biologically relevant scenarios as possible, is critically needed. Results: We performed by far the most comprehensive evaluation of existing DAA tools using real data-based simulations. We found that DAA methods explicitly addressing compositional effects, such as ANCOM-BC, ALDEx2, metagenomeSeq (fitFeatureModel), and DACOMP, did have improved performance in false-positive control. But they are still not optimal: type 1 error inflation or low statistical power has been observed in many settings. The recent LDM method generally had the best power, but its false-positive control in the presence of strong compositional effects was not satisfactory. Overall, none of the evaluated methods is simultaneously robust, powerful, and flexible, which makes the selection of the best DAA tool difficult. To meet the analysis needs, we designed an optimized procedure, ZicoSeq, drawing on the strengths of the existing DAA methods. We show that ZicoSeq generally controlled for false positives across settings, and its power was among the highest. Application of DAA methods to a large collection of real datasets revealed a pattern similar to that observed in the simulation studies. Conclusions: Based on the benchmarking study, we conclude that none of the existing DAA methods evaluated can be applied blindly to any real microbiome dataset. The applicability of an existing DAA method depends on specific settings, which are usually unknown a priori. To circumvent the difficulty of selecting the best DAA tool in practice, we designed ZicoSeq, which addresses the major challenges in DAA and remedies the drawbacks of existing DAA methods. ZicoSeq can be applied to microbiome datasets from diverse settings and is a useful DAA tool for robust microbiome biomarker discovery. Video Abstract; https://doi.org/10.1186/s40168-022-01320-0; Microbiome; Metagenomics; Statistical methods; Differential abundance analysis; False discovery rate; Compositional effects |
spellingShingle | Lu Yang Jun Chen A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions Microbiome Microbiome Metagenomics Statistical methods Differential abundance analysis False discovery rate Compositional effects |
title | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title_full | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title_fullStr | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title_full_unstemmed | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title_short | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title_sort | comprehensive evaluation of microbial differential abundance analysis methods current status and potential solutions |
topic | Microbiome; Metagenomics; Statistical methods; Differential abundance analysis; False discovery rate; Compositional effects |
url | https://doi.org/10.1186/s40168-022-01320-0 |
work_keys_str_mv | AT luyang acomprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions AT junchen acomprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions AT luyang comprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions AT junchen comprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions |