Summary: | Background: The human microbiome can contribute to the pathogenesis of many complex diseases by mediating causal pathways that lead to disease. However, standard mediation analysis methods are not adequate for analyzing the microbiome as a mediator, because the data contain an excessive number of zero-valued sequencing reads and the relative abundances are constrained to sum to one. The zero-inflated data structure raises two main challenges: (a) disentangling the mediation effect induced by the point mass at zero; and (b) identifying which observed zeros are not true zeros (i.e., false zeros). Methods: We develop a novel marginal mediation analysis method under the potential-outcomes framework to address these issues. We also show that the marginal model can account for the compositional structure of microbiome data. Results: The mediation effect can be decomposed into two components that are inherent to the two-part nature of zero-inflated distributions. We also address the challenge of false zeros by using probabilistic models for the mechanism by which zeros are observed. A comprehensive simulation study and an application to a real microbiome study compare our approach with existing approaches. Conclusions: When zero-inflated microbiome compositions are analyzed as mediators, the MarZIC approach performs better than standard causal mediation analysis approaches and an existing competing approach.
|