A Multilevel Bayesian Approach to Improve Effect Size Estimation in Regression Modeling of Metabolomics Data Utilizing Imputation with Uncertainty

To ensure scientific reproducibility of metabolomics data, alternative statistical methods are needed. A paradigm shift away from the p-value toward embracing uncertainty and interval estimation of a metabolite’s true effect size may lead to improved study design and great...

Bibliographic Details
Main Authors: Christopher E. Gillies, Theodore S. Jennaro, Michael A. Puskarich, Ruchi Sharma, Kevin R. Ward, Xudong Fan, Alan E. Jones, Kathleen A. Stringer
Format: Article
Language: English
Published: MDPI AG 2020-08-01
Series: Metabolites
Subjects:
Online Access: https://www.mdpi.com/2218-1989/10/8/319
collection DOAJ
description To ensure scientific reproducibility of metabolomics data, alternative statistical methods are needed. A paradigm shift away from the p-value toward embracing uncertainty and interval estimation of a metabolite’s true effect size may lead to improved study design and greater reproducibility. Multilevel Bayesian models are one approach that offers the added opportunity of incorporating imputed value uncertainty when missing data are present. We designed simulations of metabolomics data to compare multilevel Bayesian models to standard logistic regression with corrections for multiple hypothesis testing. Our simulations varied the sample size and the fraction of metabolites that truly differed between two outcome groups. We then introduced missingness to further assess model performance. Across simulations, the multilevel Bayesian approach more accurately estimated the effect size of metabolites that were significantly different between groups. Bayesian models also had greater power and mitigated the false discovery rate. In the presence of increased missing data, Bayesian models were able to accurately impute the true concentration, and incorporating the uncertainty of these estimates improved overall prediction. In summary, our simulations demonstrate that a multilevel Bayesian approach accurately quantifies the estimated effect size of metabolite predictors in regression modeling, particularly in the presence of missing data.
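The description compares the Bayesian models against logistic regression "with corrections for multiple hypothesis testing" but does not name the correction used. As a hedged illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, one common way to control the false discovery rate across many per-metabolite tests. The function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch and are not taken from the article.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure;
    not necessarily the correction used in the article.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k (step-up rule).
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, one per metabolite.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a higher-ranked threshold. A frequentist pipeline of this kind filters on significance first, which is the setting where the description reports less accurate effect size estimates than the multilevel Bayesian approach.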
id doaj.art-7bd5a4ac2a32402ea6913bafb9d4eb74
institution Directory Open Access Journal
issn 2218-1989
doi 10.3390/metabo10080319
volume 10
issue 8
article_number 319
author_affiliations Christopher E. Gillies: Department of Emergency Medicine, University of Michigan, Ann Arbor, MI 48109, USA
Theodore S. Jennaro: Department of Clinical Pharmacy, College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, USA
Michael A. Puskarich: Department of Emergency Medicine, University of Minnesota, Minneapolis, MN 55455, USA
Ruchi Sharma: Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Kevin R. Ward: Department of Emergency Medicine, University of Michigan, Ann Arbor, MI 48109, USA
Xudong Fan: Michigan Center for Integrative Research in Critical Care (MCIRCC), University of Michigan, Ann Arbor, MI 48109, USA
Alan E. Jones: Department of Emergency Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
Kathleen A. Stringer: Michigan Center for Integrative Research in Critical Care (MCIRCC), University of Michigan, Ann Arbor, MI 48109, USA
topic hierarchical modeling
nuclear magnetic resonance spectroscopy
Bayesian statistics
missing values
imputation
multiple test corrections