Repeated holdout validation for weighted quantile sum regression

Bibliographic Details
Main Authors: Eva M. Tanner, Carl-Gustaf Bornehag, Chris Gennings
Format: Article
Language: English
Published: Elsevier 2019-01-01
Series: MethodsX
Online Access: http://www.sciencedirect.com/science/article/pii/S2215016119303103
collection DOAJ
description Weighted Quantile Sum (WQS) regression is a method commonly used in environmental epidemiology to assess the association between chemical mixtures and a health outcome of interest. Data are typically partitioned into a single training set and test set so that the estimated chemical weights are less sample-specific. However, at typical epidemiologic sample sizes, a single partition may produce unstable chemical weights and WQS index estimates, and investigators may resort to training and testing on the same data. To address this problem, we propose repeated holdout validation, in which the data are randomly partitioned 100 times, producing a distribution of validated results. The mean of this distribution is taken as the final estimate, and confidence intervals may also be calculated for inference. Further, this method helps characterize the variability in chemical weights, aiding in the identification of chemicals of concern. This is important since it may direct future research into specific chemicals.
Using data from 718 mother-child pairs in the Swedish Environmental Longitudinal, Mother and Child, Asthma and Allergy (SELMA) study, we assessed the association between prenatal exposure to 26 endocrine disrupting chemicals and child Intelligence Quotient (IQ). Results using a single partition were unstable, varying by random seed. The WQS index estimate was significant when all data were used (i.e., no partition) (β = −2.2; CI: −3.43, −0.98), but attenuated and nonsignificant using repeated holdout validation (β = −0.82; CI: −2.11, 0.45).
When implementing WQS in epidemiologic studies with limited sample sizes, repeated holdout validation is a viable alternative to using a single partition or no partition. Repeated holdout can both stabilize results and help characterize the uncertainty in identifying chemicals of concern, while maintaining some of the rigor of holdout validation.
• Repeated holdout validation improves the stability of WQS estimates in finite study samples
• Uncertainty in identifying toxic chemicals of concern is acknowledged and characterized
Method name: Repeated holdout validation for weighted quantile sum regression
Keywords: Environmental epidemiology, Chemical mixtures, Cross-validation, Bootstrap, Uncertainty plot, Chemical of concern
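The repeated holdout procedure summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The weight-estimation step is a deliberately crude stand-in for WQS (ordinary least squares on quartile scores, keeping only negative-direction coefficients and rescaling them to sum to 1), and the 40/60 training/validation split, the 2.5th and 97.5th percentile interval, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def quantile_scores(X, q=4):
    """Score each exposure column into q quantile bins coded 0..q-1."""
    scores = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], np.linspace(0, 1, q + 1)[1:-1])
        scores[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return scores


def ols(design, y):
    """Least-squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def repeated_holdout(X, y, n_reps=100, train_frac=0.4, seed=0):
    """Repeat a random train/validation split n_reps times.

    Returns one validated index coefficient and one weight vector per split.
    """
    rng = np.random.default_rng(seed)
    Q = quantile_scores(X)
    n, p = Q.shape
    n_train = int(round(train_frac * n))
    betas = np.empty(n_reps)
    weights = np.empty((n_reps, p))
    for r in range(n_reps):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        # Training set: stand-in weight estimation (assumption, not true WQS).
        coef = ols(np.column_stack([np.ones(n_train), Q[train]]), y[train])[1:]
        w = np.clip(-coef, 0.0, None)  # keep negative-direction effects only
        w = w / w.sum() if w.sum() > 0 else np.full(p, 1.0 / p)
        # Validation set: regress the outcome on the weighted index.
        index = Q[valid] @ w
        design = np.column_stack([np.ones(valid.size), index])
        betas[r] = ols(design, y[valid])[1]
        weights[r] = w
    return betas, weights


# Simulated data (hypothetical): only the first exposure affects the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = 100 - 2.0 * X[:, 0] + rng.normal(size=500)

betas, weights = repeated_holdout(X, y)
estimate = betas.mean()                               # final estimate
lower, upper = np.percentile(betas, [2.5, 97.5])      # assumed interval
mean_weights = weights.mean(axis=0)                   # per-chemical weights
print(f"index estimate {estimate:.2f} ({lower:.2f}, {upper:.2f})")
print("mean weights:", np.round(mean_weights, 2))
```

The spread of each column of `weights` across the repetitions is what characterizes the uncertainty in a chemical's weight; a chemical whose weight is high in most splits is a more credible chemical of concern than one that is high in only a few.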
id doaj.art-39412ac45d69499e88da78e601ee513c
institution Directory Open Access Journal
issn 2215-0161
spelling doaj.art-39412ac45d69499e88da78e601ee513c 2022-12-21T23:19:58Z eng Elsevier MethodsX 2215-0161 2019-01-01
Volume 6, pages 2855-2860
Eva M. Tanner (Icahn School of Medicine at Mount Sinai, New York, NY, United States; corresponding author)
Carl-Gustaf Bornehag (Icahn School of Medicine at Mount Sinai, New York, NY, United States; Karlstad University, Karlstad, Sweden)
Chris Gennings (Icahn School of Medicine at Mount Sinai, New York, NY, United States)