Adjustment for baseline characteristics in randomized trials using logistic regression: sample-based model versus true model

Bibliographic Details
Main Authors:	Thomas Perneger, Christophe Combescure, Antoine Poncet
Format:	Article
Language:	English
Published:	BMC 2023-02-01
Series:	Trials
Subjects:	Randomized clinical trials; Baseline imbalance; Statistical adjustment; Over-fitting; Simulation study
Online Access:	https://doi.org/10.1186/s13063-022-07053-7

_version_	1797863629579616256
author	Thomas Perneger, Christophe Combescure, Antoine Poncet
author_sort	Thomas Perneger
collection	DOAJ
description	Background: Adjustment for baseline prognostic factors in randomized clinical trials is usually performed by means of sample-based regression models. Sample-based models may be incorrect due to overfitting. To assess whether overfitting is a problem in practice, we used simulated data to examine the performance of the sample-based model in comparison to a “true” adjustment model, in terms of estimation of the treatment effect. Methods: We conducted a simulation study using samples drawn from a “population” in which both the treatment effect and the effect of the potential confounder were specified. The outcome variable was binary. Using logistic regression, we compared three estimates of the treatment effect in each situation: unadjusted, adjusted for the confounder using the sample, and adjusted for the confounder using the true effect. Experimental factors were sample size (from 2 × 50 to 2 × 1000), treatment effect (logit of 0, 0.5, or 1.0), confounder type (continuous or binary), and confounder effect (logit of 0, −0.5, or −1.0). The assessment criteria for the estimated treatment effect were bias, variance, precision (proportion of estimates within 0.1 logit units), type 1 error, and power. Results: Sample-based adjustment models yielded more biased estimates of the treatment effect than adjustment models that used the true confounder effect but had similar variance, accuracy, power, and type 1 error rates. The simulation also confirmed the conservative bias of unadjusted analyses due to the non-collapsibility of the odds ratio, the smaller variance of unadjusted estimates, and the bias of the odds ratio away from the null hypothesis in small datasets. Conclusions: Sample-based adjustment yields similar results to exact adjustment in estimating the treatment effect. Sample-based adjustment is preferable to no adjustment.
first_indexed	2024-04-09T22:39:42Z
format	Article
id	doaj.art-c16110b37a8d4372b10ba6143827999f
institution	Directory Open Access Journal
issn	1745-6215
language	English
last_indexed	2024-04-09T22:39:42Z
publishDate	2023-02-01
publisher	BMC
record_format	Article
series	Trials
spelling	doaj.art-c16110b37a8d4372b10ba6143827999f | 2023-03-22T12:17:56Z | eng | BMC | Trials | 1745-6215 | 2023-02-01 | 24119 | 10.1186/s13063-022-07053-7 | Adjustment for baseline characteristics in randomized trials using logistic regression: sample-based model versus true model | Thomas Perneger (0); Christophe Combescure (1); Antoine Poncet (2) | (0) Division of Clinical Epidemiology, University of Geneva and Geneva University Hospitals; (1) Division of Clinical Epidemiology, University of Geneva and Geneva University Hospitals; (2) Division of Clinical Epidemiology, University of Geneva and Geneva University Hospitals | https://doi.org/10.1186/s13063-022-07053-7 | Randomized clinical trials; Baseline imbalance; Statistical adjustment; Over-fitting; Simulation study
title	Adjustment for baseline characteristics in randomized trials using logistic regression: sample-based model versus true model
title_sort	adjustment for baseline characteristics in randomized trials using logistic regression sample based model versus true model
topic	Randomized clinical trials; Baseline imbalance; Statistical adjustment; Over-fitting; Simulation study
url	https://doi.org/10.1186/s13063-022-07053-7
work_keys_str_mv	AT thomasperneger adjustmentforbaselinecharacteristicsinrandomizedtrialsusinglogisticregressionsamplebasedmodelversustruemodel AT christophecombescure adjustmentforbaselinecharacteristicsinrandomizedtrialsusinglogisticregressionsamplebasedmodelversustruemodel AT antoineponcet adjustmentforbaselinecharacteristicsinrandomizedtrialsusinglogisticregressionsamplebasedmodelversustruemodel
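The three estimates compared in the abstract (unadjusted, sample-adjusted, and adjusted with the true confounder effect) can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a standard normal confounder, an intercept of 0, 1:1 allocation, and a "true" adjustment implemented as a fixed offset equal to the true confounder coefficient times the confounder value; all function names are hypothetical.

```python
import math
import random
import statistics


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, y, offset=None, iters=25):
    """Maximum-likelihood logistic regression by Newton-Raphson.

    rows: covariate lists (the caller supplies the intercept column).
    offset: optional fixed term in the linear predictor (not estimated).
    """
    p = len(rows[0])
    offset = offset or [0.0] * len(y)
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]  # X'WX, the information matrix
        for x, yi, off in zip(rows, y, offset):
            eta = off + sum(b * v for b, v in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * x[j]
                for k in range(p):
                    info[j][k] += w * x[j] * x[k]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta


def simulate_trial(n_per_arm, beta_treat, beta_conf, rng, intercept=0.0):
    """Draw one two-arm trial; intercept and N(0, 1) confounder are assumptions."""
    treat = [0] * n_per_arm + [1] * n_per_arm
    conf = [rng.gauss(0.0, 1.0) for _ in treat]
    y = []
    for t, c in zip(treat, conf):
        p = 1.0 / (1.0 + math.exp(-(intercept + beta_treat * t + beta_conf * c)))
        y.append(1 if rng.random() < p else 0)
    return treat, conf, y


def three_estimates(treat, conf, y, beta_conf):
    """Treatment log odds ratio: unadjusted, sample-adjusted, true-adjusted."""
    base = [[1.0, t] for t in treat]
    unadjusted = fit_logistic(base, y)[1]
    sample_adj = fit_logistic([[1.0, t, c] for t, c in zip(treat, conf)], y)[1]
    # Confounder coefficient held at its true value via an offset (assumed reading).
    true_adj = fit_logistic(base, y, offset=[beta_conf * c for c in conf])[1]
    return unadjusted, sample_adj, true_adj


def run_simulation(n_reps, n_per_arm, beta_treat, beta_conf, seed=1):
    """Mean bias (estimate minus true treatment effect) for each approach."""
    rng = random.Random(seed)
    draws = [three_estimates(*simulate_trial(n_per_arm, beta_treat, beta_conf, rng), beta_conf)
             for _ in range(n_reps)]
    names = ("unadjusted", "sample_adjusted", "true_adjusted")
    return {name: statistics.mean(d[i] for d in draws) - beta_treat
            for i, name in enumerate(names)}


if __name__ == "__main__":
    for name, bias in run_simulation(20, 100, 0.5, -1.0).items():
        print(f"{name}: mean bias {bias:+.3f} logit units")
```

With only 20 replicates the printed biases are noisy; a comparison along the lines of the study would need many more replicates per scenario and would also track variance, type 1 error, and power.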
