Augmented backward elimination: a pragmatic and purposeful way to develop statistical models.

Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation statistical analysis often involves a large number of potential explanatory variables and frequently only part...


Bibliographic Details
Main Authors: Daniela Dunkler, Max Plischke, Karen Leffondré, Georg Heinze
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4240713?pdf=render
Collection: DOAJ
Description: Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation, statistical analysis often involves a large number of potential explanatory variables, and frequently only partial subject-matter knowledge is available. Therefore, selecting the most suitable variables for a model in an objective and practical manner is usually a non-trivial task. We briefly revisit the purposeful variable selection procedure suggested by Hosmer and Lemeshow, which combines significance and change-in-estimate criteria for variable selection, and critically discuss the change-in-estimate criterion. We show that using a significance-based threshold for the change-in-estimate criterion reduces to a simple significance-based selection of variables, as if the change-in-estimate criterion were not considered at all. Various extensions to the purposeful variable selection procedure are suggested. We propose to use backward elimination augmented with a standardized change-in-estimate criterion on the quantity of interest usually reported and interpreted in a model for variable selection. Augmented backward elimination has been implemented in a SAS macro for linear, logistic and Cox proportional hazards regression. The algorithm and its implementation were evaluated by means of a simulation study. Augmented backward elimination tends to select larger models than backward elimination and approximates the unselected model up to negligible differences in point estimates of the regression coefficients. On average, regression coefficients obtained after applying augmented backward elimination were less biased relative to the coefficients of correctly specified models than after backward elimination. In summary, we propose augmented backward elimination as a reproducible variable selection algorithm that gives the analyst more flexibility in adapting model selection to a specific statistical modeling situation.
Record ID: doaj.art-39e2bd43cc6d4543803f81a05917edcb
Institution: Directory of Open Access Journals
ISSN: 1932-6203
Volume: 9
Issue: 11
Article Number: e113677
DOI: 10.1371/journal.pone.0113677