Using model selection algorithms to obtain reliable coefficient estimates.

Bibliographic Details
Main Authors: Castle, J, Qin, X, Reed, W
Format: Journal article
Language: English
Published: Blackwell Publishing 2011
Description
Summary: This review surveys a number of common model selection algorithms (MSAs), discusses how they relate to each other and identifies factors that explain their relative performances. At the heart of MSA performance is the trade-off between type I and type II errors. Some relevant variables will be mistakenly excluded, and some irrelevant variables will be retained by chance. A successful MSA will find the optimal trade-off between the two types of errors for a given data environment. Whether a given MSA will be successful in a given environment depends on the relative costs of these two types of errors. We use Monte Carlo experimentation to illustrate these issues. We confirm that no MSA does best in all circumstances. Even the worst MSA in terms of overall performance – the strategy of including all candidate variables – sometimes performs best (viz., when all candidate variables are relevant). We also show how (1) the ratio of relevant to total candidate variables and (2) data-generating process noise affect relative MSA performance. Finally, we discuss a number of issues complicating the task of MSAs in producing reliable coefficient estimates.
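
The kind of Monte Carlo comparison the summary describes can be illustrated with a minimal sketch. This is not the authors' experimental design: the two strategies compared (keeping all candidate variables versus dropping those with |t| below a threshold and refitting), the function name `simulate`, the unit coefficients on relevant variables, and all default parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate(n_obs=100, n_candidates=10, n_relevant=3, noise_sd=1.0,
             t_crit=2.0, n_reps=500, seed=0):
    """Compare coefficient MSE of two hypothetical strategies.

    Strategy A keeps all candidate variables (plain OLS).
    Strategy B drops variables with |t| <= t_crit and refits OLS.
    Returns (mse_all, mse_selected), averaged over replications.
    """
    rng = np.random.default_rng(seed)
    # Assumed DGP: the first n_relevant coefficients are 1, the rest are 0.
    beta = np.zeros(n_candidates)
    beta[:n_relevant] = 1.0

    mse_all = 0.0
    mse_sel = 0.0
    for _ in range(n_reps):
        X = rng.standard_normal((n_obs, n_candidates))
        y = X @ beta + noise_sd * rng.standard_normal(n_obs)

        # Strategy A: OLS on every candidate variable.
        xtx_inv = np.linalg.inv(X.T @ X)
        b_all = xtx_inv @ X.T @ y

        # t-statistics from the full model drive the selection step.
        resid = y - X @ b_all
        sigma2 = resid @ resid / (n_obs - n_candidates)
        se = np.sqrt(sigma2 * np.diag(xtx_inv))
        keep = np.abs(b_all / se) > t_crit

        # Strategy B: refit on retained variables; dropped ones are set to 0.
        b_sel = np.zeros(n_candidates)
        if keep.any():
            b_sel[keep] = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]

        mse_all += np.mean((b_all - beta) ** 2)
        mse_sel += np.mean((b_sel - beta) ** 2)

    return mse_all / n_reps, mse_sel / n_reps
```

Calling `simulate` with different values of `n_relevant` and `noise_sd` varies the two factors the summary names: the ratio of relevant to total candidate variables, and the data-generating process noise. In this sketch, excluding a relevant variable (a type II error) biases its estimate to zero, while retaining an irrelevant one (a type I error) adds estimation noise, so which strategy yields the lower MSE depends on the settings chosen.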