Model Selection in Generalized Linear Models

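The problem of model selection in regression analysis through the use of forward selection, backward elimination, and stepwise selection has been well explored in the literature. The main assumption in this, of course, is that the data are normally distributed and the main tool used here is either a t test or an F test. However, the properties of these model selection procedures are not well-known. The purpose of this paper is to study the properties of these procedures within generalized linear regression models, considering the normal linear regression model as a special case. The main tool that is being used is the score test. However, the F test and other large sample tests, such as the likelihood ratio and the Wald test, the AIC, and the BIC, are included for the comparison. A systematic study, through simulations, of the properties of this procedure was conducted, in terms of level and power, for symmetric and asymmetric distributions, such as normal, Poisson, and binomial regression models. Extensions for skewed distributions, over-dispersed Poisson (the negative binomial), and over-dispersed binomial (the beta-binomial) regression models, are also given and evaluated. The methods are applied to analyze two health datasets.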

Bibliographic Details
Main Authors: Abdulla Mamun, Sudhir Paul
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Symmetry
Subjects: generalized linear model; over-dispersion; score test; Wald test; likelihood ratio test
Online Access: https://www.mdpi.com/2073-8994/15/10/1905
author Abdulla Mamun
Sudhir Paul
collection DOAJ
description The problem of model selection in regression analysis through the use of forward selection, backward elimination, and stepwise selection has been well explored in the literature. The main assumption in this, of course, is that the data are normally distributed and the main tool used here is either a t test or an F test. However, the properties of these model selection procedures are not well-known. The purpose of this paper is to study the properties of these procedures within generalized linear regression models, considering the normal linear regression model as a special case. The main tool that is being used is the score test. However, the F test and other large sample tests, such as the likelihood ratio and the Wald test, the AIC, and the BIC, are included for the comparison. A systematic study, through simulations, of the properties of this procedure was conducted, in terms of level and power, for symmetric and asymmetric distributions, such as normal, Poisson, and binomial regression models. Extensions for skewed distributions, over-dispersed Poisson (the negative binomial), and over-dispersed binomial (the beta-binomial) regression models, are also given and evaluated. The methods are applied to analyze two health datasets.
first_indexed 2024-03-10T20:51:43Z
format Article
id doaj.art-50b013a33715447e8f3188934cd39f41
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-10T20:51:43Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-50b013a33715447e8f3188934cd39f41 | 2023-11-19T18:18:31Z | eng | MDPI AG | Symmetry | 2073-8994 | 2023-10-01 | 15 | 10 | 1905 | 10.3390/sym15101905 | Model Selection in Generalized Linear Models | Abdulla Mamun 0 | Sudhir Paul 1 | Department of Mathematics, Gonzaga University, Spokane, WA 99258-0102, USA | Department of Mathematics and Statistics, University of Windsor, Windsor, ON N9B 3P4, Canada | The problem of model selection in regression analysis through the use of forward selection, backward elimination, and stepwise selection has been well explored in the literature. The main assumption in this, of course, is that the data are normally distributed and the main tool used here is either a t test or an F test. However, the properties of these model selection procedures are not well-known. The purpose of this paper is to study the properties of these procedures within generalized linear regression models, considering the normal linear regression model as a special case. The main tool that is being used is the score test. However, the F test and other large sample tests, such as the likelihood ratio and the Wald test, the AIC, and the BIC, are included for the comparison. A systematic study, through simulations, of the properties of this procedure was conducted, in terms of level and power, for symmetric and asymmetric distributions, such as normal, Poisson, and binomial regression models. Extensions for skewed distributions, over-dispersed Poisson (the negative binomial), and over-dispersed binomial (the beta-binomial) regression models, are also given and evaluated. The methods are applied to analyze two health datasets. | https://www.mdpi.com/2073-8994/15/10/1905 | generalized linear model | over-dispersion | score test | Wald test | likelihood ratio test
title Model Selection in Generalized Linear Models
topic generalized linear model
over-dispersion
score test
Wald test
likelihood ratio test
url https://www.mdpi.com/2073-8994/15/10/1905