Effects of multicollinearity on type I error rate and test power of binary logistic regression model: A simulation study

In this study, the effect of multicollinearity, defined as high correlation among independent variables, on the type I error rate and test power of binary logistic regression models was studied. To do this, one dependent variable coded as 1 and 2 and four continuous independent variables that were randoml...

Bibliographic Details
Main Authors: Yeliz Kasko Arici, Mustafa Muhip Ozkan, Zahide Kocabas
Format: Article
Language: English
Published: Society of Turaz Bilim 2023-12-01
Series: Medicine Science
Subjects: logistic regression; binary response variable; multicollinearity; type I error rate; test power
Online Access: https://www.medicinescience.org/?mno=165383
collection DOAJ
description In this study, the effect of multicollinearity, defined as high correlation among independent variables, on the type I error rate and test power of binary logistic regression models was studied. The constructed binary logistic regression model had one dependent variable coded as 1 and 2 and four continuous independent variables drawn at random from the standard normal distribution. To calculate the type I error rates and test power, the simulation was run 100,000 times. It was repeated for sample sizes of 10, 20, 30 and 40 and for correlation levels of 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90%. To calculate test power, differences (δ) of 0.5, 1.0, 1.5 and 2.0 standard deviation units were created between the population means by adding them to the mean of the second population. The simulation runs showed that increasing multicollinearity among the independent variables had no influence on type I error rates, provided that the sample size was not smaller than 30. The power of binary logistic regression was least affected by increasing multicollinearity when the sample size was 10 and the difference between population means was 0.5δ. Test power declined markedly with rising multicollinearity at all sample sizes, which showed that binary logistic regression was most powerful in the absence of strong multicollinearity among the independent variables when the difference between population means was 1.0δ. The negative impact of rising multicollinearity on test power can be avoided if the number of observations is sufficiently large and the populations are sufficiently well separated from each other. [Med-Science 2023; 12(4): 1180-4]
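The data-generating step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`draw_predictors`, `rejection_rate`), the equal split between the two groups, the shared-factor construction of equally correlated predictors, and the choice to shift every predictor by δ in group 2 are all hypothetical. Fitting the logistic model and obtaining a p-value per replicate is deliberately left out, since the record does not state which test was used.

```python
import math
import random


def draw_predictors(n, rho, delta, n_vars=4, rng=random):
    """Return (y, X) for one simulated sample.

    y holds group labels 1 and 2 (assumed: first half is group 1).
    X holds n rows of n_vars standard-normal predictors whose pairwise
    correlation is rho; group 2 rows are shifted by delta (assumed: the
    shift is applied to every predictor).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1)")
    y, X = [], []
    for i in range(n):
        group = 1 if i < n // 2 else 2
        shift = delta if group == 2 else 0.0
        # One shared factor per row gives unit variance and covariance rho.
        shared = rng.gauss(0.0, 1.0)
        row = [
            math.sqrt(rho) * shared
            + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            + shift
            for _ in range(n_vars)
        ]
        y.append(group)
        X.append(row)
    return y, X


def rejection_rate(p_values, alpha=0.05):
    """Share of replicates with p < alpha.

    With delta = 0 this estimates the type I error rate;
    with delta > 0 it estimates test power.
    """
    return sum(p < alpha for p in p_values) / len(p_values)
```

For example, `draw_predictors(30, 0.6, 1.0)` would produce one sample of size 30 at 60% correlation with a one-standard-deviation separation between the groups. Repeating this over many replicates and passing the resulting p-values to `rejection_rate` gives the empirical error rate or power for that cell of the design.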
first_indexed 2024-03-08T03:20:23Z
id doaj.art-2a8cb7ccfafe4bb6a51c387723609e43
institution Directory Open Access Journal
issn 2147-0634
spelling doaj.art-2a8cb7ccfafe4bb6a51c387723609e43; 2024-02-12T10:32:08Z; eng; Society of Turaz Bilim; Medicine Science; ISSN 2147-0634; 2023-12-01; vol. 12, no. 4, pp. 1180-4; doi:10.5455/medscience.2023.08.146; mno 165383
Yeliz Kasko Arici (Ordu University, Faculty of Medicine, Department of Biostatistics and Medical Informatics)
Mustafa Muhip Ozkan (Ankara University, Department of Biometry and Genetics)
Zahide Kocabas (Ankara University, Department of Biometry and Genetics)
topic logistic regression
binary response variable
multicollinearity
type i error rate
test power