Modified method for removing multicollinearity problem in multiple regression model

Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. An insignificant variable that does not contribute to a model may also affect the interpretation...
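
The effect on standard errors can be illustrated with a small sketch. It is not taken from the thesis: the synthetic data, the function names and the noise level 0.1 are all assumptions made for illustration. For a model with exactly two predictors, the variance inflation factor (VIF) is 1 / (1 - r^2), where r is the Pearson correlation between the predictors, and the standard error of each coefficient grows by a factor of sqrt(VIF).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Synthetic data (hypothetical, for illustration only).
rng = random.Random(0)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2_unrelated = [rng.gauss(0, 1) for _ in range(n)]      # drawn independently of x1
x2_collinear = [a + 0.1 * rng.gauss(0, 1) for a in x1]  # nearly a copy of x1

vif_unrelated = vif_two_predictors(x1, x2_unrelated)
vif_collinear = vif_two_predictors(x1, x2_collinear)
```

With unrelated predictors the VIF stays close to 1, while the nearly duplicated predictor pushes it far above the commonly quoted rule-of-thumb threshold of 10.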


Bibliographic Details
Main Author: Yap, Sue Jinq
Format: Thesis
Language: English
Published: 2014
Subjects: QA273-280 Probabilities. Mathematical statistics
Online Access: https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf
https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf
author Yap, Sue Jinq
collection UMS
description Multicollinearity occurs when two or more independent variables in a multiple regression model are highly correlated. It inflates the standard errors, so the coefficients cannot be estimated accurately. An insignificant variable that does not contribute to a model may also affect the interpretation of the data. The key objective of this work is therefore to develop a best model that is free from both multicollinearity and insignificant variables. The data set originally contains 25 variables. Using factor analysis, correlation coefficient values and dummy transformation, the following variables are identified: body weight as the dependent variable; chest diameter, shoulder girth, chest girth, bicep girth, forearm girth and wrist girth, each as a single quantitative independent variable; and ankle diameter, biacromial diameter, elbow diameter, wrist diameter and gender, each as a dummy variable. The interaction variables considered go up to the fifth order (a product of six variables). Variables that are weakly correlated with the dependent variable are not removed but are transformed into dummy variables. This work also examines the significance of interaction variables and of variables that are weakly correlated with the dependent variable. Applying the concept of backward elimination, multicollinearity and coefficient tests are used to discard variables systematically from each of the possible models. Multicollinearity source variables are removed using a modified version of the Zainodin-Noraini multicollinearity remedial method. Finally, a best model is obtained that is free from multicollinearity and insignificant variables.
Interaction variables are found to play an important role: the best model consists of two single quantitative independent variables (chest diameter and forearm girth), four first-order interaction variables (chest girth with wrist girth, and bicep girth with each of biacromial diameter, ankle diameter and gender) and one second-order interaction variable (chest girth, chest diameter and shoulder girth). The highest interaction order in the best model is therefore the second order. The variables that are weakly correlated with the dependent variable (biacromial diameter, ankle diameter and gender) are found to be significant and appear in the best model as interaction variables, each paired with bicep girth. The results of this work thus suggest a suitable procedure for researchers dealing with a large number of independent variables.
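
The order convention used in the description (a first-order interaction is a product of two variables, a fifth-order interaction is a product of six) can be sketched as follows. This is only an illustration under stated assumptions: it enumerates products of the six single quantitative variables alone, whereas the thesis also forms interactions with dummy variables, and the identifiers (`QUANTITATIVE`, `interaction_terms`) are hypothetical rather than taken from the thesis.

```python
from itertools import combinations

# The six single quantitative independent variables named in the description.
QUANTITATIVE = [
    "chest_diameter",
    "shoulder_girth",
    "chest_girth",
    "bicep_girth",
    "forearm_girth",
    "wrist_girth",
]


def interaction_terms(names):
    """Map interaction order to the variable tuples of that order.

    Order = (number of variables in the product) - 1, so a pair is
    first-order and a product of all six variables is fifth-order.
    """
    return {
        size - 1: list(combinations(names, size))
        for size in range(2, len(names) + 1)
    }


terms = interaction_terms(QUANTITATIVE)
counts = {order: len(group) for order, group in terms.items()}
total_interactions = sum(counts.values())
```

Under these assumptions there are 15, 20, 15, 6 and 1 candidate terms of orders one to five, 57 in total, which suggests why variables have to be discarded systematically rather than by inspection.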
first_indexed 2024-12-09T00:51:58Z
format Thesis
id ums.eprints-41239
institution Universiti Malaysia Sabah
language English
last_indexed 2024-12-09T00:51:58Z
publishDate 2014
record_format dspace
spelling ums.eprints-41239 2024-10-18T07:17:12Z https://eprints.ums.edu.my/id/eprint/41239/ Modified method for removing multicollinearity problem in multiple regression model Yap, Sue Jinq QA273-280 Probabilities. Mathematical statistics 2014 Thesis NonPeerReviewed text en https://eprints.ums.edu.my/id/eprint/41239/1/24%20PAGES.pdf text en https://eprints.ums.edu.my/id/eprint/41239/2/FULLTEXT.pdf Yap, Sue Jinq (2014) Modified method for removing multicollinearity problem in multiple regression model. Masters thesis, Universiti Malaysia Sabah.
title Modified method for removing multicollinearity problem in multiple regression model
topic QA273-280 Probabilities. Mathematical statistics