Measuring the relationship of bivariate data using Hodges-Lehman estimator

The relationship of bivariate data is ordinarily measured using a correlation coefficient. The most commonly used correlation coefficient is the Pearson correlation coefficient. This coefficient is well known as the best coefficient for interval or ratio bivariate data with a linear relationship. Even th...

Bibliographic Details
Main Authors: Abdullah, Suhaida, Zakaria, Nur Amira, Ahad, Nor Aishah, Yusof, Norhayati, Syed Yahaya, Sharipah Soaad
Format: Article
Language: English
Published: 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/26970/1/ASM%20ScJ%2013%202020%201%205.pdf
collection UUM
description The relationship of bivariate data is ordinarily measured using a correlation coefficient. The most commonly used correlation coefficient is the Pearson correlation coefficient, which is well known as the best coefficient for interval or ratio bivariate data with a linear relationship. Although this coefficient performs well under that condition, it is very sensitive to small departures from linearity, usually caused by the presence of outliers. For that reason, this paper proposes new robust correlation coefficients that combine the nonparametric Hodges-Lehmann estimator with the parametric approach of the Pearson correlation coefficient. The paper also introduces two different scale estimators, the median and the median absolute deviation (MADn); the resulting coefficients are denoted rHL(med) and rHL(MADn), respectively. The performance of the proposed correlation coefficients is measured by their coefficient values, which are compared with those of the Pearson correlation coefficient and several existing robust correlation coefficients. The results show that the Pearson correlation coefficient (r) is undoubtedly very good under perfect data conditions, but with only 10% outliers it not only gives a poor correlation value but also turns the direction of the relationship negative. In contrast, rHL(med) and rHL(MADn) offer the highest coefficient values, and these values are robust to the presence of up to 30% outliers. With very good performance under all data conditions and a simple calculation, rHL(med) and rHL(MADn) are considered good alternatives to r when dealing with outliers.
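The abstract names two robust building blocks, the Hodges-Lehmann estimator and MADn, without giving formulas. The sketch below shows only their textbook definitions: the Hodges-Lehmann location estimate as the median of all Walsh averages, and MADn as the median absolute deviation scaled by 1.4826. The function names and the scaling constant are assumptions made for illustration; how the paper combines these pieces into rHL(med) and rHL(MADn) is not stated in this record and is not reproduced here.

```python
import itertools
import statistics


def hodges_lehmann(values):
    """One-sample Hodges-Lehmann estimate: median of Walsh averages.

    Walsh averages are (x_i + x_j) / 2 over all pairs with i <= j.
    """
    walsh = [
        (a + b) / 2
        for a, b in itertools.combinations_with_replacement(values, 2)
    ]
    return statistics.median(walsh)


def madn(values):
    """Median absolute deviation scaled by 1.4826.

    The constant is the usual normal-consistency factor; whether the
    paper uses exactly this convention is an assumption.
    """
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    return 1.4826 * statistics.median(deviations)


if __name__ == "__main__":
    clean = [1, 2, 3, 4, 5]
    contaminated = [1, 2, 3, 4, 100]  # one value replaced by an outlier

    for label, data in (("clean", clean), ("contaminated", contaminated)):
        print(
            label,
            "mean:", statistics.mean(data),
            "HL:", hodges_lehmann(data),
            "stdev:", statistics.stdev(data),
            "MADn:", madn(data),
        )
```

In this toy sample, replacing one of five values with 100 moves the mean from 3 to 22, while the Hodges-Lehmann estimate stays at 3 and MADn is unchanged. This is the kind of outlier resistance the abstract attributes to the proposed coefficients.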
first_indexed 2024-07-04T06:34:27Z
format Article
id uum-26970
institution Universiti Utara Malaysia
language English
last_indexed 2024-07-04T06:34:27Z
publishDate 2020
record_format dspace
spelling uum-26970 2020-04-30T03:14:03Z https://repo.uum.edu.my/id/eprint/26970/ 2020 Article PeerReviewed application/pdf en Abdullah, Suhaida and Zakaria, Nur Amira and Ahad, Nor Aishah and Yusof, Norhayati and Syed Yahaya, Sharipah Soaad (2020) Measuring the relationship of bivariate data using Hodges-Lehman estimator. ASM Science Journal, 13. pp. 1-5. ISSN 1823-6782 http://doi.org/10.32802/asmscj.2020.sm26(1.11) doi:10.32802/asmscj.2020.sm26(1.11)
title Measuring the relationship of bivariate data using Hodges-Lehman estimator
topic QA75 Electronic computers. Computer science
url https://repo.uum.edu.my/id/eprint/26970/1/ASM%20ScJ%2013%202020%201%205.pdf