Detection of outliers in high dimensional data using NU-support vector regression

Bibliographic Details
Main Authors: Mohammed Rashid, Abdullah, Midi, Habshah, Waleed Dhhan, Arasan, Jayanthi
Format: Article
Published: Taylor and Francis 2021
author Mohammed Rashid, Abdullah
Midi, Habshah
Waleed Dhhan
Arasan, Jayanthi
collection UPM
description Support Vector Regression (SVR) is gaining popularity for outlier detection and classification in high-dimensional data (HDD) because the technique does not require the data to be of full rank. In real applications, most data are high-dimensional. Classification of high-dimensional data is needed in the applied sciences, in particular to discriminate cancerous cells from non-cancerous cells. It is also imperative that outliers be identified before a model of the relationship between the dependent and independent variables is constructed, to avoid misleading interpretations of the model fit. The standard SVR and the μ-ε-SVR are able to detect outliers; however, they are computationally expensive. The fixed parameters support vector regression (FP-ε-SVR) was put forward to remedy this issue. However, the FP-ε-SVR, which uses ε-SVR, is not very successful in identifying outliers. In this article, we propose an alternative method for detecting outliers that employs nu-SVR. The merit of the proposed method is confirmed by three real examples and a Monte Carlo simulation. The results show that the proposed nu-SVR method is very successful in identifying outliers under a variety of situations, with less computational running time.
first_indexed 2024-03-06T11:14:10Z
format Article
id upm.eprints-100919
institution Universiti Putra Malaysia
last_indexed 2024-03-06T11:14:10Z
publishDate 2021
publisher Taylor and Francis
record_format dspace
spelling upm.eprints-100919 2023-07-14T08:55:49Z http://psasir.upm.edu.my/id/eprint/100919/ Taylor and Francis 2021-04-08 Article PeerReviewed Mohammed Rashid, Abdullah and Midi, Habshah and Waleed Dhhan and Arasan, Jayanthi (2021) Detection of outliers in high dimensional data using NU-support vector regression. Journal of Applied Statistics, 49 (10). pp. 2550-2569. ISSN 0266-4763; ESSN: 1360-0532 https://www.tandfonline.com/doi/abs/10.1080/02664763.2021.1911965?journalCode=cjas20 10.1080/02664763.2021.1911965
title Detection of outliers in high dimensional data using NU-support vector regression