Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition

Human activity recognition (HAR) is a popular field of study. The outcomes of projects in this area have the potential to impact the quality of life of people with conditions such as dementia. HAR focuses primarily on applying machine learning classifiers to data from low-level sensors suc...

Bibliographic Details
Main Authors: Dionicio Neira-Rodado, Chris Nugent, Ian Cleland, Javier Velasquez, Amelec Viloria
Format: Article
Language: English
Published: MDPI AG 2020-03-01
Series: Sensors
Subjects: multivariate analysis; HAR; machine learning; dataset quality
Online Access: https://www.mdpi.com/1424-8220/20/7/1858
collection DOAJ
description Human activity recognition (HAR) is a popular field of study. The outcomes of projects in this area have the potential to impact the quality of life of people with conditions such as dementia. HAR focuses primarily on applying machine learning classifiers to data from low-level sensors such as accelerometers. The performance of these classifiers can be improved through an adequate training process. To that end, multivariate outlier detection was used to improve the quality of the data in the training set and, subsequently, the performance of the classifier. The impact of the technique was evaluated with k-nearest neighbors (KNN) and random forest (RF) classifiers. In the case of KNN, the performance of the classifier improved from 55.9% to 63.59%.
first_indexed 2024-03-11T10:12:04Z
format Article
id doaj.art-b3c4123ce4c1436b98f0ff530aa84824
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-11T10:12:04Z
publishDate 2020-03-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-b3c4123ce4c1436b98f0ff530aa84824 2023-11-16T14:28:03Z eng
MDPI AG, Sensors, ISSN 1424-8220, 2020-03-01, volume 20, issue 7, article 1858
DOI 10.3390/s20071858
Dionicio Neira-Rodado: Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia
Chris Nugent: School of Computing, Ulster University, Shore Road, Newtownabbey, County Antrim BT37 0QB, Northern Ireland, UK
Ian Cleland: School of Computing, Ulster University, Shore Road, Newtownabbey, County Antrim BT37 0QB, Northern Ireland, UK
Javier Velasquez: Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia
Amelec Viloria: Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia
topic multivariate analysis
HAR
machine learning
dataset quality
url https://www.mdpi.com/1424-8220/20/7/1858