Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition
Human activity recognition (HAR) is a popular field of study. The outcomes of the projects in this area have the potential to impact on the quality of life of people with conditions such as dementia. HAR is focused primarily on applying machine learning classifiers on data from low level sensors suc...
Main Authors: | Dionicio Neira-Rodado, Chris Nugent, Ian Cleland, Javier Velasquez, Amelec Viloria |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-03-01 |
Series: | Sensors |
Subjects: | multivariate analysis, HAR, machine learning, dataset quality |
Online Access: | https://www.mdpi.com/1424-8220/20/7/1858 |
_version_ | 1797626555628781568 |
---|---|
author | Dionicio Neira-Rodado Chris Nugent Ian Cleland Javier Velasquez Amelec Viloria |
author_facet | Dionicio Neira-Rodado Chris Nugent Ian Cleland Javier Velasquez Amelec Viloria |
author_sort | Dionicio Neira-Rodado |
collection | DOAJ |
description | Human activity recognition (HAR) is a popular field of study. The outcomes of the projects in this area have the potential to impact on the quality of life of people with conditions such as dementia. HAR is focused primarily on applying machine learning classifiers on data from low level sensors such as accelerometers. The performance of these classifiers can be improved through an adequate training process. In order to improve the training process, multivariate outlier detection was used in order to improve the quality of data in the training set and, subsequently, performance of the classifier. The impact of the technique was evaluated with KNN and random forest (RF) classifiers. In the case of KNN, the performance of the classifier was improved from 55.9% to 63.59%. |
first_indexed | 2024-03-11T10:12:04Z |
format | Article |
id | doaj.art-b3c4123ce4c1436b98f0ff530aa84824 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T10:12:04Z |
publishDate | 2020-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-b3c4123ce4c1436b98f0ff530aa84824; 2023-11-16T14:28:03Z; eng; MDPI AG; Sensors; 1424-8220; 2020-03-01; 20(7), 1858; 10.3390/s20071858; Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition; Dionicio Neira-Rodado (Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia); Chris Nugent (School of Computing, Ulster University, Shore Road, Newtownabbey, County Antrim BT37 0QB, Northern Ireland, UK); Ian Cleland (School of Computing, Ulster University, Shore Road, Newtownabbey, County Antrim BT37 0QB, Northern Ireland, UK); Javier Velasquez (Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia); Amelec Viloria (Department of Industrial Agroindustrial and Operations Management GIAO, Universidad de la Costa, Barranquilla 080002, Colombia); Human activity recognition (HAR) is a popular field of study. The outcomes of the projects in this area have the potential to impact on the quality of life of people with conditions such as dementia. HAR is focused primarily on applying machine learning classifiers on data from low level sensors such as accelerometers. The performance of these classifiers can be improved through an adequate training process. In order to improve the training process, multivariate outlier detection was used in order to improve the quality of data in the training set and, subsequently, performance of the classifier. The impact of the technique was evaluated with KNN and random forest (RF) classifiers. In the case of KNN, the performance of the classifier was improved from 55.9% to 63.59%.; https://www.mdpi.com/1424-8220/20/7/1858; multivariate analysis; HAR; machine learning; dataset quality |
title | Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition |
topic | multivariate analysis HAR machine learning dataset quality |
url | https://www.mdpi.com/1424-8220/20/7/1858 |
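
The description field above says that multivariate outlier detection was applied to the training set before fitting KNN and random forest classifiers. This record does not say which detector or thresholds the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: two features per sample, a per-class squared Mahalanobis distance, a chi-square cutoff of 5.99 (the 0.95 quantile for two degrees of freedom), and a toy 1-nearest-neighbour classifier. All function names and the synthetic data are hypothetical and are not taken from the article.

```python
import math
from collections import Counter

def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def clean_training_set(samples, threshold=5.99):
    """Drop samples that lie far from the bulk of their own class.

    samples: list of ((x, y), label) pairs.
    threshold: assumed cutoff, the chi-square 0.95 quantile for 2 dof.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    stats = {label: mean_and_cov_2d(pts) for label, pts in by_label.items()}
    return [(f, l) for f, l in samples
            if mahalanobis_sq(f, *stats[l]) <= threshold]

def knn_predict(train, query, k=1):
    """Majority vote among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Synthetic data (an assumption for illustration): two tight clusters
# plus one "walk" sample that sits inside the "sit" cluster.
grid = [(i * 0.5 - 1.0, j * 0.5 - 1.0) for i in range(5) for j in range(5)]
walk = [((x, y), "walk") for x, y in grid]
sit = [((x + 6.0, y + 6.0), "sit") for x, y in grid]
outlier = ((6.1, 6.1), "walk")
raw = walk + sit + [outlier]

cleaned = clean_training_set(raw)
query = (6.1, 6.2)
print(len(raw), len(cleaned))
print(knn_predict(raw, query), knn_predict(cleaned, query))
```

In this toy setup the stray sample is the nearest neighbour of the query, so removing it changes the 1-NN prediction. The accuracy figures quoted in the record come from the authors' own data and method, which this sketch does not reproduce.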