Health-Related Data Analysis Using Metaheuristic Optimization and Machine Learning
Health-related data has a decisive role in disease diagnosis. Collecting relevant information from health-related data in medical records has been facilitated by evaluating the features of the data. Relevant research has shown that outcomes are significantly impacted by the use of feature selection...
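Main Authors: | Annisa Darmawahyuni, Siti Nurmaini, Bambang Tutuko, Muhammad Naufal Rachmatullah, Firdaus Firdaus, Ade Iriani Sapitri, Anggun Islami, Jordan Marcelino, Rendy Isdwanta, Muhammad Irfan Karim |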
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
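Subjects: | Autoencoder, classification, data imputation, feature selection, health-related dataset, metaheuristic algorithms |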
Online Access: | https://ieeexplore.ieee.org/document/10500837/ |
_version_ | 1797193545117859840 |
---|---|
author | Annisa Darmawahyuni; Siti Nurmaini; Bambang Tutuko; Muhammad Naufal Rachmatullah; Firdaus Firdaus; Ade Iriani Sapitri; Anggun Islami; Jordan Marcelino; Rendy Isdwanta; Muhammad Irfan Karim |
author_sort | Annisa Darmawahyuni |
collection | DOAJ |
description | Health-related data plays a decisive role in disease diagnosis. Extracting relevant information from health-related data in medical records is facilitated by evaluating the features of the data. Prior research has shown that feature selection (FS) significantly affects outcomes across different medical-domain data. FS identifies the most significant features to improve classification accuracy, aiming to minimize the number of input variables and the computational overhead while maximizing classification performance. However, identifying the optimal features is difficult because health-related data often combine high dimensionality with a small sample size. Metaheuristic optimization algorithms (MOAs) play an important role in generating the best feature subset through exploration and exploitation phases. This study experiments with well-known MOAs and supervised learning on datasets of varying feature dimensions from the UC Irvine Machine Learning Repository, PhysioNet, the Kent Ridge Bio-Medical Dataset, and the MIMIC-III v1.4 repository. To increase the quality of health-related data, the study proposes missing-data imputation based on a deep learning approach, an autoencoder (AE). AE imputation achieves a mean squared error (MSE) of 0.0167 and a root mean squared error (RMSE) of 0.129. The MOAs select minimal feature subsets while maintaining strong performance on both low- and high-dimensional data, and are applied successfully to diverse health-related datasets. |
first_indexed | 2024-04-24T05:42:05Z |
format | Article |
id | doaj.art-e048322e5f854259bb76fc935ab175f7 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-24T05:42:05Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e048322e5f854259bb76fc935ab175f7; 2024-04-23T23:00:29Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12; pp. 55342-55356; DOI 10.1109/ACCESS.2024.3390008; 10500837; Health-Related Data Analysis Using Metaheuristic Optimization and Machine Learning; Authors: 0 Annisa Darmawahyuni (https://orcid.org/0000-0002-0229-5717), 1 Siti Nurmaini (https://orcid.org/0000-0002-8024-2952), 2 Bambang Tutuko (https://orcid.org/0000-0002-2051-8988), 3 Muhammad Naufal Rachmatullah (https://orcid.org/0000-0003-3553-3475), 4 Firdaus Firdaus (https://orcid.org/0000-0003-2791-3486), 5 Ade Iriani Sapitri, 6 Anggun Islami, 7 Jordan Marcelino (https://orcid.org/0009-0002-2499-396X), 8 Rendy Isdwanta (https://orcid.org/0009-0009-2956-6175), 9 Muhammad Irfan Karim; Affiliations (in author order): 0 Faculty of Engineering, Universitas Sriwijaya, Palembang, Indonesia, 1-9 Intelligent System Research Group, Universitas Sriwijaya, Palembang, Indonesia; https://ieeexplore.ieee.org/document/10500837/; Keywords: Autoencoder, classification, data imputation, feature selection, health-related dataset, metaheuristic algorithms |
title | Health-Related Data Analysis Using Metaheuristic Optimization and Machine Learning |
topic | Autoencoder; classification; data imputation; feature selection; health-related dataset; metaheuristic algorithms |
url | https://ieeexplore.ieee.org/document/10500837/ |
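
The description refers to metaheuristic feature selection with exploration and exploitation phases, but this record does not say which algorithms, classifiers, or fitness function the study used. The following is a minimal sketch of the general idea only, under stated assumptions: the synthetic data, the nearest-centroid classifier, the weighted-sum fitness with `alpha=0.9`, and the (1+1) evolutionary search are all illustrative stand-ins, and every function name is hypothetical.

```python
import random

def make_data(n=120, n_informative=3, n_noise=12, seed=0):
    """Synthetic two-class data (assumption): only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 2.0 if label else -2.0
        row = [rng.gauss(shift, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the kept columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c]))
                for c in cents}
        hits += min(dist, key=dist.get) == t
    return hits / len(y_te)

def fitness(mask, split, alpha=0.9):
    """Weighted sum (assumption): reward accuracy, penalise features kept."""
    acc = centroid_accuracy(*split, mask)
    return alpha * acc + (1 - alpha) * (1 - sum(mask) / len(mask))

def select_features(split, n_features, iters=300, seed=1):
    """(1+1) evolutionary search over binary feature masks."""
    rng = random.Random(seed)
    best = [rng.random() < 0.5 for _ in range(n_features)]
    best_fit = fitness(best, split)
    for _ in range(iters):
        # Exploration: flip each bit independently with probability 1/n.
        cand = [(not b) if rng.random() < 1.0 / n_features else b for b in best]
        cand_fit = fitness(cand, split)
        # Exploitation: keep the candidate only if it is not worse.
        if cand_fit >= best_fit:
            best, best_fit = cand, cand_fit
    return best, best_fit

X, y = make_data()
split = (X[:80], y[:80], X[80:], y[80:])  # simple train/test hold-out
mask, score = select_features(split, n_features=len(X[0]))
print("selected columns:", [j for j, keep in enumerate(mask) if keep])
print("fitness:", round(score, 3))
```

Population-based metaheuristics replace the single candidate here with many interacting ones, but the wrapper structure is the same: a binary mask is scored by training a classifier on the selected columns, and the fitness trades accuracy against subset size.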