SA-PSO-GK++: A New Hybrid Clustering Approach for Analyzing Medical Data

Data clustering is an unsupervised learning task that has been extensively studied, given its wide applicability in various domains. Traditional algorithms often struggle to achieve a balance between exploration and exploitation, leading to sub-optimal solutions. This paper presents a novel hybrid a...


Bibliographic Details
Main Authors: Amani Abdo, Omnia Abdelkader, Laila Abdel-Hamid
Format: Article
Language: English
Published: IEEE 2024-01-01
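Series: IEEE Access
Subjects: Swarm intelligence; particle swarm optimization (PSO); K-means; K-means++; simulated annealing; Gaussian estimation
Online Access: https://ieeexplore.ieee.org/document/10381719/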
collection DOAJ
description Data clustering is an unsupervised learning task that has been extensively studied, given its wide applicability in various domains. Traditional algorithms often struggle to balance exploration and exploitation, leading to sub-optimal solutions. This paper presents a novel hybrid algorithm named SA-PSO-GK++ that combines Particle Swarm Optimization (PSO), K-means++, Simulated Annealing (SA), and Gaussian Estimation of Distribution (GED) to tackle this issue. The proposed SA-PSO-GK++ aims to overcome the drawbacks of existing methods by leveraging the strengths of each individual algorithm. The K-means++ initialization reduces the risk of poor initial centroids, while PSO aids in efficient exploration of the search space. GED provides a statistical model of the particle space, enabling the algorithm to generate new candidate solutions that are statistically guided by the current best solutions. Additionally, the incorporation of Simulated Annealing allows the algorithm to escape local minima, thereby enhancing its global search capability. We evaluate the effectiveness of SA-PSO-GK++ using benchmark datasets from the UCI Machine Learning Repository, including the Iris, Breast Cancer, Heart, and Contraceptive Method Choice datasets. The proposed method outperforms conventional and some state-of-the-art hybrid clustering algorithms in terms of sum of Euclidean distances, normalized index, and error rates. These advantages make SA-PSO-GK++ a compelling option for a wide range of clustering applications. The results offer promising avenues for future research in optimizing and applying this clustering technique in diverse domains.
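The interplay of the components named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes K-means++ seeding for every particle, a per-coordinate Gaussian fitted to the better half of the swarm, and a Metropolis acceptance rule with geometric cooling, and it omits the PSO velocity update entirely. All function names and parameters (`cluster`, `n_particles`, `t0`, `cooling`) are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """K-means++ seeding: draw each new centroid with probability
    proportional to its squared distance from the nearest chosen one."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:  # every point coincides with a chosen centroid
            centroids.append(rng.choice(points))
        else:
            centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return [list(c) for c in centroids]

def sum_euclidean(points, centroids):
    """Objective: sum of distances from each point to its nearest centroid."""
    return sum(min(math.sqrt(sq_dist(p, c)) for c in centroids) for p in points)

def gaussian_sample(elite, rng):
    """Fit an independent mean/std per coordinate over the elite solutions
    and sample one new candidate (assumed simplification: centroid j is
    treated as the same cluster in every elite solution)."""
    k, d = len(elite[0]), len(elite[0][0])
    candidate = []
    for j in range(k):
        centroid = []
        for t in range(d):
            values = [sol[j][t] for sol in elite]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            centroid.append(rng.gauss(mean, math.sqrt(var) + 1e-6))
        candidate.append(centroid)
    return candidate

def sa_accept(current_cost, new_cost, temperature, rng):
    """Metropolis rule: always accept improvements; accept a worse
    candidate with probability exp(-delta / T)."""
    if new_cost <= current_cost:
        return True
    return rng.random() < math.exp(-(new_cost - current_cost) / temperature)

def cluster(points, k, n_particles=10, iters=50, t0=1.0, cooling=0.95, seed=0):
    rng = random.Random(seed)
    swarm = [kmeanspp_init(points, k, rng) for _ in range(n_particles)]
    costs = [sum_euclidean(points, s) for s in swarm]
    best_cost = min(costs)
    best = swarm[costs.index(best_cost)]
    temperature = t0
    for _ in range(iters):
        order = sorted(range(n_particles), key=lambda i: costs[i])
        elite = [swarm[i] for i in order[: max(2, n_particles // 2)]]
        for i in range(n_particles):
            cand = gaussian_sample(elite, rng)
            cost = sum_euclidean(points, cand)
            if sa_accept(costs[i], cost, temperature, rng):
                swarm[i], costs[i] = cand, cost
                if cost < best_cost:
                    best, best_cost = cand, cost
        temperature *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost
```

Calling `cluster(points, k=2)` on a list of numeric tuples returns the best centroid set found and its objective value. Early on, the higher temperature lets particles accept worse candidates, which is the escape-from-local-minima role the description assigns to Simulated Annealing.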
first_indexed 2024-03-08T09:43:27Z
format Article
id doaj.art-276b51470a29454386ad0ba4226b7087
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T09:43:27Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-276b51470a29454386ad0ba4226b7087; 2024-01-30T00:01:31Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 12501-12516; DOI 10.1109/ACCESS.2024.3350442; IEEE document 10381719
Amani Abdo (https://orcid.org/0000-0002-4873-3065), Faculty of Computing, Arab Open University, Cairo, Egypt
Omnia Abdelkader (https://orcid.org/0009-0001-0293-4646), Faculty of Computers and Artificial Intelligence, Helwan University (HU), Helwan, Cairo, Egypt
Laila Abdel-Hamid (https://orcid.org/0000-0002-7928-5680), Faculty of Computers and Artificial Intelligence, Helwan University (HU), Helwan, Cairo, Egypt
topic Swarm intelligence
particle swarm optimization (PSO)
K-means
K-means++
simulated annealing
Gaussian estimation