SA-PSO-GK++: A New Hybrid Clustering Approach for Analyzing Medical Data
Data clustering is an unsupervised learning task that has been extensively studied, given its wide applicability in various domains. Traditional algorithms often struggle to achieve a balance between exploration and exploitation, leading to sub-optimal solutions. This paper presents a novel hybrid a...
Main Authors: | Amani Abdo, Omnia Abdelkader, Laila Abdel-Hamid |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
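Subjects: | Swarm intelligence; particle swarm optimization (PSO); K-means; K-means++; simulated annealing; Gaussian estimation |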
Online Access: | https://ieeexplore.ieee.org/document/10381719/ |
_version_ | 1827369176263557120 |
---|---|
author | Amani Abdo; Omnia Abdelkader; Laila Abdel-Hamid |
author_facet | Amani Abdo; Omnia Abdelkader; Laila Abdel-Hamid |
author_sort | Amani Abdo |
collection | DOAJ |
description | Data clustering is an unsupervised learning task that has been studied extensively because of its wide applicability across domains. Traditional algorithms often struggle to balance exploration and exploitation, which leads to sub-optimal solutions. This paper presents a novel hybrid algorithm named SA-PSO-GK++ that combines Particle Swarm Optimization (PSO), K-means++, Simulated Annealing (SA), and Gaussian Estimation of Distribution (GED) to tackle this issue. The proposed SA-PSO-GK++ aims to overcome the drawbacks of existing methods by leveraging the strengths of each individual algorithm. The K-means++ initialization reduces the risk of poor initial centroids, while PSO aids efficient exploration of the search space. GED provides a statistical model of the particle space, enabling the algorithm to generate new candidate solutions that are statistically guided by the current best solutions. Additionally, the incorporation of Simulated Annealing allows the algorithm to escape local minima, thereby enhancing its global search capability. We evaluate the effectiveness of SA-PSO-GK++ on benchmark datasets from the UCI Machine Learning Repository, including the Iris, Breast Cancer, Heart, and Contraceptive Method Choice datasets. The proposed method outperforms conventional and some state-of-the-art hybrid clustering algorithms in terms of sum of Euclidean distances, normalized index, and error rate. These advantages make SA-PSO-GK++ a compelling option for a wide range of clustering applications. The results offer promising avenues for future research in optimizing and applying this clustering technique in diverse domains. |
first_indexed | 2024-03-08T09:43:27Z |
format | Article |
id | doaj.art-276b51470a29454386ad0ba4226b7087 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T09:43:27Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-276b51470a29454386ad0ba4226b7087; 2024-01-30T00:01:31Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; vol. 12; pp. 12501-12516; 10.1109/ACCESS.2024.3350442; 10381719; SA-PSO-GK++: A New Hybrid Clustering Approach for Analyzing Medical Data; Amani Abdo (https://orcid.org/0000-0002-4873-3065), Faculty of Computing, Arab Open University, Cairo, Egypt; Omnia Abdelkader (https://orcid.org/0009-0001-0293-4646), Faculty of Computers and Artificial Intelligence, Helwan University (HU), Helwan, Cairo, Egypt; Laila Abdel-Hamid (https://orcid.org/0000-0002-7928-5680), Faculty of Computers and Artificial Intelligence, Helwan University (HU), Helwan, Cairo, Egypt; https://ieeexplore.ieee.org/document/10381719/; Swarm intelligence; particle swarm optimization (PSO); K-means; K-means++; simulated annealing; Gaussian estimation |
title | SA-PSO-GK++: A New Hybrid Clustering Approach for Analyzing Medical Data |
topic | Swarm intelligence; particle swarm optimization (PSO); K-means; K-means++; simulated annealing; Gaussian estimation |
url | https://ieeexplore.ieee.org/document/10381719/ |
work_keys_str_mv | AT amaniabdo sapsogkx002bx002banewhybridclusteringapproachforanalyzingmedicaldata AT omniaabdelkader sapsogkx002bx002banewhybridclusteringapproachforanalyzingmedicaldata AT lailaabdelhamid sapsogkx002bx002banewhybridclusteringapproachforanalyzingmedicaldata |
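
The abstract names four ingredients: K-means++ seeding, a Gaussian estimation-of-distribution step, a Simulated Annealing acceptance rule, and a sum-of-Euclidean-distances objective. The following is a minimal sketch of how such pieces might fit together, not the authors' implementation. This record does not give the paper's update equations, so every function name, parameter, and default here (`kmeanspp_init`, `gaussian_sample`, `sa_accept`, `refine`, `t0`, `cooling`, `min_std`) is a hypothetical choice, the Gaussian model is assumed to be independent per coordinate, and the PSO velocity update is left out entirely.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each later centroid is drawn with probability
    proportional to its squared distance from the nearest chosen centroid."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:  # all points already covered: fall back to uniform
            centroids.append(list(rng.choice(points)))
        else:
            centroids.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centroids


def sum_of_distances(points, centroids):
    """Objective: sum of Euclidean distances to the nearest centroid."""
    return sum(min(math.sqrt(sq_dist(p, c)) for c in centroids) for p in points)


def gaussian_sample(elite, rng, min_std=1e-3):
    """Assumed GED step: fit one Gaussian per coordinate to the elite
    solutions and draw a new candidate set of centroids from it."""
    flat = [[x for c in sol for x in c] for sol in elite]
    n, dim = len(flat), len(flat[0])
    means = [sum(s[i] for s in flat) / n for i in range(dim)]
    stds = [
        max(min_std, math.sqrt(sum((s[i] - means[i]) ** 2 for s in flat) / n))
        for i in range(dim)
    ]
    draw = [rng.gauss(m, s) for m, s in zip(means, stds)]
    width = len(elite[0][0])  # number of features per centroid
    return [draw[i:i + width] for i in range(0, dim, width)]


def sa_accept(old_cost, new_cost, temperature, rng):
    """Metropolis rule: always accept improvements; accept a worse
    candidate with probability exp(-(new - old) / T)."""
    if new_cost <= old_cost:
        return True
    return rng.random() < math.exp(-(new_cost - old_cost) / temperature)


def refine(points, k, iters=200, pop=10, t0=1.0, cooling=0.95, seed=0):
    """Hypothetical driver: seed with K-means++, propose candidates from the
    Gaussian model of the better half, accept them with the SA rule."""
    rng = random.Random(seed)
    cost_of = lambda s: sum_of_distances(points, s)
    population = [kmeanspp_init(points, k, rng) for _ in range(pop)]
    current = min(population, key=cost_of)
    current_cost = cost_of(current)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(iters):
        population.sort(key=cost_of)
        candidate = gaussian_sample(population[:max(2, pop // 2)], rng)
        cost = cost_of(candidate)
        if sa_accept(current_cost, cost, temperature, rng):
            current, current_cost = candidate, cost
            population[-1] = candidate  # overwrite the worst member
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost
```

Calling `refine(points, k)` on a list of numeric tuples returns the best centroid set found and its objective value. Because the per-coordinate Gaussian ignores centroid label permutations, a fuller implementation would likely need to align centroids across solutions before fitting the model.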