A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
Clustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, parti...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2020-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/9133522/ |
_version_ | 1818330202737999872 |
---|---|
author | Hanjie Gao Yintong Li Petr Kabalyants Hao Xu Rodrigo Martinez-Bejar |
author_facet | Hanjie Gao Yintong Li Petr Kabalyants Hao Xu Rodrigo Martinez-Bejar |
author_sort | Hanjie Gao |
collection | DOAJ |
description | Clustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, particle swarm optimization (PSO) has been regarded as an effective solution to the problem. However, it has the possibility of converging to a local optimum, especially when solving complex problems. In this paper, we propose a hybrid PSO-K-means algorithm, which uses the Gaussian estimation of distribution method (GEDM) to assist PSO in updating the population information and adopts Lévy flight to escape from the local optimum. The proposed algorithm is named a GEDM and Lévy flight based PSO-K-means (GLPSOK) clustering algorithm. Firstly, during initialization, a few particles are initialized using the cluster centroids generated by K-means, while other particles are randomly initialized in the search space. Secondly, GEDM and PSO are selected with different probability to update the population information at different optimization stages. Thirdly, Lévy flight is adopted to help the search escape from the local optimum. Finally, the greedy strategy is carried out to select the promising particles from the parents and the newly generated candidates. Experimental results on both synthetic data sets and real-world data sets show that the proposed algorithm can produce better clustering results and is more robust than existing classic or state-of-the-art clustering algorithms. |
first_indexed | 2024-12-13T13:00:13Z |
format | Article |
id | doaj.art-10c677c281024cdf84eae00069b4d4e6 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-13T13:00:13Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-10c677c281024cdf84eae00069b4d4e62022-12-21T23:45:03ZengIEEEIEEE Access2169-35362020-01-01812284812286310.1109/ACCESS.2020.30074989133522A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy FlightHanjie Gao0https://orcid.org/0000-0002-8892-0703Yintong Li1https://orcid.org/0000-0001-5304-5147Petr Kabalyants2https://orcid.org/0000-0003-0598-0880Hao Xu3https://orcid.org/0000-0001-8474-0767Rodrigo Martinez-Bejar4https://orcid.org/0000-0002-9677-7396College of Computer Science and Technology, Jilin University, Changchun, ChinaCollege of Aeronautics Engineering, Air Force Engineering University, Xi’an, ChinaMathematics and Computer Science School, V. N. Karazin Kharkiv National University, Kharkiv, UkraineCollege of Computer Science and Technology, Jilin University, Changchun, ChinaDepartment of Information Engineering and Communications, Faculty of Informatics, University of Murcia, Murcia, SpainClustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, particle swarm optimization (PSO) has been regarded as an effective solution to the problem. However, it has the possibility of converging to a local optimum, especially when solving complex problems. In this paper, we propose a hybrid PSO-K-means algorithm, which uses the Gaussian estimation of distribution method (GEDM) to assist PSO in updating the population information and adopts Lévy flight to escape from the local optimum. The proposed algorithm is named a GEDM and Lévy flight based PSO-K-means (GLPSOK) clustering algorithm. Firstly, during initialization, a few particles are initialized using the cluster centroids generated by K-means, while other particles are randomly initialized in the search space. Secondly, GEDM and PSO are selected with different probability to update the population information at different optimization stages. Thirdly, Lévy flight is adopted to help the search escape from the local optimum. Finally, the greedy strategy is carried out to select the promising particles from the parents and the newly generated candidates. Experimental results on both synthetic data sets and real-world data sets show that the proposed algorithm can produce better clustering results and is more robust than existing classic or state-of-the-art clustering algorithms.https://ieeexplore.ieee.org/document/9133522/Data clusteringK-meansparticle swarm optimizationGaussian estimation of distribution methodLévy flight |
spellingShingle | Hanjie Gao Yintong Li Petr Kabalyants Hao Xu Rodrigo Martinez-Bejar A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight IEEE Access Data clustering K-means particle swarm optimization Gaussian estimation of distribution method Lévy flight |
title | A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight |
title_full | A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight |
title_fullStr | A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight |
title_full_unstemmed | A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight |
title_short | A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight |
title_sort | novel hybrid pso k means clustering algorithm using gaussian estimation of distribution method and l x00e9 vy flight |
topic | Data clustering K-means particle swarm optimization Gaussian estimation of distribution method Lévy flight |
url | https://ieeexplore.ieee.org/document/9133522/ |
work_keys_str_mv | AT hanjiegao anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT yintongli anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT petrkabalyants anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT haoxu anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT rodrigomartinezbejar anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT hanjiegao novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT yintongli novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT petrkabalyants novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT haoxu novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight AT rodrigomartinezbejar novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight |