A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight

Clustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, parti...

Full description

Bibliographic Details
Main Authors: Hanjie Gao, Yintong Li, Petr Kabalyants, Hao Xu, Rodrigo Martinez-Bejar
Format: Article
Language:English
Published: IEEE 2020-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9133522/
_version_ 1818330202737999872
author Hanjie Gao
Yintong Li
Petr Kabalyants
Hao Xu
Rodrigo Martinez-Bejar
author_facet Hanjie Gao
Yintong Li
Petr Kabalyants
Hao Xu
Rodrigo Martinez-Bejar
author_sort Hanjie Gao
collection DOAJ
description Clustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, particle swarm optimization (PSO) has been regarded as an effective solution to the problem. However, it has the possibility of converging to a local optimum, especially when solving complex problems. In this paper, we propose a hybrid PSO-K-means algorithm, which uses the Gaussian estimation of distribution method (GEDM) to assist PSO in updating the population information and adopts Lévy flight to escape from the local optimum. The proposed algorithm is named a GEDM and Lévy flight based PSO-K-means (GLPSOK) clustering algorithm. Firstly, during initialization, a few particles are initialized using the cluster centroids generated by K-means, while other particles are randomly initialized in the search space. Secondly, GEDM and PSO are selected with different probability to update the population information at different optimization stages. Thirdly, Lévy flight is adopted to help the search escape from the local optimum. Finally, the greedy strategy is carried out to select the promising particles from the parents and the newly generated candidates. Experimental results on both synthetic data sets and real-world data sets show that the proposed algorithm can produce better clustering results and is more robust than existing classic or state-of-the-art clustering algorithms.
first_indexed 2024-12-13T13:00:13Z
format Article
id doaj.art-10c677c281024cdf84eae00069b4d4e6
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-13T13:00:13Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-10c677c281024cdf84eae00069b4d4e62022-12-21T23:45:03ZengIEEEIEEE Access2169-35362020-01-01812284812286310.1109/ACCESS.2020.30074989133522A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy FlightHanjie Gao0https://orcid.org/0000-0002-8892-0703Yintong Li1https://orcid.org/0000-0001-5304-5147Petr Kabalyants2https://orcid.org/0000-0003-0598-0880Hao Xu3https://orcid.org/0000-0001-8474-0767Rodrigo Martinez-Bejar4https://orcid.org/0000-0002-9677-7396College of Computer Science and Technology, Jilin University, Changchun, ChinaCollege of Aeronautics Engineering, Air Force Engineering University, Xi’an, ChinaMathematics and Computer Science School, V. N. Karazin Kharkiv National University, Kharkiv, UkraineCollege of Computer Science and Technology, Jilin University, Changchun, ChinaDepartment of Information Engineering and Communications, Faculty of Informatics, University of Murcia, Murcia, SpainClustering is an important data analysis technique, which has been applied to many practical scenarios. However, many partitioning based clustering algorithms are sensitive to the initial state of cluster centroids, may get trapped in a local optimum, and have poor robustness. In recent years, particle swarm optimization (PSO) has been regarded as an effective solution to the problem. However, it has the possibility of converging to a local optimum, especially when solving complex problems. In this paper, we propose a hybrid PSO-K-means algorithm, which uses the Gaussian estimation of distribution method (GEDM) to assist PSO in updating the population information and adopts Lévy flight to escape from the local optimum. The proposed algorithm is named a GEDM and Lévy flight based PSO-K-means (GLPSOK) clustering algorithm. Firstly, during initialization, a few particles are initialized using the cluster centroids generated by K-means, while other particles are randomly initialized in the search space. Secondly, GEDM and PSO are selected with different probability to update the population information at different optimization stages. Thirdly, Lévy flight is adopted to help the search escape from the local optimum. Finally, the greedy strategy is carried out to select the promising particles from the parents and the newly generated candidates. Experimental results on both synthetic data sets and real-world data sets show that the proposed algorithm can produce better clustering results and is more robust than existing classic or state-of-the-art clustering algorithms.https://ieeexplore.ieee.org/document/9133522/Data clusteringK-meansparticle swarm optimizationGaussian estimation of distribution methodLévy flight
spellingShingle Hanjie Gao
Yintong Li
Petr Kabalyants
Hao Xu
Rodrigo Martinez-Bejar
A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
IEEE Access
Data clustering
K-means
particle swarm optimization
Gaussian estimation of distribution method
Lévy flight
title A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
title_full A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
title_fullStr A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
title_full_unstemmed A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
title_short A Novel Hybrid PSO-K-Means Clustering Algorithm Using Gaussian Estimation of Distribution Method and Lévy Flight
title_sort novel hybrid pso k means clustering algorithm using gaussian estimation of distribution method and l x00e9 vy flight
topic Data clustering
K-means
particle swarm optimization
Gaussian estimation of distribution method
Lévy flight
url https://ieeexplore.ieee.org/document/9133522/
work_keys_str_mv AT hanjiegao anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT yintongli anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT petrkabalyants anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT haoxu anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT rodrigomartinezbejar anovelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT hanjiegao novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT yintongli novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT petrkabalyants novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT haoxu novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight
AT rodrigomartinezbejar novelhybridpsokmeansclusteringalgorithmusinggaussianestimationofdistributionmethodandlx00e9vyflight