K-Means-Based Nature-Inspired Metaheuristic Algorithms for Automatic Data Clustering Problems: Recent Advances and Future Directions

Bibliographic Details
Main Authors: Abiodun M. Ikotun, Mubarak S. Almutari, Absalom E. Ezugwu
Format: Article
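Language: English
Published: MDPI AG 2021-11-01
Series: Applied Sciences
Subjects: K-means clustering, automatic clustering, nature-inspired metaheuristic algorithms, cluster analysis
Online Access: https://www.mdpi.com/2076-3417/11/23/11246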
collection DOAJ
description The K-means clustering algorithm is a partitional clustering algorithm that has been widely used in many applications for traditional clustering because of its simplicity and low computational complexity. The technique requires the user to specify the number of clusters to generate from the dataset, and this choice affects the clustering results. Moreover, random initialization of the cluster centers can cause the algorithm to converge to a local minimum. Automatic clustering is a recent approach in which the number of clusters need not be specified. In automatic clustering, the natural clusters in a dataset are identified without any background information about the data objects. Nature-inspired metaheuristic optimization algorithms have recently been deployed to overcome the limitations of traditional clustering algorithms in handling automatic data clustering. Some nature-inspired metaheuristic algorithms have been hybridized with the traditional K-means algorithm to boost its performance and its ability to handle automatic data clustering problems. This study aims to identify, retrieve, summarize, and analyze recently proposed studies on improving the K-means clustering algorithm with nature-inspired optimization techniques. A quest approach for article selection was adopted, which led to the identification and selection of 147 related studies from reputable academic venues and databases. Furthermore, the analysis showed that, although the K-means algorithm has been well researched in the literature, it retains clear advantages over several well-established state-of-the-art clustering algorithms in terms of speed, accessibility, simplicity of use, and applicability to clustering problems with unlabeled and nonlinearly separable datasets. The study also evaluates and discusses some of the well-known weaknesses of the K-means clustering algorithm that the existing improvement methods were designed to address. This systematic review and analysis of the literature on K-means enhancement approaches presents possible perspectives for the cluster analysis research domain and serves as a comprehensive source of information on the K-means algorithm and its variants for the research community.
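The two limitations named in the description (a user-supplied cluster count and sensitivity to the initial centers) can be illustrated with a minimal sketch. This is not the authors' method or any surveyed hybrid; it is a plain Lloyd-style K-means written for illustration, and the function names `sq_dist` and `kmeans`, the toy one-dimensional dataset, and the two sets of starting centers are all assumptions introduced here. The number of clusters is fixed by how many starting centers the caller passes in.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd-style K-means (illustrative sketch).

    The number of clusters k is whatever the caller supplies via
    init_centers; nothing here estimates it automatically.
    """
    centers = list(init_centers)
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared errors: the objective K-means tries to minimize.
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


# Hypothetical toy data: three groups of four points on a line.
points = [(float(x),) for x in (0, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23)]

# One starting center per group: converges to the three natural clusters.
_, good_centers, good_sse = kmeans(points, [(0.0,), (10.0,), (20.0,)])

# Two starting centers in the left group: the left group is split and the
# other two groups are merged, a stable but poor local minimum.
_, bad_centers, bad_sse = kmeans(points, [(0.0,), (3.0,), (10.0,)])

print(good_sse, bad_sse)  # prints: 15.0 211.0
```

Both runs use the same data and the same k, yet the second stops at a much higher sum of squared errors. Hybrid approaches of the kind the description mentions aim to search over starting centers (and, for automatic clustering, over k itself) rather than relying on a single initialization.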
first_indexed 2024-03-10T04:57:25Z
id doaj.art-96831dd5adf544b5a783982dd1f5bb67
institution Directory of Open Access Journals
issn 2076-3417
last_indexed 2024-03-10T04:57:25Z
spelling doaj.art-96831dd5adf544b5a783982dd1f5bb67 2023-11-23T02:04:55Z eng MDPI AG Applied Sciences 2076-3417 2021-11-01 11 23 11246 10.3390/app112311246
Abiodun M. Ikotun: School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg 3201, South Africa
Mubarak S. Almutari: College of Computer Science, University of Hafr Al Batin, Hafar Al Batin 39524, Saudi Arabia
Absalom E. Ezugwu: School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Road, Pietermaritzburg 3201, South Africa
title K-Means-Based Nature-Inspired Metaheuristic Algorithms for Automatic Data Clustering Problems: Recent Advances and Future Directions
topic K-means clustering
automatic clustering
nature-inspired metaheuristic algorithms
cluster analysis
url https://www.mdpi.com/2076-3417/11/23/11246