Adaptive Initialization Method for K-Means Algorithm
The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering perform...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Frontiers Media S.A.
2021-11-01
|
Series: | Frontiers in Artificial Intelligence |
Subjects: | |
Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2021.740817/full |
_version_ | 1831569642204692480 |
---|---|
author | Jie Yang Yu-Kai Wang Xin Yao Xin Yao Chin-Teng Lin |
author_facet | Jie Yang Yu-Kai Wang Xin Yao Xin Yao Chin-Teng Lin |
author_sort | Jie Yang |
collection | DOAJ |
description | The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) which can adapt to the various characteristics in different datasets and obtain better clustering performance with stable results. For larger or higher-dimensional datasets, we even leverage random sampling in AIMK (name as AIMK-RS) to reduce the time complexity. 22 real-world datasets were applied for performance comparisons. The experimental results show AIMK and AIMK-RS outperform the current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS can significantly reduce the time complexity to O (n). Moreover, we exploit AIMK to initialize K-medoids and spectral clustering, and better performance is also explored. The above results demonstrate superior performance and good scalability by AIMK or AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve real-life practical problems. |
first_indexed | 2024-12-17T12:37:01Z |
format | Article |
id | doaj.art-6bbbbdc684c34de5be854a7debfdb9da |
institution | Directory Open Access Journal |
issn | 2624-8212 |
language | English |
last_indexed | 2024-12-17T12:37:01Z |
publishDate | 2021-11-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Artificial Intelligence |
spelling | doaj.art-6bbbbdc684c34de5be854a7debfdb9da2022-12-21T21:48:13ZengFrontiers Media S.A.Frontiers in Artificial Intelligence2624-82122021-11-01410.3389/frai.2021.740817740817Adaptive Initialization Method for K-Means AlgorithmJie Yang0Yu-Kai Wang1Xin Yao2Xin Yao3Chin-Teng Lin4Computational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaComputational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaShenzhen Key Laboratory of Computational Intelligence, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, ChinaCERCIA, School of Computer Science, University of Birmingham, Birmingham, United KingdomComputational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaThe K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) which can adapt to the various characteristics in different datasets and obtain better clustering performance with stable results. For larger or higher-dimensional datasets, we even leverage random sampling in AIMK (name as AIMK-RS) to reduce the time complexity. 22 real-world datasets were applied for performance comparisons. The experimental results show AIMK and AIMK-RS outperform the current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS can significantly reduce the time complexity to O (n). Moreover, we exploit AIMK to initialize K-medoids and spectral clustering, and better performance is also explored. The above results demonstrate superior performance and good scalability by AIMK or AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve real-life practical problems.https://www.frontiersin.org/articles/10.3389/frai.2021.740817/fullk-meansadaptiveinitialization methodinitial cluster centersclustering |
spellingShingle | Jie Yang Yu-Kai Wang Xin Yao Xin Yao Chin-Teng Lin Adaptive Initialization Method for K-Means Algorithm Frontiers in Artificial Intelligence k-means adaptive initialization method initial cluster centers clustering |
title | Adaptive Initialization Method for K-Means Algorithm |
title_full | Adaptive Initialization Method for K-Means Algorithm |
title_fullStr | Adaptive Initialization Method for K-Means Algorithm |
title_full_unstemmed | Adaptive Initialization Method for K-Means Algorithm |
title_short | Adaptive Initialization Method for K-Means Algorithm |
title_sort | adaptive initialization method for k means algorithm |
topic | k-means adaptive initialization method initial cluster centers clustering |
url | https://www.frontiersin.org/articles/10.3389/frai.2021.740817/full |
work_keys_str_mv | AT jieyang adaptiveinitializationmethodforkmeansalgorithm AT yukaiwang adaptiveinitializationmethodforkmeansalgorithm AT xinyao adaptiveinitializationmethodforkmeansalgorithm AT xinyao adaptiveinitializationmethodforkmeansalgorithm AT chintenglin adaptiveinitializationmethodforkmeansalgorithm |