Adaptive Initialization Method for K-Means Algorithm

The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering perform...

Full description

Bibliographic Details
Main Authors: Jie Yang, Yu-Kai Wang, Xin Yao, Chin-Teng Lin
Format: Article
Language:English
Published: Frontiers Media S.A. 2021-11-01
Series:Frontiers in Artificial Intelligence
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/frai.2021.740817/full
_version_ 1831569642204692480
author Jie Yang
Yu-Kai Wang
Xin Yao
Xin Yao
Chin-Teng Lin
author_facet Jie Yang
Yu-Kai Wang
Xin Yao
Xin Yao
Chin-Teng Lin
author_sort Jie Yang
collection DOAJ
description The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) which can adapt to the various characteristics in different datasets and obtain better clustering performance with stable results. For larger or higher-dimensional datasets, we even leverage random sampling in AIMK (name as AIMK-RS) to reduce the time complexity. 22 real-world datasets were applied for performance comparisons. The experimental results show AIMK and AIMK-RS outperform the current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS can significantly reduce the time complexity to O (n). Moreover, we exploit AIMK to initialize K-medoids and spectral clustering, and better performance is also explored. The above results demonstrate superior performance and good scalability by AIMK or AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve real-life practical problems.
first_indexed 2024-12-17T12:37:01Z
format Article
id doaj.art-6bbbbdc684c34de5be854a7debfdb9da
institution Directory Open Access Journal
issn 2624-8212
language English
last_indexed 2024-12-17T12:37:01Z
publishDate 2021-11-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj.art-6bbbbdc684c34de5be854a7debfdb9da2022-12-21T21:48:13ZengFrontiers Media S.A.Frontiers in Artificial Intelligence2624-82122021-11-01410.3389/frai.2021.740817740817Adaptive Initialization Method for K-Means AlgorithmJie Yang0Yu-Kai Wang1Xin Yao2Xin Yao3Chin-Teng Lin4Computational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaComputational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaShenzhen Key Laboratory of Computational Intelligence, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, ChinaCERCIA, School of Computer Science, University of Birmingham, Birmingham, United KingdomComputational Intelligence and Brain Computer Interface Lab, Australian Artificial Intelligence Institute, FEIT, University of Technology Sydney, Sydney, NSW, AustraliaThe K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm uses a random method to determine the initial cluster centers, which make clustering results prone to local optima and then result in worse clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) which can adapt to the various characteristics in different datasets and obtain better clustering performance with stable results. For larger or higher-dimensional datasets, we even leverage random sampling in AIMK (name as AIMK-RS) to reduce the time complexity. 22 real-world datasets were applied for performance comparisons. The experimental results show AIMK and AIMK-RS outperform the current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS can significantly reduce the time complexity to O (n). Moreover, we exploit AIMK to initialize K-medoids and spectral clustering, and better performance is also explored. The above results demonstrate superior performance and good scalability by AIMK or AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve real-life practical problems.https://www.frontiersin.org/articles/10.3389/frai.2021.740817/fullk-meansadaptiveinitialization methodinitial cluster centersclustering
spellingShingle Jie Yang
Yu-Kai Wang
Xin Yao
Xin Yao
Chin-Teng Lin
Adaptive Initialization Method for K-Means Algorithm
Frontiers in Artificial Intelligence
k-means
adaptive
initialization method
initial cluster centers
clustering
title Adaptive Initialization Method for K-Means Algorithm
title_full Adaptive Initialization Method for K-Means Algorithm
title_fullStr Adaptive Initialization Method for K-Means Algorithm
title_full_unstemmed Adaptive Initialization Method for K-Means Algorithm
title_short Adaptive Initialization Method for K-Means Algorithm
title_sort adaptive initialization method for k means algorithm
topic k-means
adaptive
initialization method
initial cluster centers
clustering
url https://www.frontiersin.org/articles/10.3389/frai.2021.740817/full
work_keys_str_mv AT jieyang adaptiveinitializationmethodforkmeansalgorithm
AT yukaiwang adaptiveinitializationmethodforkmeansalgorithm
AT xinyao adaptiveinitializationmethodforkmeansalgorithm
AT xinyao adaptiveinitializationmethodforkmeansalgorithm
AT chintenglin adaptiveinitializationmethodforkmeansalgorithm