An Efficient Algorithm for Initializing Centroids in K-means Clustering

Clustering is one of the most popular knowledge extraction techniques in data mining. Hierarchical and partitioning approaches are widely used in this field, and each has its own advantages, drawbacks, and goals. K-means is the most popular partitioning clustering technique, how...

Bibliographic Details
Main Authors: Dr. Ahmed Hussain Aliwy, Dr. Kadhim B. S. Aljanabi
Format: Article
Language: English
Published: Faculty of Computer Science and Mathematics, University of Kufa 2016-12-01
Series: Journal of Kufa for Mathematics and Computer
Subjects: Data Mining, Clustering, K-means, Centroids, Complexity
Online Access: https://journal.uokufa.edu.iq/index.php/jkmc/article/view/2118
author Dr. Ahmed Hussain Aliwy
Dr. Kadhim B. S. Aljanabi
collection DOAJ
description Clustering is one of the most popular knowledge extraction techniques in data mining. Hierarchical and partitioning approaches are widely used in this field, and each has its own advantages, drawbacks, and goals. K-means is the most popular partitioning clustering technique; however, it suffers from two major drawbacks: its time complexity and its sensitivity to the initial centroid values. This paper presents an approach for estimating the initial centroids through three processes involving density-based, normalization, and smoothing ideas. The proposed algorithm has a strong mathematical foundation. The approach was tested on a free standard data set (20,000 records). The results showed that the approach has better complexity and ensures that the clustering converges.
first_indexed 2024-04-25T01:09:50Z
format Article
id doaj.art-dfa1a8f31a5d438ea18f2c15ea9f1949
institution Directory Open Access Journal
issn 2076-1171
2518-0010
language English
last_indexed 2024-04-25T01:09:50Z
publishDate 2016-12-01
publisher Faculty of Computer Science and Mathematics, University of Kufa
record_format Article
series Journal of Kufa for Mathematics and Computer
spelling doaj.art-dfa1a8f31a5d438ea18f2c15ea9f1949 | 2024-03-10T10:37:32Z | eng | Faculty of Computer Science and Mathematics, University of Kufa | Journal of Kufa for Mathematics and Computer | 2076-1171 | 2518-0010 | 2016-12-01 | 32 | 10.31642/JoKMC/2018/030203 | An Efficient Algorithm for Initializing Centroids in K-means Clustering | Dr. Ahmed Hussain Aliwy 0 | Dr. Kadhim B. S. Aljanabi 1 | University of Kufa | University of Kufa | https://journal.uokufa.edu.iq/index.php/jkmc/article/view/2118 | Data Mining | Clustering | K-means | Centroids | Complexity
title An Efficient Algorithm for Initializing Centroids in K-means Clustering
topic Data Mining
Clustering
K-means
Centroids
Complexity
url https://journal.uokufa.edu.iq/index.php/jkmc/article/view/2118