Study on Density Parameter and Center-Replacement Combined K-means and New Clustering Validity Index


Bibliographic Details
Main Author: ZHANG Ya-di, SUN Yue, LIU Feng, ZHU Er-zhou
Format: Article
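Language: zho
Published: Editorial office of Computer Science 2022-01-01
Series: Jisuanji kexue
Subjects: clustering algorithm; clustering validity index; optimal clustering number; cluster center; data mining
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-1-121.pdf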
collection DOAJ
description As a classical data mining technique, clustering is widely used in fields such as pattern recognition, machine learning, and artificial intelligence. Effective clustering analysis can reveal the underlying structure of a dataset. As a commonly used partitional clustering algorithm, K-means is simple to implement and efficient at clustering large-scale datasets. However, owing to its convergence rule, traditional K-means remains sensitive to the initial cluster centers and cannot properly handle non-convex datasets or datasets with outliers. This paper proposes DC-Kmeans (density parameter and center replacement K-means), an improved K-means algorithm based on a density parameter and center replacement. By selecting the initial cluster centers gradually and continuously replacing imprecise old centers, DC-Kmeans is more accurate than traditional K-means. Two further methods are proposed for finding the optimal clustering: 1) a novel clustering validity index (CVI), SCVI (sum of the inner-cluster compactness and the inter-cluster separateness based CVI), to evaluate the results of DC-Kmeans; 2) a new algorithm, OCNS (optimal clustering number determination based on SCVI), to determine the optimal number of clusters for different datasets. Experimental results demonstrate that the proposed clustering method is effective for many kinds of datasets.
first_indexed 2024-12-20T21:17:24Z
format Article
id doaj.art-83bb5945f7934b90bcfd29e8c017d5c8
institution Directory Open Access Journal
issn 1002-137X
language zho
last_indexed 2024-12-20T21:17:24Z
publishDate 2022-01-01
publisher Editorial office of Computer Science
record_format Article
series Jisuanji kexue
spelling doaj.art-83bb5945f7934b90bcfd29e8c017d5c8; 2022-12-21T19:26:22Z; zho; Editorial office of Computer Science; Jisuanji kexue; ISSN 1002-137X; 2022-01-01; vol. 49, no. 1, pp. 121-132; DOI 10.11896/jsjkx.201100148; Study on Density Parameter and Center-Replacement Combined K-means and New Clustering Validity Index; ZHANG Ya-di, SUN Yue, LIU Feng, ZHU Er-zhou (School of Computer Science and Technology, Anhui University, Hefei 230601, China); https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-1-121.pdf; keywords: clustering algorithm, clustering validity index, optimal clustering number, cluster center, data mining
topic clustering algorithm|clustering validity index|optimal clustering number|cluster center|data mining