ConDPC: Data Connectivity-Based Density Peak Clustering

Density peak clustering (DPC), a relatively recent density-based clustering algorithm, has been widely studied in recent years. DPC sorts all points in descending order of local density and, taking each point in turn, finds its neighbors in order to assign every point to an appropriate cluster. The algorithm is sim...

Bibliographic Details
Main Authors: Yujuan Zou, Zhijian Wang
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Applied Sciences
Subjects: clustering; connectivity; Euclidean distance; neighbor distance; density difference
Online Access: https://www.mdpi.com/2076-3417/12/24/12812
author Yujuan Zou
Zhijian Wang
collection DOAJ
description Density peak clustering (DPC), a relatively recent density-based clustering algorithm, has been widely studied in recent years. DPC sorts all points in descending order of local density and, taking each point in turn, finds its neighbors in order to assign every point to an appropriate cluster. The algorithm is simple and effective, but its applicability is limited: when the density difference between clusters is large, or when the data are distributed in a nested structure, it clusters poorly. This study incorporates the idea of connectivity into the original algorithm and proposes an improved density peak clustering algorithm, ConDPC. ConDPC modifies the strategy for selecting cluster centers and assigning neighbors, and it improves on the clustering accuracy of the original density peak clustering algorithm. Clustering comparison experiments were conducted on synthetic and real-world data sets against the original DPC, DBSCAN, K-means, and two improved DPC variants. The comparison results demonstrate the effectiveness of ConDPC.
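The baseline DPC procedure summarized in the description (sort by local density, then attach each point to a neighbor) can be sketched as follows. This is a minimal illustration of the original DPC only, not of ConDPC, whose connectivity-based rules are not given in this record. The function name `dpc`, the cutoff distance `d_c`, the `n_clusters` parameter, and the choice of centers by the largest product of density and distance are assumptions drawn from the commonly described form of DPC, not from the record itself.

```python
import numpy as np

def dpc(points, d_c, n_clusters):
    """Sketch of baseline density peak clustering (cutoff-kernel variant).

    d_c and n_clusters are illustrative parameters, not values from the article.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Indices in descending order of density (stable, so ties keep index order).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)          # distance to the nearest denser point
    parent = np.full(n, -1)      # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its farthest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]    # points that precede i in the density order
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Assumed center rule: the n_clusters points with the largest rho * delta.
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points in density order so each parent is labeled before its child.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Because every non-center point simply inherits the label of its nearest denser neighbor, a sparse cluster next to a dense one, or a cluster nested inside another, can be absorbed into the wrong group. That is the kind of limitation the description says ConDPC addresses by taking connectivity into account.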
first_indexed 2024-03-09T17:21:52Z
format Article
id doaj.art-354ca8b430ac4789b76c3e5c22b6d89a
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T17:21:52Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-354ca8b430ac4789b76c3e5c22b6d89a
2023-11-24T13:05:11Z
eng
MDPI AG
Applied Sciences, ISSN 2076-3417, 2022-12-01, vol. 12, no. 24, article 12812
doi:10.3390/app122412812
ConDPC: Data Connectivity-Based Density Peak Clustering
Yujuan Zou, College of Computer and Information, Hohai University, Nanjing 211100, China
Zhijian Wang, College of Computer and Information, Hohai University, Nanjing 211100, China
https://www.mdpi.com/2076-3417/12/24/12812
title ConDPC: Data Connectivity-Based Density Peak Clustering
topic clustering
connectivity
Euclidean distance
neighbor distance
density difference
url https://www.mdpi.com/2076-3417/12/24/12812