Fast clustering algorithm based on MST of representative points

Minimum spanning tree (MST)-based clustering algorithms are widely used to detect clusters with diverse densities and irregular shapes. However, most algorithms require the entire dataset to construct an MST, which leads to significant computational overhead. To alleviate this issue, our proposed al...

Full description

Bibliographic Details
Main Authors: Hui Du, Depeng Lu, Zhihe Wang, Cuntao Ma, Xinxin Shi, Xiaoli Wang
Format: Article
Language:English
Published: AIMS Press 2023-07-01
Series:Mathematical Biosciences and Engineering
Subjects:
Online Access:https://www.aimspress.com/article/doi/10.3934/mbe.2023705?viewType=HTML
_version_ 1827823149232685056
author Hui Du
Depeng Lu
Zhihe Wang
Cuntao Ma
Xinxin Shi
Xiaoli Wang
author_facet Hui Du
Depeng Lu
Zhihe Wang
Cuntao Ma
Xinxin Shi
Xiaoli Wang
author_sort Hui Du
collection DOAJ
description Minimum spanning tree (MST)-based clustering algorithms are widely used to detect clusters with diverse densities and irregular shapes. However, most algorithms require the entire dataset to construct an MST, which leads to significant computational overhead. To alleviate this issue, our proposed algorithm R-MST utilizes representative points instead of all sample points for constructing MST. Additionally, based on the density and nearest neighbor distance, we improved the representative point selection strategy to enhance the uniform distribution of representative points in sparse areas, enabling the algorithm to perform well on datasets with varying densities. Furthermore, traditional methods for eliminating inconsistent edges generally require prior knowledge about the number of clusters, which is not always readily available in practical applications. Therefore, we propose an adaptive method that employs mutual neighbors to identify inconsistent edges and determine the optimal number of clusters automatically. The experimental results indicate that the R-MST algorithm not only improves the efficiency of clustering but also enhances its accuracy.
first_indexed 2024-03-12T02:08:03Z
format Article
id doaj.art-412cc9b7803b4ce395e2eab2d10a1773
institution Directory Open Access Journal
issn 1551-0018
language English
last_indexed 2024-03-12T02:08:03Z
publishDate 2023-07-01
publisher AIMS Press
record_format Article
series Mathematical Biosciences and Engineering
spelling doaj.art-412cc9b7803b4ce395e2eab2d10a17732023-09-07T01:02:05ZengAIMS PressMathematical Biosciences and Engineering1551-00182023-07-01209158301585810.3934/mbe.2023705Fast clustering algorithm based on MST of representative pointsHui Du0Depeng Lu1Zhihe Wang 2Cuntao Ma 3Xinxin Shi4Xiaoli Wang51. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China1. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China1. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China1. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China1. The School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China2. School of Computer Science and Technology, Xidian University, Xi'an 710071, ChinaMinimum spanning tree (MST)-based clustering algorithms are widely used to detect clusters with diverse densities and irregular shapes. However, most algorithms require the entire dataset to construct an MST, which leads to significant computational overhead. To alleviate this issue, our proposed algorithm R-MST utilizes representative points instead of all sample points for constructing MST. Additionally, based on the density and nearest neighbor distance, we improved the representative point selection strategy to enhance the uniform distribution of representative points in sparse areas, enabling the algorithm to perform well on datasets with varying densities. Furthermore, traditional methods for eliminating inconsistent edges generally require prior knowledge about the number of clusters, which is not always readily available in practical applications. Therefore, we propose an adaptive method that employs mutual neighbors to identify inconsistent edges and determine the optimal number of clusters automatically. The experimental results indicate that the R-MST algorithm not only improves the efficiency of clustering but also enhances its accuracy.https://www.aimspress.com/article/doi/10.3934/mbe.2023705?viewType=HTMLminimum spanning treeinconsistent edgesmutual neighborsclusteringdensity
spellingShingle Hui Du
Depeng Lu
Zhihe Wang
Cuntao Ma
Xinxin Shi
Xiaoli Wang
Fast clustering algorithm based on MST of representative points
Mathematical Biosciences and Engineering
minimum spanning tree
inconsistent edges
mutual neighbors
clustering
density
title Fast clustering algorithm based on MST of representative points
title_full Fast clustering algorithm based on MST of representative points
title_fullStr Fast clustering algorithm based on MST of representative points
title_full_unstemmed Fast clustering algorithm based on MST of representative points
title_short Fast clustering algorithm based on MST of representative points
title_sort fast clustering algorithm based on mst of representative points
topic minimum spanning tree
inconsistent edges
mutual neighbors
clustering
density
url https://www.aimspress.com/article/doi/10.3934/mbe.2023705?viewType=HTML
work_keys_str_mv AT huidu fastclusteringalgorithmbasedonmstofrepresentativepoints
AT depenglu fastclusteringalgorithmbasedonmstofrepresentativepoints
AT zhihewang fastclusteringalgorithmbasedonmstofrepresentativepoints
AT cuntaoma fastclusteringalgorithmbasedonmstofrepresentativepoints
AT xinxinshi fastclusteringalgorithmbasedonmstofrepresentativepoints
AT xiaoliwang fastclusteringalgorithmbasedonmstofrepresentativepoints