Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion

The clustering ensemble method has attracted much attention because it can improve the stability and robustness of single clustering methods. Among them, similarity-matrix-based methods or graph-based methods have had a wide range of applications in recent years. Most similarity-matrix-based methods...


Bibliographic Details
Main Authors: Jianwen Gan, Yunhui Liang, Liang Du
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Mathematics
Subjects: clustering ensemble; localized weighted; sparse; graph diffusion
Online Access: https://www.mdpi.com/2227-7390/11/6/1340
_version_ 1797610334460051456
author Jianwen Gan
Yunhui Liang
Liang Du
author_facet Jianwen Gan
Yunhui Liang
Liang Du
author_sort Jianwen Gan
collection DOAJ
description Clustering ensemble methods have attracted much attention because they can improve the stability and robustness of single clustering methods. Among them, similarity-matrix-based or graph-based methods have been widely applied in recent years. Most similarity-matrix-based methods calculate fully connected pairwise similarities by treating a base cluster as a whole, ignoring the relevance ranking of samples within the same base cluster. Since unreliable similarity estimates degrade clustering performance, constructing accurate similarity matrices is of great importance in applications. Higher-order graph diffusion based on reliable similarity matrices can further uncover potential connections between data. In this paper, we propose a graph-learning-based local-sample-weighted clustering ensemble algorithm, which implicitly optimizes the adaptive weights of different neighborhoods based on the ranking importance of different neighbors. By further diffusion on the consensus matrix, we obtain an optimal consistency matrix with stronger discriminative power, revealing the potential similarity relationships between samples. The experimental results show that, compared with the second-best algorithm, DREC, the proposed algorithm improves accuracy by 17.7% and normalized mutual information (NMI) by 15.88%. Overall, the empirical results show that our clustering model consistently outperforms the related clustering methods.
first_indexed 2024-03-11T06:13:00Z
format Article
id doaj.art-107d1ef8edb24259b9d64724b8d5a925
institution Directory of Open Access Journals
issn 2227-7390
language English
last_indexed 2024-03-11T06:13:00Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-107d1ef8edb24259b9d64724b8d5a925
2023-11-17T12:27:13Z
eng
MDPI AG
Mathematics
2227-7390
2023-03-01
volume 11, issue 6, article 1340
10.3390/math11061340
Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
Jianwen Gan (Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030031, China)
Yunhui Liang (Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030031, China)
Liang Du (Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030031, China)
https://www.mdpi.com/2227-7390/11/6/1340
clustering ensemble
localized weighted
sparse
graph diffusion
spellingShingle Jianwen Gan
Yunhui Liang
Liang Du
Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
Mathematics
clustering ensemble
localized weighted
sparse
graph diffusion
title Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
title_full Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
title_fullStr Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
title_full_unstemmed Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
title_short Local-Sample-Weighted Clustering Ensemble with High-Order Graph Diffusion
title_sort local sample weighted clustering ensemble with high order graph diffusion
topic clustering ensemble
localized weighted
sparse
graph diffusion
url https://www.mdpi.com/2227-7390/11/6/1340
work_keys_str_mv AT jianwengan localsampleweightedclusteringensemblewithhighordergraphdiffusion
AT yunhuiliang localsampleweightedclusteringensemblewithhighordergraphdiffusion
AT liangdu localsampleweightedclusteringensemblewithhighordergraphdiffusion
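
The description field above refers to a consensus similarity matrix that is refined by high-order graph diffusion. The record does not give the authors' formulation, so the following is only a minimal sketch of the general idea under stated assumptions: a plain co-association matrix stands in for the consensus matrix, and a truncated, decayed power series of the row-normalized matrix stands in for the diffusion. The function names and the `order` and `decay` parameters are hypothetical, and the paper's local sample weighting is not reproduced here.

```python
import numpy as np


def co_association(labelings):
    """Fraction of base clusterings in which each pair of samples shares a cluster."""
    labelings = np.asarray(labelings)  # shape (m, n): m base clusterings, n samples
    m, n = labelings.shape
    consensus = np.zeros((n, n))
    for labels in labelings:
        # 1.0 where two samples carry the same label in this base clustering
        consensus += (labels[:, None] == labels[None, :]).astype(float)
    return consensus / m


def high_order_diffusion(similarity, order=3, decay=0.5):
    """Generic diffusion (an assumption, not the paper's method):
    sum of decay**k * P**k for k = 1..order, with P the row-normalized similarity."""
    row_sums = similarity.sum(axis=1, keepdims=True)
    transition = similarity / row_sums  # each row sums to 1
    diffused = np.zeros_like(transition)
    power = np.eye(similarity.shape[0])
    for k in range(1, order + 1):
        power = power @ transition      # P**k
        diffused += (decay ** k) * power
    return (diffused + diffused.T) / 2  # symmetrize the result


# Three toy base clusterings of six samples (made-up labels for illustration).
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
C = co_association(base)
S = high_order_diffusion(C)
```

In this toy input, samples 0 and 5 never share a base cluster, so `C[0, 5]` is zero, while `S[0, 5]` is positive because the two samples are linked through intermediate neighbors. That is the kind of indirect relationship the description attributes to diffusion.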