Co-Clustering Ensemble Based on Bilateral K-Means Algorithm

Bibliographic Details
Main Authors: Hui Yang, Han Peng, Jianyong Zhu, Feiping Nie
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9032160/
Description
Summary: Clustering ensemble techniques have been shown to be effective in improving the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then treat the consensus step as a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings, because the rich information between samples and base clusterings is ignored. Moreover, their results are not discrete: they need post-processing steps to obtain the final clustering result, which can deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble based on bilateral k-means (CEBKM) algorithm. Our algorithm simultaneously clusters the samples and base clusterings of a dataset to fully exploit the potential information between them. In addition, it directly obtains the final clustering results without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.
ISSN: 2169-3536
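
Illustration: The summary mentions transforming base clustering results into a co-association matrix. The short Python sketch below shows one common way that matrix can be built, alongside a sample-by-cluster indicator matrix that keeps the link between samples and individual base clusters. It is not taken from the paper: the function names `co_association` and `sample_cluster_indicator`, the toy labels in `base`, and the choice of the indicator matrix as a co-clustering input are all assumptions made for illustration, and the CEBKM algorithm itself is not implemented here.

```python
import numpy as np


def co_association(labelings):
    """Fraction of base clusterings that place each pair of samples together.

    labelings: array of shape (m, n), m base clusterings of n samples.
    Hypothetical helper, not the paper's code.
    """
    labelings = np.asarray(labelings)
    m, n = labelings.shape
    co = np.zeros((n, n))
    for labels in labelings:
        # Pairwise equality of labels within one base clustering.
        co += labels[:, None] == labels[None, :]
    return co / m


def sample_cluster_indicator(labelings):
    """Binary matrix with one row per sample and one column per base cluster.

    Hypothetical helper: an assumed form of the sample/base-cluster
    relationship, not necessarily the matrix used by CEBKM.
    """
    labelings = np.asarray(labelings)
    blocks = []
    for labels in labelings:
        values = np.unique(labels)
        blocks.append((labels[:, None] == values[None, :]).astype(float))
    return np.hstack(blocks)


# Toy data (made up): three base clusterings of five samples.
base = np.array([
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
])

S = co_association(base)            # shape (5, 5)
B = sample_cluster_indicator(base)  # shape (5, 7): 3 + 2 + 2 base clusters
```

In this sketch `S` equals `B @ B.T / m`, so the co-association matrix can be computed from the indicator matrix but not the other way round. That is one way to read the information loss the summary describes: `S` records how often two samples agree, but no longer records in which base clusters they agree.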