Hierarchical cluster ensemble selection

Bibliographic Details
Main Authors: Akbari, Ebrahim, Mohamed Dahlan, Halina, Ibrahim, Roliana, Alizadeh, Hosein
Format: Article
Published: Elsevier 2015
Description
Summary: Clustering ensemble performance is affected by two main factors: diversity and quality. Selecting a subset of the available ensemble members based on diversity and quality often leads to a more accurate ensemble solution. However, there is no clear relationship between diversity and quality when selecting a subset of ensemble members. This paper proposes the Hierarchical Cluster Ensemble Selection (HCES) method and a new diversity measure to explore how diversity and quality affect the final results. HCES uses single-link, average-link, and complete-link agglomerative clustering methods to select ensemble members hierarchically. A pairwise diversity measure from the recent literature and the proposed diversity measure are applied to these agglomerative clustering algorithms. Using the proposed diversity measure in HCES yields more diverse ensemble members than using the pairwise diversity measure. The Cluster-based Similarity Partitioning Algorithm (CSPA) and the Hypergraph-Partitioning Algorithm (HGPA) were employed in the HCES method to obtain the full-ensemble and cluster ensemble selection solutions. To evaluate the performance of the HCES method, experiments were conducted on several real data sets, and the results were compared with those of full ensembles. The results showed that the HCES method led to a significant performance improvement over full ensembles.
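
The selection idea in the summary can be illustrated with a minimal sketch. It is not the authors' HCES implementation: the summary does not define the proposed diversity measure or the exact selection rule, so the sketch assumes pairwise diversity is 1 - NMI (normalized mutual information), measures a member's quality as its mean NMI with the other members, and keeps the highest-quality member from each group produced by agglomerative clustering. The function names `nmi`, `agglomerate`, and `select_members` are hypothetical.

```python
from collections import Counter
from math import log, sqrt

def nmi(a, b):
    """Normalized mutual information of two labelings (sqrt normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if ha == hb else 0.0
    # Clamp to [0, 1] to absorb floating-point noise.
    return min(1.0, max(0.0, mi / sqrt(ha * hb)))

def agglomerate(dist, k, linkage="average"):
    """Merge the two closest groups until k groups remain.

    dist is a full symmetric distance matrix (list of lists).
    """
    combine = {
        "single": min,                        # closest pair of items
        "complete": max,                      # farthest pair of items
        "average": lambda d: sum(d) / len(d), # mean over all pairs
    }[linkage]
    groups = [[i] for i in range(len(dist))]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = combine([dist[p][q] for p in groups[i] for q in groups[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = groups.pop(j)  # j > i, so index i stays valid
        groups[i].extend(merged)
    return groups

def select_members(partitions, k, linkage="average"):
    """Return indices of k selected ensemble members (needs >= 2 partitions).

    Assumption: diversity = 1 - NMI, quality = mean NMI with the other members.
    """
    m = len(partitions)
    sim = [[nmi(partitions[i], partitions[j]) for j in range(m)] for i in range(m)]
    dist = [[1.0 - s for s in row] for row in sim]
    quality = [(sum(sim[i]) - sim[i][i]) / (m - 1) for i in range(m)]
    groups = agglomerate(dist, k, linkage)
    # Keep the highest-quality member of each group.
    return sorted(max(g, key=quality.__getitem__) for g in groups)

# Toy ensemble: four partitions of six points (made-up labels).
parts = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, relabeled
    [0, 0, 1, 1, 2, 2],
    [0, 1, 0, 1, 0, 1],
]
selected = select_members(parts, k=3, linkage="single")
print(selected)
```

In this sketch the `linkage` argument switches between the single-link, average-link, and complete-link variants named in the summary. The selected partitions would then be passed to a consensus function such as CSPA or HGPA, which is not shown here.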