Study on the Influence of Diversity and Quality in Entropy Based Collaborative Clustering

Bibliographic Details
Main Authors: Jérémie Sublime, Guénaël Cabanes, Basarab Matei
Format: Article
Language: English
Published: MDPI AG 2019-09-01
Series: Entropy
Subjects: collaborative clustering, clustering quality, entropy, diversity
Online Access: https://www.mdpi.com/1099-4300/21/10/951
author Jérémie Sublime
Guénaël Cabanes
Basarab Matei
collection DOAJ
description The aim of collaborative clustering is to enhance the performance of clustering algorithms by enabling them to work together and exchange information to tackle difficult data sets. The fundamental concept of collaboration is that clustering algorithms operate locally but collaborate by exchanging information about the local structures found by each algorithm. This kind of collaborative learning can benefit a wide range of tasks, including multi-view clustering, clustering of distributed data with privacy constraints, multi-expert clustering, and multi-scale analysis. Within this context, the main difficulty of collaborative clustering is determining how to weight the influence of the different clustering methods so as to maximize the quality of the final results and minimize the risk of negative collaborations, where the results are worse after collaboration than before. In this paper, we study how the quality and diversity of the different collaborators, as well as the stability of the partitions, can influence the final results. We propose both a theoretical analysis based on mathematical optimization and a second study based on empirical results. Our findings show that, on the one hand, in the absence of a clear criterion to optimize, a low-diversity pool of solutions with high stability is the best option to ensure good performance. On the other hand, if there is a known criterion to maximize, it is best to rely on a higher-diversity pool of solutions with high quality on that criterion. While our approach focuses on entropy-based collaborative clustering, we believe that most of our results could be extended to other collaborative algorithms.
first_indexed 2024-12-10T07:56:12Z
format Article
id doaj.art-815adba183364ba593b6fc857b1678a6
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-12-10T07:56:12Z
publishDate 2019-09-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-815adba183364ba593b6fc857b1678a6; 2022-12-22T01:56:53Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2019-09-01; vol. 21, no. 10, article 951; DOI 10.3390/e21100951
Jérémie Sublime (ISEP, DaSSIP Team–LISITE, 10 rue de Vanves, 92130 Issy-Les-Moulineaux, France)
Guénaël Cabanes (University Paris 13, Sorbonne Paris Cité, LIPN-CNRS UMR 7030, 99 av. J-B Clément, 93430 Villetaneuse, France)
Basarab Matei (University Paris 13, Sorbonne Paris Cité, LIPN-CNRS UMR 7030, 99 av. J-B Clément, 93430 Villetaneuse, France)
title Study on the Influence of Diversity and Quality in Entropy Based Collaborative Clustering
topic collaborative clustering
clustering quality
entropy
diversity
url https://www.mdpi.com/1099-4300/21/10/951