Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering
Constraint-based clustering uses pairwise constraints to improve clustering performance. In this paper, we propose a novel formulation algorithm that generates more informative pairwise constraints from a limited number of queries for constraint-based clustering. Our method consists of two phases: pre-clus...
Main Authors: | Guoxiang Zhong, Xiuqin Deng, Shengbing Xu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
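Subjects: | Constraint-based clustering; pairwise constraint; weak sample; strong sample; symmetric relative entropy |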
Online Access: | https://ieeexplore.ieee.org/document/8740960/ |
_version_ | 1818479666614239232 |
---|---|
author | Guoxiang Zhong, Xiuqin Deng, Shengbing Xu |
collection | DOAJ |
description | Constraint-based clustering uses pairwise constraints to improve clustering performance. In this paper, we propose a novel formulation algorithm that generates more informative pairwise constraints from a limited number of queries for constraint-based clustering. Our method consists of two phases: pre-clustering and marking. The pre-clustering phase uses fuzzy c-means clustering (FCM) to generate cluster knowledge, which consists of the membership degrees and the cluster centers. In the marking phase, we first define a weak sample as one with larger uncertainty, expressed by the entropy of its membership degrees. We then define a strong sample as one that contains less uncertainty and should be closest to its cluster center. Finally, taking weak samples in descending order of entropy, we form informative pairs with strong samples and query their answers using the second minimal symmetric relative entropy priority principle, which leads to more efficient queries. Using pairwise constraint k-means clustering (PCKM) as the underlying constraint-based clustering algorithm, we conduct experiments on several datasets to verify the improvement achieved by our method. |
first_indexed | 2024-12-10T11:13:47Z |
format | Article |
id | doaj.art-9f41f0acbcc74b0a8ca0ae45b68bfd61 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-10T11:13:47Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-9f41f0acbcc74b0a8ca0ae45b68bfd61; 2022-12-22T01:51:18Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 81983-81993; DOI 10.1109/ACCESS.2019.2923659; IEEE document 8740960; Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering; Guoxiang Zhong (https://orcid.org/0000-0002-3998-7282), Xiuqin Deng, Shengbing Xu (all: School of Apply Mathematics, Guangdong University of Technology, Guangzhou, China); https://ieeexplore.ieee.org/document/8740960/; Constraint-based clustering, pairwise constraint, weak sample, strong sample, symmetric relative entropy |
title | Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering |
topic | Constraint-based clustering; pairwise constraint; weak sample; strong sample; symmetric relative entropy |
url | https://ieeexplore.ieee.org/document/8740960/ |
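
The marking phase summarized in the description can be illustrated with a small sketch. The Python below is not the authors' implementation. It assumes a hypothetical membership matrix `U` of the kind FCM would produce (one row per sample, rows summing to 1), and the helper names `membership_entropy`, `symmetric_relative_entropy`, `rank_weak_samples`, `pick_strong_samples`, and `rank_partners` are invented for illustration. Because this record gives no cluster centers, the sketch approximates a strong sample as the lowest-entropy sample assigned to each cluster. The "second minimal symmetric relative entropy priority principle" is not specified here, so the sketch only orders candidate strong samples by symmetric relative entropy and does not apply that rule.

```python
import math


def membership_entropy(u):
    """Shannon entropy of one sample's fuzzy membership vector."""
    return -sum(p * math.log(p) for p in u if p > 0.0)


def symmetric_relative_entropy(p, q, eps=1e-12):
    """Symmetric KL divergence, KL(p||q) + KL(q||p); eps guards log(0)."""
    total = 0.0
    for a, b in zip(p, q):
        a, b = max(a, eps), max(b, eps)
        total += (a - b) * math.log(a / b)
    return total


def rank_weak_samples(U, n_weak):
    """Indices of the n_weak most uncertain samples, highest entropy first."""
    order = sorted(range(len(U)),
                   key=lambda i: membership_entropy(U[i]),
                   reverse=True)
    return order[:n_weak]


def pick_strong_samples(U):
    """Assumed stand-in for strong samples: per cluster, the assigned
    sample with the lowest membership entropy."""
    strong = {}
    for i, u in enumerate(U):
        k = max(range(len(u)), key=lambda j: u[j])  # hard assignment
        if k not in strong or membership_entropy(u) < membership_entropy(U[strong[k]]):
            strong[k] = i
    return strong


def rank_partners(U, weak, strong_indices):
    """Order strong samples by symmetric relative entropy to a weak sample."""
    return sorted(strong_indices,
                  key=lambda s: symmetric_relative_entropy(U[weak], U[s]))


# Hypothetical FCM memberships for five samples and three clusters.
U = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.40, 0.35, 0.25],
    [0.50, 0.45, 0.05],
]

strong = pick_strong_samples(U)
for w in rank_weak_samples(U, 2):
    print("weak sample", w, "candidate partners", rank_partners(U, w, sorted(strong.values())))
```

Each (weak, strong) pair produced this way would be sent to the oracle as a must-link or cannot-link query; how the answers feed into PCKM is outside the scope of this sketch.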