Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering

Constraint-based clustering utilizes pairwise constraints to improve clustering performance. In this paper, we propose a novel formulation algorithm to generate more informative pairwise constraints from limited queries for the constraint-based clustering. Our method consists of two phases: pre-clus...

Bibliographic Details
Main Authors: Guoxiang Zhong, Xiuqin Deng, Shengbing Xu
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8740960/
author Guoxiang Zhong
Xiuqin Deng
Shengbing Xu
collection DOAJ
description Constraint-based clustering uses pairwise constraints to improve clustering performance. In this paper, we propose a novel formulation algorithm that generates more informative pairwise constraints from a limited number of queries for constraint-based clustering. Our method consists of two phases: pre-clustering and marking. The pre-clustering phase applies fuzzy c-means clustering (FCM) to generate cluster knowledge, which consists of the membership degrees and the cluster centers. In the marking phase, we first define the weak sample as a sample with larger uncertainty, expressed by the entropy of its membership degrees. We then consider the strong sample, which carries less uncertainty and should lie closest to its cluster center. Finally, taking weak samples in descending order of entropy, we pair them with strong samples to form informative pairs and seek answers using the second minimal symmetric relative entropy priority principle, which leads to more efficient queries. Using pairwise constraint k-means clustering (PCKM) as the underlying constraint-based clustering algorithm, we conduct experiments on several datasets to verify the improvement achieved by our method.
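The quantities named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of the nearest sample per center as the strong sample, and the plain ascending ranking by divergence are all assumptions made for illustration. In particular, the "second minimal symmetric relative entropy priority principle" is not specified in this record, so the sketch only ranks candidate strong samples and leaves the query order open. Cluster centers are assumed to come from an earlier FCM run.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) and division by zero

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership formula for fixed centers; each row sums to 1."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + EPS
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def membership_entropy(U):
    """Shannon entropy of each sample's membership vector."""
    return -np.sum(U * np.log(U + EPS), axis=1)

def weak_samples(U, n_queries):
    """Indices of the n_queries highest-entropy samples, most uncertain first."""
    return np.argsort(-membership_entropy(U))[:n_queries]

def strong_samples(X, centers):
    """Assumption: one strong sample per cluster, the sample nearest its center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=0)

def symmetric_relative_entropy(p, q):
    """Symmetrised KL divergence: KL(p||q) + KL(q||p)."""
    p, q = p + EPS, q + EPS
    return float(np.sum((p - q) * np.log(p / q)))

def rank_strong_for_weak(U, weak_idx, strong_idx):
    """Strong samples ordered by ascending divergence from one weak sample."""
    div = [symmetric_relative_entropy(U[weak_idx], U[s]) for s in strong_idx]
    return np.asarray(strong_idx)[np.argsort(div)]
```

Given a data matrix `X` and FCM centers, `weak_samples(fcm_memberships(X, centers), n)` lists the points to query first, and `rank_strong_for_weak` orders the candidate partners for each of them; the oracle answers (must-link or cannot-link) would then be passed to a constrained clusterer such as PCKM.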
first_indexed 2024-12-10T11:13:47Z
format Article
id doaj.art-9f41f0acbcc74b0a8ca0ae45b68bfd61
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-10T11:13:47Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-9f41f0acbcc74b0a8ca0ae45b68bfd61
2022-12-22T01:51:18Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
Volume 7, pages 81983-81993
DOI: 10.1109/ACCESS.2019.2923659
IEEE document 8740960
Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering
Guoxiang Zhong (https://orcid.org/0000-0002-3998-7282), Xiuqin Deng, Shengbing Xu
School of Applied Mathematics, Guangdong University of Technology, Guangzhou, China (all three authors)
https://ieeexplore.ieee.org/document/8740960/
Constraint-based clustering; pairwise constraint; weak sample; strong sample; symmetric relative entropy
title Active Informative Pairwise Constraint Formulation Algorithm for Constraint-Based Clustering
topic Constraint-based clustering
pairwise constraint
weak sample
strong sample
symmetric relative entropy
url https://ieeexplore.ieee.org/document/8740960/