Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis
Fuzzy clustering has been broadly applied to classify data into <i>K</i> clusters by assigning each data point membership probabilities according to its closeness to <i>K</i> centroids. This approach has been applied to characterizing clusters associated with a statistical model such a...
Main Authors: | Ji Hoon Ryoo, Seohee Park, Seongeun Kim, Hyun Suk Ryoo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-09-01 |
Series: | Symmetry |
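Subjects: | cluster validity problem; FIT-FHV method; fuzzy clustering; fuzzy hypervolume validity index; generalized structured component analysis; structural equation modeling |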
Online Access: | https://www.mdpi.com/2073-8994/12/9/1514 |
_version_ | 1797553734621855744 |
---|---|
author | Ji Hoon Ryoo; Seohee Park; Seongeun Kim; Hyun Suk Ryoo |
author_sort | Ji Hoon Ryoo |
collection | DOAJ |
description | Fuzzy clustering has been broadly applied to classify data into <i>K</i> clusters by assigning each data point membership probabilities according to its closeness to <i>K</i> centroids. This approach has been applied to characterizing clusters associated with a statistical model such as structural equation modeling. The characteristics identified by the statistical model further define the clusters as heterogeneous groups selected from a population. Recently, such a statistical model has been formulated as fuzzy clusterwise generalized structured component analysis (fuzzy clusterwise GSCA). As in fuzzy clustering, the clusters are enumerated to infer the population and its parameters within the fuzzy clusterwise GSCA. However, identifying the clusters in fuzzy clustering is difficult because classification indexes are data-dependent, which is known as the cluster validity problem. We examined the cluster validity problem within the fuzzy clusterwise GSCA framework and proposed a new criterion for selecting the optimal number of clusters using both fit indexes of the GSCA and the fuzzy validity indexes in fuzzy clustering. The criterion, named the FIT-FHV method, combines a fit index, FIT, from GSCA with a cluster validation measure, FHV, from fuzzy clustering, and performed better than any other index used in fuzzy clusterwise GSCA. |
first_indexed | 2024-03-10T16:20:48Z |
format | Article |
id | doaj.art-a8efedea7d844d02a59a05123564bdde |
institution | Directory Open Access Journal |
issn | 2073-8994 |
language | English |
last_indexed | 2024-03-10T16:20:48Z |
publishDate | 2020-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Symmetry |
spelling | doaj.art-a8efedea7d844d02a59a05123564bdde; 2023-11-20T13:42:24Z; eng; MDPI AG; Symmetry; 2073-8994; 2020-09-01; vol. 12, no. 9, art. 1514; 10.3390/sym12091514; Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis; Ji Hoon Ryoo (Department of Education, College of Educational Sciences, Yonsei University, Seoul 03722, Korea); Seohee Park (Department of Educational Measurement and Statistics, College of Education, University of Iowa, Iowa, IA 52242, USA); Seongeun Kim (Department of Educational Research Methodology, School of Education, University of North Carolina at Greensboro, Greensboro, NC 27412, USA); Hyun Suk Ryoo (Department of Computer Science, College of Arts and Science, University of Virginia, Charlottesville, VA 22904, USA); https://www.mdpi.com/2073-8994/12/9/1514; cluster validity problem; FIT-FHV method; fuzzy clustering; fuzzy hypervolume validity index; generalized structured component analysis; structural equation modeling |
title | Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis |
topic | cluster validity problem; FIT-FHV method; fuzzy clustering; fuzzy hypervolume validity index; generalized structured component analysis; structural equation modeling |
url | https://www.mdpi.com/2073-8994/12/9/1514 |
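
The description field above mentions membership probabilities relative to <i>K</i> centroids and a fuzzy hypervolume validity index (FHV). As a rough illustration only, the sketch below computes standard fuzzy c-means memberships and a fuzzy hypervolume for fixed centroids. It is not the article's FIT-FHV procedure: the GSCA fit index FIT is not computed, and the function names, the fuzzifier `m = 2`, and the use of `u**m` as covariance weights are assumptions made for this example.

```python
import numpy as np


def fcm_memberships(X, centroids, m=2.0):
    """Fuzzy c-means memberships, shape (n_points, K); each row sums to 1.

    Hypothetical helper: u_ik is proportional to d_ik ** (-2 / (m - 1)),
    where d_ik is the Euclidean distance from point i to centroid k.
    """
    # Squared Euclidean distances between every point and every centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against a point sitting on a centroid
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_hypervolume(X, centroids, U, m=2.0):
    """Sum over clusters of sqrt(det(fuzzy covariance matrix)).

    Assumption: memberships raised to the power m are used as weights,
    a common convention; smaller values indicate more compact clusters.
    """
    total = 0.0
    for k in range(centroids.shape[0]):
        w = U[:, k] ** m
        diff = X - centroids[k]
        cov = (w[:, None] * diff).T @ diff / w.sum()
        total += np.sqrt(np.linalg.det(cov))
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic, well-separated 2-D groups (illustrative data only).
    X = np.vstack([
        rng.normal(loc=(-3.0, 0.0), scale=0.5, size=(50, 2)),
        rng.normal(loc=(3.0, 0.0), scale=0.5, size=(50, 2)),
    ])
    centroids = np.array([[-3.0, 0.0], [3.0, 0.0]])
    U = fcm_memberships(X, centroids)
    print("fuzzy hypervolume:", fuzzy_hypervolume(X, centroids, U))
```

In a model-selection setting, a quantity like this would be evaluated for several candidate values of <i>K</i> and compared alongside a model fit measure; how the article combines the two is not reproduced here.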