Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis

Fuzzy clustering has been broadly applied to classify data into <i>K</i> clusters by assigning each data point membership probabilities based on its closeness to <i>K</i> centroids. Such a function has been applied to characterizing the clusters associated with a statistical model such a...
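
The membership assignment mentioned in the summary above can be illustrated with a small sketch. It assumes the standard fuzzy c-means membership formula with fuzzifier `m`, and one common form of the fuzzy hypervolume (FHV) index named in this record's subject terms: the sum over clusters of the square root of the determinant of a membership-weighted covariance matrix. The function names, the toy data, the choice `m = 2`, and the restriction to 2-D points are all assumptions made for illustration; this is not the authors' implementation, and the FIT index from GSCA is not computed here.

```python
import math


def memberships(points, centroids, m=2.0):
    """Return one row of K membership degrees per point (each row sums to 1).

    Assumes the standard fuzzy c-means formula with fuzzifier m > 1.
    """
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if min(dists) == 0.0:
            # The point sits exactly on a centroid: split full membership
            # among the coinciding centroids to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([
                1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                for d_k in dists
            ])
    return rows


def fuzzy_hypervolume(points, centroids, u, m=2.0):
    """Sum over clusters of sqrt(det(fuzzy covariance)), for 2-D points only."""
    total = 0.0
    for k, (cx, cy) in enumerate(centroids):
        # Weight each point by its membership in cluster k raised to m
        # (an assumption; some formulations use other weights).
        weights = [row[k] ** m for row in u]
        norm = sum(weights)
        sxx = sum(w * (x - cx) ** 2 for w, (x, y) in zip(weights, points)) / norm
        syy = sum(w * (y - cy) ** 2 for w, (x, y) in zip(weights, points)) / norm
        sxy = sum(w * (x - cx) * (y - cy) for w, (x, y) in zip(weights, points)) / norm
        det = sxx * syy - sxy ** 2
        total += math.sqrt(max(det, 0.0))  # guard against tiny negative round-off
    return total


# Hypothetical toy data: two loose groups of 2-D points and two fixed centroids.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (4.0, 4.0), (4.2, 3.9), (3.9, 4.3)]
centroids = [(0.1, 0.13), (4.03, 4.07)]

u = memberships(points, centroids)
fhv = fuzzy_hypervolume(points, centroids, u)
print([round(v, 3) for v in u[0]], round(fhv, 4))
```

A smaller hypervolume is usually read as a sign of more compact clusters, so comparing it across candidate values of <i>K</i> is one way to approach the cluster validity problem. How the article combines FHV with FIT is not stated in this record, so that step is left out.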

Bibliographic Details
Main Authors:	Ji Hoon Ryoo, Seohee Park, Seongeun Kim, Hyun Suk Ryoo
Format:	Article
Language:	English
Published:	MDPI AG 2020-09-01
Series:	Symmetry
Subjects:	cluster validity problem FIT-FHV method fuzzy clustering fuzzy hypervolume validity index generalized structured component analysis structural equation modeling
Online Access:	https://www.mdpi.com/2073-8994/12/9/1514

author	Ji Hoon Ryoo Seohee Park Seongeun Kim Hyun Suk Ryoo
collection	DOAJ
description	Fuzzy clustering has been broadly applied to classify data into <i>K</i> clusters by assigning each data point membership probabilities based on its closeness to <i>K</i> centroids. Such a function has been applied to characterizing the clusters associated with a statistical model such as structural equation modeling. The characteristics identified by the statistical model further define the clusters as heterogeneous groups selected from a population. Recently, such a statistical model has been formulated as fuzzy clusterwise generalized structured component analysis (fuzzy clusterwise GSCA). As in fuzzy clustering, the clusters are enumerated to infer the population and its parameters within the fuzzy clusterwise GSCA. However, the identification of clusters in fuzzy clustering is a difficult task because of the data-dependence of classification indexes, which is known as the cluster validity problem. We examined the cluster validity problem within the fuzzy clusterwise GSCA framework and proposed a new criterion for selecting the optimal number of clusters using both the fit indexes of GSCA and the fuzzy validity indexes of fuzzy clustering. The criterion, named the FIT-FHV method, combines a fit index, FIT, from GSCA with a cluster validation measure, FHV, from fuzzy clustering; it performed better than any other index used in fuzzy clusterwise GSCA.
first_indexed	2024-03-10T16:20:48Z
format	Article
id	doaj.art-a8efedea7d844d02a59a05123564bdde
institution	Directory Open Access Journal
issn	2073-8994
language	English
last_indexed	2024-03-10T16:20:48Z
publishDate	2020-09-01
publisher	MDPI AG
record_format	Article
series	Symmetry
spelling	doaj.art-a8efedea7d844d02a59a05123564bdde; 2023-11-20T13:42:24Z; eng; MDPI AG; Symmetry; 2073-8994; 2020-09-01; vol. 12, no. 9, art. 1514; doi:10.3390/sym12091514; Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis; Ji Hoon Ryoo (0), Seohee Park (1), Seongeun Kim (2), Hyun Suk Ryoo (3); (0) Department of Education, College of Educational Sciences, Yonsei University, Seoul 03722, Korea; (1) Department of Educational Measurement and Statistics, College of Education, University of Iowa, Iowa, IA 52242, USA; (2) Department of Educational Research Methodology, School of Education, University of North Carolina at Greensboro, Greensboro, NC 27412, USA; (3) Department of Computer Science, College of Arts and Science, University of Virginia, Charlottesville, VA 22904, USA
title	Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis
topic	cluster validity problem FIT-FHV method fuzzy clustering fuzzy hypervolume validity index generalized structured component analysis structural equation modeling
url	https://www.mdpi.com/2073-8994/12/9/1514

Similar Items