F1-ECAC: Enhanced Evolutionary Clustering Using an Ensemble of Supervised Classifiers

Clustering is an unsupervised learning technique used in data mining to find groups whose objects are more similar to one another than to objects in other groups. However, the absence of a priori knowledge of the optimal clustering criterion, and the strong bias of traditional algorithms towards clusters with a...

Bibliographic Details
Main Authors: Benjamin M. Sainz-Tinajero, Andres E. Gutierrez-Rodriguez, Hector G. Ceballos, Francisco J. Cantu-Ortiz
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Unsupervised learning; clustering; evolutionary clustering; optimization; classifier ensembles
Online Access: https://ieeexplore.ieee.org/document/9551203/
_version_ 1818925499458519040
author Benjamin M. Sainz-Tinajero
Andres E. Gutierrez-Rodriguez
Hector G. Ceballos
Francisco J. Cantu-Ortiz
author_sort Benjamin M. Sainz-Tinajero
collection DOAJ
description Clustering is an unsupervised learning technique used in data mining to find groups whose objects are more similar to one another than to objects in other groups. However, the absence of a priori knowledge of the optimal clustering criterion, and the strong bias of traditional algorithms towards clusters of a specific shape, size, or density, raise the need for more flexible ways to find the underlying structure of the data. As a solution, clustering has been modeled as an optimization problem, using meta-heuristics to generate a search space that favors groups satisfying any desired criterion. F1-ECAC is an evolutionary clustering algorithm whose objective function is designed as a supervised learning problem: it evaluates the quality of a partition in terms of its generalization degree, that is, its capability to train an ensemble of classifiers. The algorithm is named after its previous version, ECAC (Evolutionary Clustering Algorithm Using Supervised Classifiers), and its main point of difference, the use of the F1-score instead of the Area Under the Curve metric in the objective function. F1-ECAC shows a significant increase in performance and efficiency over ECAC and is highly competitive with state-of-the-art clustering algorithms. The results demonstrate F1-ECAC’s usability across a wide variety of problems, owing to its innovative clustering criterion.
first_indexed 2024-12-20T02:42:12Z
format Article
id doaj.art-13a0f1c0f9214d4e88d36043f4dc3bb0
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T02:42:12Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-13a0f1c0f9214d4e88d36043f4dc3bb0
2022-12-21T19:56:16Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
Volume 9, pages 134192-134207
DOI: 10.1109/ACCESS.2021.3116092
IEEE Xplore document: 9551203
F1-ECAC: Enhanced Evolutionary Clustering Using an Ensemble of Supervised Classifiers
Benjamin M. Sainz-Tinajero (https://orcid.org/0000-0002-1614-5066), Tecnologico de Monterrey, School of Engineering and Science, Atizapan de Zaragoza, Estado de Mexico, Mexico
Andres E. Gutierrez-Rodriguez, Tecnologico de Monterrey, School of Engineering and Science, Toluca de Lerdo, Estado de Mexico, Mexico
Hector G. Ceballos (https://orcid.org/0000-0002-2460-3442), Tecnologico de Monterrey, School of Engineering and Science, Monterrey, Nuevo Leon, Mexico
Francisco J. Cantu-Ortiz, Tecnologico de Monterrey, School of Engineering and Science, Monterrey, Nuevo Leon, Mexico
https://ieeexplore.ieee.org/document/9551203/
Unsupervised learning
clustering
evolutionary clustering
optimization
classifier ensembles
title F1-ECAC: Enhanced Evolutionary Clustering Using an Ensemble of Supervised Classifiers
title_sort f1 ecac enhanced evolutionary clustering using an ensemble of supervised classifiers
topic Unsupervised learning
clustering
evolutionary clustering
optimization
classifier ensembles
url https://ieeexplore.ieee.org/document/9551203/
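
Illustrative note on the description field: the abstract says a partition is scored by how well it can train an ensemble of classifiers, measured with the F1-score. The sketch below is one possible reading of that idea, not the authors' implementation. The record does not state which classifiers, validation scheme, or aggregation rule F1-ECAC uses, so the three toy classifiers (1-NN, 3-NN, nearest centroid), the 3-fold split, the macro-averaged F1, and the mean over classifiers are all assumptions, as are the names `partition_fitness`, `knn`, `nearest_centroid`, and `macro_f1`.

```python
import math
from collections import Counter

def nearest_centroid(train_X, train_y, x):
    # Hypothetical ensemble member: predict the label of the closest class centroid.
    sums = {}
    for p, c in zip(train_X, train_y):
        s = sums.setdefault(c, [[0.0] * len(p), 0])
        s[0] = [a + b for a, b in zip(s[0], p)]
        s[1] += 1
    return min(sums, key=lambda c: math.dist(x, [v / sums[c][1] for v in sums[c][0]]))

def knn(k):
    # Hypothetical ensemble member: majority label among the k nearest training points.
    def predict(train_X, train_y, x):
        nearest = sorted(zip(train_X, train_y), key=lambda t: math.dist(x, t[0]))[:k]
        return Counter(c for _, c in nearest).most_common(1)[0][0]
    return predict

def macro_f1(y_true, y_pred):
    # Unweighted mean of the per-class F1 scores.
    scores = []
    for c in set(y_true):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)

def partition_fitness(X, labels, classifiers, folds=3):
    # Treat the candidate cluster labels as class labels, cross-validate each
    # classifier on them, and average the resulting macro F1 scores (assumed rule).
    total = 0.0
    for clf in classifiers:
        preds = [None] * len(X)
        for f in range(folds):
            train = [i for i in range(len(X)) if i % folds != f]
            train_X = [X[i] for i in train]
            train_y = [labels[i] for i in train]
            for i in range(f, len(X), folds):
                preds[i] = clf(train_X, train_y, X[i])
        total += macro_f1(labels, preds)
    return total / len(classifiers)

# Toy data: two well-separated groups of six points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (1, 0.5),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (11, 10.5)]
ensemble = [knn(1), knn(3), nearest_centroid]

good_partition = [0] * 6 + [1] * 6   # follows the two groups
bad_partition = [0, 1] * 6           # alternates labels regardless of position

good_score = partition_fitness(X, good_partition, ensemble)
bad_score = partition_fitness(X, bad_partition, ensemble)
print(good_score, bad_score)
```

In an evolutionary search, a score like `partition_fitness` would rank candidate partitions: labelings that classifiers can learn and reproduce on held-out points score higher than labelings that cut across the data's structure.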