Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks

Clustering is a widely used unsupervised learning technique across data mining and machine learning applications and finds frequent use in diverse fields ranging from astronomy, medical imaging, search and optimization, geology, geophysics, and sentiment analysis, to name a few. It is therefore impo...

Bibliographic Details
Main Authors: Adam Pickens, Saptarshi Sengupta
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Machine Learning and Knowledge Extraction
Subjects: evolutionary neural networks; particle swarm optimization; k-means; fuzzy c-means; optimization
Online Access: https://www.mdpi.com/2504-4990/3/3/35
author Adam Pickens
Saptarshi Sengupta
author_facet Adam Pickens
Saptarshi Sengupta
author_sort Adam Pickens
collection DOAJ
description Clustering is a widely used unsupervised learning technique across data mining and machine learning applications and finds frequent use in diverse fields such as astronomy, medical imaging, search and optimization, geology, geophysics, and sentiment analysis. It is therefore important to verify the effectiveness of the clustering algorithm in question and to make reasonably strong arguments for the acceptance of the end results generated by the validity indices that measure the compactness and separability of clusters. This work explores the successes and limitations of two popular clustering mechanisms, k-means and fuzzy c-means, by comparing their performance over publicly available benchmarking data sets that vary in data point distribution and number of attributes. The comparison is made especially from a computational point of view and incorporates techniques that alleviate some of the issues that plague these algorithms. Sensitivity to initialization conditions and stagnation in local minima are explored. Further, a feedforward neural network is implemented whose weights are optimized by particle swarm optimization with a fully connected topology, which serves as a guided random search technique for weight optimization. The algorithms are studied and compared, and their applications are explored on that basis. The study aims to provide a handy reference for practitioners to both learn about and verify benchmarking results on commonly used real-world data sets from both a supervised and unsupervised point of view before application in more tailored, complex problems.
first_indexed 2024-03-10T07:29:40Z
format Article
id doaj.art-1736fe9fa5034c2b91cd927ac142af6a
institution Directory Open Access Journal
issn 2504-4990
language English
last_indexed 2024-03-10T07:29:40Z
publishDate 2021-08-01
publisher MDPI AG
record_format Article
series Machine Learning and Knowledge Extraction
spelling doaj.art-1736fe9fa5034c2b91cd927ac142af6a
2023-11-22T13:58:16Z
eng
MDPI AG
Machine Learning and Knowledge Extraction
2504-4990
2021-08-01
Vol. 3, No. 3, pp. 695-719
10.3390/make3030035
Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks
Adam Pickens (0)
Saptarshi Sengupta (1)
(0, 1) Department of Computer Science and Information Systems, Murray State University, Murray, KY 42071, USA
https://www.mdpi.com/2504-4990/3/3/35
evolutionary neural networks
particle swarm optimization
k-means
fuzzy c-means
optimization
title Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks
topic evolutionary neural networks
particle swarm optimization
k-means
fuzzy c-means
optimization
url https://www.mdpi.com/2504-4990/3/3/35