Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks
Clustering is a widely used unsupervised learning technique across data mining and machine learning applications and finds frequent use in diverse fields ranging from astronomy, medical imaging, search and optimization, geology, geophysics, and sentiment analysis, to name a few. It is therefore impo...
Main Authors: | Adam Pickens, Saptarshi Sengupta |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-08-01 |
Series: | Machine Learning and Knowledge Extraction |
Subjects: | evolutionary neural networks; particle swarm optimization; k-means; fuzzy c-means; optimization |
Online Access: | https://www.mdpi.com/2504-4990/3/3/35 |
_version_ | 1797518423776821248 |
---|---|
author | Adam Pickens Saptarshi Sengupta |
collection | DOAJ |
description | Clustering is a widely used unsupervised learning technique in data mining and machine learning, with frequent use in fields as diverse as astronomy, medical imaging, search and optimization, geology, geophysics, and sentiment analysis. It is therefore important to verify the effectiveness of the clustering algorithm in question and to make reasonably strong arguments for the acceptance of the end results generated by the validity indices that measure the compactness and separability of clusters. This work explores the successes and limitations of two popular clustering mechanisms, K-Means and Fuzzy C-Means, by comparing their performance on publicly available benchmarking data sets that vary in data point distribution and number of attributes. The comparison is made especially from a computational point of view and incorporates techniques that alleviate some of the issues that plague these algorithms. Sensitivity to initialization conditions and stagnation at local minima are explored. Further, an implementation of a feedforward neural network that uses particle swarm optimization with a fully connected topology is introduced. The swarm serves as a guided random search technique for optimizing the network weights. The algorithms are studied and compared, and their applications are explored on that basis. The study aims to provide a handy reference for practitioners to both learn about and verify benchmarking results on commonly used real-world data sets, from both a supervised and an unsupervised point of view, before applying the algorithms to more tailored, complex problems. |
first_indexed | 2024-03-10T07:29:40Z |
format | Article |
id | doaj.art-1736fe9fa5034c2b91cd927ac142af6a |
institution | Directory Open Access Journal |
issn | 2504-4990 |
language | English |
last_indexed | 2024-03-10T07:29:40Z |
publishDate | 2021-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Machine Learning and Knowledge Extraction |
spelling | doaj.art-1736fe9fa5034c2b91cd927ac142af6a; 2023-11-22T13:58:16Z; eng; MDPI AG; Machine Learning and Knowledge Extraction; ISSN 2504-4990; 2021-08-01; vol. 3, no. 3, pp. 695-719; doi:10.3390/make3030035; Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks; Adam Pickens and Saptarshi Sengupta (both: Department of Computer Science and Information Systems, Murray State University, Murray, KY 42071, USA); https://www.mdpi.com/2504-4990/3/3/35; evolutionary neural networks, particle swarm optimization, k-means, fuzzy c-means, optimization |
title | Benchmarking Studies Aimed at Clustering and Classification Tasks Using K-Means, Fuzzy C-Means and Evolutionary Neural Networks |
topic | evolutionary neural networks particle swarm optimization k-means fuzzy c-means optimization |
url | https://www.mdpi.com/2504-4990/3/3/35 |