Fuzzy model-based sparse clustering with multivariate t-mixtures
Model-based clustering is an effective choice for modeling the distribution of a data set and uncovering its underlying structure through a mixture of probability distributions. Many extensions of model-based clustering algorithms have been proposed in the literature to obtain better results, but still its cha...
Main Authors: | Wajid Ali, Miin-Shen Yang, Mehboob Ali, Saif Ud-Din |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2023-12-01 |
Series: | Applied Artificial Intelligence |
Online Access: | http://dx.doi.org/10.1080/08839514.2023.2169299 |
_version_ | 1797684808751513600 |
---|---|
author | Wajid Ali; Miin-Shen Yang; Mehboob Ali; Saif Ud-Din |
author_sort | Wajid Ali |
collection | DOAJ |
description | Model-based clustering is an effective choice for modeling the distribution of a data set and uncovering its underlying structure through a mixture of probability distributions. Many extensions of model-based clustering algorithms have been proposed in the literature to obtain better results, but this remains a challenging and important research objective. Many of the proposed methods are based on the EM algorithm and aim to overcome its sensitivity to initialization. However, these methods give equal importance to all feature (variable) components of the data points and therefore cannot distinguish irrelevant feature components. In most cases, a data set contains some irrelevant features and outliers or noisy points, which degrade the performance of clustering algorithms. To overcome these issues, we propose a fuzzy model-based t-clustering algorithm that uses a mixture of t-distributions with an $L_1$ regularization for the identification and selection of better features. To demonstrate its novelty and usefulness, we apply our algorithm to artificial and real data sets. We further apply the proposed method to a soil data set, collected in collaboration with and with the assistance of the Environmental Laboratory, Karakoram International University (GB), from various locations in Gilgit-Baltistan, Pakistan. The comparison results validate the novelty and superiority of the proposed method on both the simulated and real data sets, as well as its effectiveness in addressing the weaknesses of existing methods. |
first_indexed | 2024-03-12T00:35:09Z |
format | Article |
id | doaj.art-6ce08386d7104a30975ccf9334706fe7 |
institution | Directory of Open Access Journals |
issn | 0883-9514, 1087-6545 |
language | English |
last_indexed | 2024-03-12T00:35:09Z |
publishDate | 2023-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Applied Artificial Intelligence |
spelling | doaj.art-6ce08386d7104a30975ccf9334706fe7; 2023-09-15T10:01:05Z; eng; Taylor & Francis Group; Applied Artificial Intelligence; ISSN 0883-9514, 1087-6545; 2023-12-01; vol. 37, no. 1; doi:10.1080/08839514.2023.2169299; article 2169299; Fuzzy model-based sparse clustering with multivariate t-mixtures; Wajid Ali (Karakoram International University), Miin-Shen Yang (Chung Yuan Christian University), Mehboob Ali (Chung Yuan Christian University), Saif Ud-Din (Karakoram International University); http://dx.doi.org/10.1080/08839514.2023.2169299 |
title | Fuzzy model-based sparse clustering with multivariate t-mixtures |
url | http://dx.doi.org/10.1080/08839514.2023.2169299 |
work_keys_str_mv | AT wajidali fuzzymodelbasedsparseclusteringwithmultivariatetmixtures AT miinshenyang fuzzymodelbasedsparseclusteringwithmultivariatetmixtures AT mehboobali fuzzymodelbasedsparseclusteringwithmultivariatetmixtures AT saifuddin fuzzymodelbasedsparseclusteringwithmultivariatetmixtures |
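
The description in this record names two ingredients: a mixture of multivariate t-distributions and an $L_1$ regularization for feature selection. As a rough illustration only, and not the authors' algorithm (the record does not give their update equations), the sketch below evaluates the standard multivariate t log-density and applies the usual soft-thresholding operator associated with an $L_1$ penalty. The function names, the penalty weight `lam`, and the toy inputs are assumptions made for this example.

```python
import math

import numpy as np


def mvt_logpdf(x, mu, sigma, nu):
    """Log-density of a d-dimensional Student t distribution at point x.

    mu is the location vector, sigma the scale matrix, nu the degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu).
    maha = float(diff @ np.linalg.solve(sigma, diff))
    _, logdet = np.linalg.slogdet(sigma)
    return (
        math.lgamma((nu + d) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * d * math.log(nu * math.pi)
        - 0.5 * logdet
        - 0.5 * (nu + d) * math.log1p(maha / nu)
    )


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam; entries within lam become 0."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


if __name__ == "__main__":
    # Hypothetical toy inputs: a 2-D point under a standard t with 4 d.o.f.
    print(mvt_logpdf([0.5, -0.2], np.zeros(2), np.eye(2), nu=4.0))
    # Hypothetical penalty weight: small entries are set exactly to zero.
    print(soft_threshold([3.0, -0.5, 1.0], lam=1.0))
```

The heavy tails of the t density are what make such mixtures less sensitive to outliers than Gaussian mixtures, and soft-thresholding is what lets an $L_1$ penalty zero out small coefficients; how the paper combines these with fuzzy memberships is not stated in this record.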