Fuzzy model-based sparse clustering with multivariate t-mixtures

Model-based clustering is an optimal choice for modeling the distribution of data sets and finding their real structure using a mixture of probability distributions. Many extensions of model-based clustering algorithms are available in the literature for obtaining better results, but improving them remains a cha...

Bibliographic Details
Main Authors: Wajid Ali, Miin-Shen Yang, Mehboob Ali, Saif Ud-Din
Format: Article
Language: English
Published: Taylor & Francis Group 2023-12-01
Series: Applied Artificial Intelligence
Online Access: http://dx.doi.org/10.1080/08839514.2023.2169299
collection DOAJ
description Model-based clustering is an optimal choice for modeling the distribution of data sets and finding their real structure using a mixture of probability distributions. Many extensions of model-based clustering algorithms are available in the literature for obtaining better results, but improving them remains a challenging and important research objective. In model-based clustering, many proposed methods are based on the EM algorithm and aim to overcome its sensitivity to initialization. However, these methods treat all feature (variable) components of the data points as equally important, and so cannot distinguish irrelevant feature components. In most cases, a data set contains some irrelevant features and outliers or noisy points, which degrade the performance of clustering algorithms. To overcome these issues, we propose a fuzzy model-based t-clustering algorithm using a mixture of t-distributions with an $L_1$ regularization for the identification and selection of better features. To demonstrate its novelty and usefulness, we apply our algorithm to artificial and real data sets. We further apply the proposed method to a soil data set, which was collected with the collaboration and assistance of the Environmental Laboratory, Karakoram International University (GB), from various locations in Gilgit Baltistan, Pakistan. The comparison results validate the novelty and superiority of the proposed method on both the simulated and real data sets, as well as its effectiveness in addressing the weaknesses of existing methods.
first_indexed 2024-03-12T00:35:09Z
format Article
id doaj.art-6ce08386d7104a30975ccf9334706fe7
institution Directory Open Access Journal
issn 0883-9514
1087-6545
language English
last_indexed 2024-03-12T00:35:09Z
publishDate 2023-12-01
publisher Taylor & Francis Group
record_format Article
series Applied Artificial Intelligence
spelling doaj.art-6ce08386d7104a30975ccf9334706fe7; 2023-09-15T10:01:05Z; eng; Taylor & Francis Group; Applied Artificial Intelligence; 0883-9514; 1087-6545; 2023-12-01; 37; 1; 10.1080/08839514.2023.2169299; 2169299; Fuzzy model-based sparse clustering with multivariate t-mixtures; Wajid Ali (Karakoram International University); Miin-Shen Yang (Chung Yuan Christian University); Mehboob Ali (Chung Yuan Christian University); Saif Ud-Din (Karakoram International University); http://dx.doi.org/10.1080/08839514.2023.2169299