The least sample size essential for detecting changes in clustering solutions of streaming datasets.
The clustering analysis approach treats multivariate data tuples as objects and groups them into clusters based on their similarities or dissimilarities within the dataset. However, in the modern world, a significant volume of data is continuously generated from diverse sources over time. In these dynam...
Main Authors: | Muhammad Atif, Muhammad Farooq, Mohammad Abiad, Muhammad Shafiq |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2024-01-01 |
Series: | PLoS ONE |
Online Access: | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0297355&type=printable |
_version_ | 1797296168113274880 |
---|---|
author | Muhammad Atif; Muhammad Farooq; Mohammad Abiad; Muhammad Shafiq |
collection | DOAJ |
description | The clustering analysis approach treats multivariate data tuples as objects and groups them into clusters based on their similarities or dissimilarities within the dataset. However, in the modern world, a significant volume of data is continuously generated from diverse sources over time. In these dynamic scenarios, the data is not static but continually evolves. Consequently, the interesting patterns and inherent subgroups within the datasets also change and develop over time. Researchers have paid special attention to monitoring changes in cluster solutions of evolving streams, and several algorithms have been proposed in the literature for this purpose. However, to date, no study has examined the effect of variability in cluster sizes on the evolution of cluster solutions. Moreover, no guidance is available on determining the impact of cluster sizes on the type of changes they experience in the streams. In the present simulation study using artificial datasets, the evolution of clusters is examined with respect to the variability in cluster sizes. The findings are significant because tracing and monitoring changes in clustering solutions has a wide range of applications across fields of research. This study determines the minimum sample size required in the clustering of time-stamped datasets. |
first_indexed | 2024-03-07T21:59:29Z |
format | Article |
id | doaj.art-d53d1c50f188481bad686970999b8fdc |
institution | Directory of Open Access Journals |
issn | 1932-6203 |
language | English |
last_indexed | 2024-03-07T21:59:29Z |
publishDate | 2024-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj.art-d53d1c50f188481bad686970999b8fdc; 2024-02-24T05:31:44Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2024-01-01; 19; 2; e0297355; 10.1371/journal.pone.0297355; The least sample size essential for detecting changes in clustering solutions of streaming datasets.; Muhammad Atif; Muhammad Farooq; Mohammad Abiad; Muhammad Shafiq; https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0297355&type=printable |
title | The least sample size essential for detecting changes in clustering solutions of streaming datasets. |
url | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0297355&type=printable |