A buffer-based online clustering for evolving data stream
Data stream clustering plays an important role in data stream mining for knowledge extraction. Numerous researchers have recently studied density-based clustering algorithms due to their capability to generate arbitrarily shaped clusters. However, most of the algorithms are either fully offline, hyb...
Main Authors: | Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier Ltd, 2019 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf |
_version_ | 1796993321990619136 |
---|---|
author | Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli |
author_sort | Islam, Md. Kamrul |
collection | UMP |
description | Data stream clustering plays an important role in data stream mining for knowledge extraction. Density-based clustering algorithms have recently been studied by numerous researchers because they can generate arbitrarily shaped clusters. However, most of these algorithms are either fully offline or hybrid online/offline, or they cannot handle evolving data streams. Recently, a fully online clustering algorithm for evolving data streams, called CEDAS, was proposed. However, like other density-based clustering algorithms, CEDAS requires the globally optimal micro-cluster radius to be predefined, which is a difficult task, and an erroneous choice degrades clustering performance. Moreover, the algorithm ignores temporarily irrelevant micro-clusters, which may become relevant in the future. In this study, we present a fully online density-based clustering algorithm called buffer-based online clustering for evolving data stream (BOCEDS). This algorithm recursively updates the micro-cluster radius to its local optimum. It also introduces a buffer for storing irrelevant micro-clusters and a fully online pruning method for extracting temporarily irrelevant micro-clusters from the buffer. In addition, BOCEDS proposes an online micro-cluster energy-updating function based on the spatial information of the data stream. Experimental results are compared with those of CEDAS and other hybrid online/offline density-based clustering algorithms, and BOCEDS outperforms the other clustering algorithms. The sensitivity of the clustering parameters is also measured. The proposed algorithm is then applied to real-world weather data streams to demonstrate its capability to detect changes in the data stream and discover arbitrarily shaped clusters. BOCEDS is available at https://sites.google.com/view/md-manjur-ahmed and https://sites.google.com/view/kamrul-just. |
first_indexed | 2024-03-06T12:32:21Z |
format | Article |
id | UMPir24676 |
institution | Universiti Malaysia Pahang |
language | English |
last_indexed | 2024-03-06T12:32:21Z |
publishDate | 2019 |
publisher | Elsevier Ltd |
record_format | dspace |
spelling | UMPir24676 2019-04-02T07:34:52Z http://umpir.ump.edu.my/id/eprint/24676/ A buffer-based online clustering for evolving data stream. Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli. QA75 Electronic computers. Computer science. Elsevier Ltd 2019 Article PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf Islam, Md. Kamrul and Ahmed, Md. Manjur and Kamal Z., Zamli (2019) A buffer-based online clustering for evolving data stream. Information Sciences, 489. pp. 113-135. ISSN 0020-0255. (Published) https://doi.org/10.1016/j.ins.2019.03.022 |
title | A buffer-based online clustering for evolving data stream |
topic | QA75 Electronic computers. Computer science |
url | http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf |
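
The description above mentions a buffer that holds temporarily irrelevant micro-clusters and an online pruning step. The Python sketch below illustrates that general idea only; it is not the authors' BOCEDS implementation. The class names, the linear energy decay, the `max_idle` pruning threshold, and the running-mean centre update are all assumptions made for illustration, and the radius is kept fixed because the record does not give the recursive radius-update rule.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist


@dataclass(eq=False)  # identity-based equality, so list.remove targets the exact object
class MicroCluster:
    center: tuple
    radius: float
    energy: float = 1.0
    count: int = 1
    idle: int = 0  # steps spent in the buffer (hypothetical bookkeeping)


class BufferedStreamClusterer:
    """Toy online clusterer with a buffer; all parameters are illustrative."""

    def __init__(self, radius=1.0, decay=0.25, max_idle=5):
        self.radius = radius      # fixed radius (the paper adapts it; not shown here)
        self.decay = decay        # assumed linear energy loss per arriving point
        self.max_idle = max_idle  # assumed number of steps a buffered cluster may wait
        self.active = []
        self.buffer = []

    @staticmethod
    def _nearest(pool, point):
        # Closest micro-cluster whose radius covers the point, or None.
        hits = [mc for mc in pool if dist(mc.center, point) <= mc.radius]
        return min(hits, key=lambda mc: dist(mc.center, point), default=None)

    def learn_one(self, point):
        point = tuple(point)
        for mc in self.active:
            mc.energy -= self.decay
        for mc in self.buffer:
            mc.idle += 1

        target = self._nearest(self.active, point)
        if target is None:
            # A buffered micro-cluster becomes relevant again when a point hits it.
            target = self._nearest(self.buffer, point)
            if target is not None:
                self.buffer.remove(target)
                target.idle = 0
                self.active.append(target)

        if target is None:
            self.active.append(MicroCluster(point, self.radius))
        else:
            target.count += 1
            target.energy = 1.0
            # Running mean of the points absorbed so far.
            target.center = tuple(
                c + (p - c) / target.count for c, p in zip(target.center, point)
            )

        # Exhausted micro-clusters move to the buffer instead of being deleted.
        self.buffer.extend(mc for mc in self.active if mc.energy <= 0)
        self.active = [mc for mc in self.active if mc.energy > 0]
        # Online pruning: drop buffered micro-clusters that waited too long.
        self.buffer = [mc for mc in self.buffer if mc.idle <= self.max_idle]
```

Feeding points one at a time through `learn_one` keeps the whole procedure online: a micro-cluster that stops receiving points moves to `buffer`, returns to `active` if a later point falls inside it, and is discarded only after `max_idle` further steps.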