Scalable Clustering Algorithms for Big Data: A Review
Clustering algorithms have become one of the most critical research areas in multiple domains, especially data mining. However, with the massive growth of big data applications in the cloud world, these applications face many challenges and difficulties. Since Big Data refers to an enormous amount o...
Main Authors: | Mahmoud A. Mahdi, Khalid M. Hosny, Ibrahim Elhenawy |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
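Subjects: | Clustering, unsupervised learning, traditional clustering, parallel clustering, stream clustering, high dimensional data |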
Online Access: | https://ieeexplore.ieee.org/document/9440980/ |
_version_ | 1828406819876241408 |
---|---|
author | Mahmoud A. Mahdi, Khalid M. Hosny, Ibrahim Elhenawy |
author_facet | Mahmoud A. Mahdi, Khalid M. Hosny, Ibrahim Elhenawy |
author_sort | Mahmoud A. Mahdi |
collection | DOAJ |
description | Clustering algorithms have become one of the most critical research areas in multiple domains, especially data mining. However, with the massive growth of big data applications in the cloud world, these applications face many challenges and difficulties. Since Big Data refers to an enormous amount of data, most traditional clustering algorithms come with high computational costs. Hence, the research question is how to handle this volume of data and get accurate results at a critical time. Despite ongoing research work to develop different algorithms to facilitate complex clustering processes, there are still many difficulties that arise while dealing with a large volume of data. In this paper, we review the most relevant clustering algorithms in a categorized manner, provide a comparison of clustering methods for large-scale data and explain the overall challenges based on clustering type. The key idea of the paper is to highlight the main advantages and disadvantages of clustering algorithms for dealing with big data in a scalable approach behind the different other features. |
first_indexed | 2024-12-10T11:17:17Z |
format | Article |
id | doaj.art-4e17569c7c154a18981cd1bb853c0ade |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-10T11:17:17Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-4e17569c7c154a18981cd1bb853c0ade; 2022-12-22T01:51:06Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 80015; 80027; 10.1109/ACCESS.2021.3084057; 9440980; Scalable Clustering Algorithms for Big Data: A Review; Mahmoud A. Mahdi; 0; https://orcid.org/0000-0002-7810-7006; Khalid M. Hosny; 1; https://orcid.org/0000-0001-8065-8977; Ibrahim Elhenawy; 2; https://orcid.org/0000-0001-7630-1983; Faculty of Computers and Information, Zagazig University, Zagazig, Egypt; Faculty of Computers and Information, Zagazig University, Zagazig, Egypt; Faculty of Computers and Information, Zagazig University, Zagazig, Egypt; Clustering algorithms have become one of the most critical research areas in multiple domains, especially data mining. However, with the massive growth of big data applications in the cloud world, these applications face many challenges and difficulties. Since Big Data refers to an enormous amount of data, most traditional clustering algorithms come with high computational costs. Hence, the research question is how to handle this volume of data and get accurate results at a critical time. Despite ongoing research work to develop different algorithms to facilitate complex clustering processes, there are still many difficulties that arise while dealing with a large volume of data. In this paper, we review the most relevant clustering algorithms in a categorized manner, provide a comparison of clustering methods for large-scale data and explain the overall challenges based on clustering type. The key idea of the paper is to highlight the main advantages and disadvantages of clustering algorithms for dealing with big data in a scalable approach behind the different other features.; https://ieeexplore.ieee.org/document/9440980/; Clustering; unsupervised learning; traditional clustering; parallel clustering; stream clustering; high dimensional data |
spellingShingle | Mahmoud A. Mahdi; Khalid M. Hosny; Ibrahim Elhenawy; Scalable Clustering Algorithms for Big Data: A Review; IEEE Access; Clustering; unsupervised learning; traditional clustering; parallel clustering; stream clustering; high dimensional data |
title | Scalable Clustering Algorithms for Big Data: A Review |
title_sort | scalable clustering algorithms for big data a review |
topic | Clustering, unsupervised learning, traditional clustering, parallel clustering, stream clustering, high dimensional data |
url | https://ieeexplore.ieee.org/document/9440980/ |