Handling data-skewness in character based string similarity join using Hadoop
The scalability of similarity joins is threatened by data skewness, an often unexpected characteristic of the data. This is a pervasive problem in scientific data. Skewness leaves attribute values unevenly distributed, which can cause severe load imbalance. When database join operations are...
Main Authors: | Kanak Meena, Devendra K. Tayal, Oscar Castillo, Amita Jain |
---|---|
Format: | Article |
Language: | English |
Published: | Emerald Publishing, 2022-03-01 |
Series: | Applied Computing and Informatics |
Subjects: | |
Online Access: | https://www.emerald.com/insight/content/doi/10.1016/j.aci.2018.11.001/full/pdf |
Similar Items
- Skewness-Based Partitioning in SpatialHadoop
  by: Alberto Belussi, et al.
  Published: (2020-03-01)
- An analysis of two-way equi-join algorithms under MapReduce
  by: Amer F. Al-Badarneh, et al.
  Published: (2022-04-01)
- Comparative Analysis of Skew-Join Strategies for Large-Scale Datasets with MapReduce and Spark
  by: Anh-Cang Phan, et al.
  Published: (2022-06-01)
- Procesamiento de big data en Hadoop usando el repartition join
  by: Néstor Iván Escalante Fol, et al.
  Published: (2015-06-01)
- Embedding GPU Computations in Hadoop
  by: Jie Zhu, et al.
  Published: (2014-11-01)