Handling data-skewness in character based string similarity join using Hadoop

The scalability of similarity joins is threatened by data skewness, an often unanticipated characteristic that is pervasive in scientific data. Skewness leads to an uneven distribution of attributes, which can cause a severe load imbalance problem. When database join operations are...

Bibliographic Details
Main Authors: Kanak Meena, Devendra K. Tayal, Oscar Castillo, Amita Jain
Format: Article
Language: English
Published: Emerald Publishing 2022-03-01
Series: Applied Computing and Informatics
Online Access: https://www.emerald.com/insight/content/doi/10.1016/j.aci.2018.11.001/full/pdf