Handling data-skewness in character based string similarity join using Hadoop
Data skewness, an often unexpected data characteristic, threatens the scalability of similarity joins and is a pervasive problem in scientific data. Skewness leads to an uneven distribution of attributes, which can cause severe load imbalance. When database join operations are...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Emerald Publishing, 2022-03-01 |
Series: | Applied Computing and Informatics |
Subjects: | |
Online Access: | https://www.emerald.com/insight/content/doi/10.1016/j.aci.2018.11.001/full/pdf |