Efficient spatial data partitioning for distributed kNN joins
Abstract Parallel processing of large spatial datasets over distributed systems has become a core part of modern data analytic systems like Apache Hadoop and Apache Spark. The general-purpose design of these systems does not natively account for the data’s spatial attributes and results in poor scal...
Main Authors: | Ayman Zeidan, Huy T. Vo |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2022-06-01 |
Series: | Journal of Big Data |
Subjects: | |
Online Access: | https://doi.org/10.1186/s40537-022-00587-2 |
Similar Items
- R*-Grove: Balanced Spatial Partitioning for Large-Scale Datasets
  by: Tin Vu, et al.
  Published: (2020-08-01)
- Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark
  by: Panagiotis Moutafis, et al.
  Published: (2021-11-01)
- CoPart: a context-based partitioning technique for big data
  by: Sara Migliorini, et al.
  Published: (2021-01-01)
- A PID-Based kNN Query Processing Algorithm for Spatial Data
  by: Baiyou Qiao, et al.
  Published: (2022-10-01)
- Trajectory Clustering and k-NN for Robust Privacy Preserving k-NN Query Processing in GeoSpark
  by: Elias Dritsas, et al.
  Published: (2020-07-01)