Efficient spatial data partitioning for distributed kNN joins

Abstract: Parallel processing of large spatial datasets over distributed systems has become a core part of modern data analytic systems like Apache Hadoop and Apache Spark. The general-purpose design of these systems does not natively account for the data’s spatial attributes and results in poor scal...

Bibliographic Details
Main Authors: Ayman Zeidan, Huy T. Vo
Format: Article
Language: English
Published: SpringerOpen 2022-06-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-022-00587-2