Distributed Similarity Queries in Metric Spaces


Bibliographic Details
Main Authors: Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao
Format: Article
Language: English
Published: SpringerOpen 2019-06-01
Series: Data Science and Engineering
Online Access: http://link.springer.com/article/10.1007/s41019-019-0095-7
Description
Summary: Similarity queries in metric spaces, including range queries and k nearest neighbor (kNN) queries, have applications in many areas, such as multimedia retrieval, computational biology, and location-based services. As data volumes grow, a distributed method is required. In this paper, we propose the Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS partitions the data uniformly using the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to process large numbers of queries asynchronously. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop efficient similarity search algorithms using AMDS. Extensive experiments on real and synthetic data demonstrate the performance of metric similarity queries using AMDS. Moreover, AMDS scales sublinearly with growing data size.
ISSN: 2364-1185, 2364-1541