Simultaneous nearest neighbor search

Motivated by applications in computer vision and databases, we introduce and study the Simultaneous Nearest Neighbor Search (SNN) problem. Given a set of data points, the goal of SNN is to design a data structure that, given a collection of queries, finds a collection of close points that are compat...

Full description

Bibliographic Details
Main Authors: Kleinberg, Robert, Yuan, Yang, Indyk, Piotr, Mahabadi, Sepideh
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:en_US
Published: Dagstuhl Publishing 2017
Online Access:http://hdl.handle.net/1721.1/111963
https://orcid.org/0000-0002-7983-9524
https://orcid.org/0000-0001-5004-8991
_version_ 1826203435142217728
author Kleinberg, Robert
Yuan, Yang
Indyk, Piotr
Mahabadi, Sepideh
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Kleinberg, Robert
Yuan, Yang
Indyk, Piotr
Mahabadi, Sepideh
author_sort Kleinberg, Robert
collection MIT
description Motivated by applications in computer vision and databases, we introduce and study the Simultaneous Nearest Neighbor Search (SNN) problem. Given a set of data points, the goal of SNN is to design a data structure that, given a collection of queries, finds a collection of close points that are compatible with each other. Formally, we are given k query points Q=q_1,...,q_k, and a compatibility graph G with vertices in Q, and the goal is to return data points p_1,...,p_k that minimize (i) the weighted sum of the distances from q_i to p_i and (ii) the weighted sum, over all edges (i,j) in the compatibility graph G, of the distances between p_i and p_j. The problem has several applications in computer vision and databases, where one wants to return a set of *consistent* answers to multiple related queries. Furthermore, it generalizes several well-studied computational problems, including Nearest Neighbor Search, Aggregate Nearest Neighbor Search and the 0-extension problem. In this paper we propose and analyze the following general two-step method for designing efficient data structures for SNN. In the first step, for each query point q_i we find its (approximate) nearest neighbor point p'_i; this can be done efficiently using existing approximate nearest neighbor structures. In the second step, we solve an off-line optimization problem over sets q_1,...,q_k and p'_1,...,p'_k; this can be done efficiently given that k is much smaller than n. Even though p'_1,...,p'_k might not constitute the optimal answers to queries q_1,...,q_k, we show that, for the unweighted case, the resulting algorithm satisfies a O(log k/log log k)-approximation guarantee. Furthermore, we show that the approximation factor can be in fact reduced to a constant for compatibility graphs frequently occurring in practice, e.g., 2D grids, 3D grids or planar graphs. Finally, we validate our theoretical results by preliminary experiments. In particular, we show that the empirical approximation factor provided by the above approach is very close to 1.
first_indexed 2024-09-23T12:36:37Z
format Article
id mit-1721.1/111963
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T12:36:37Z
publishDate 2017
publisher Dagstuhl Publishing
record_format dspace
spelling mit-1721.1/1119632024-06-27T14:31:23Z Simultaneous nearest neighbor search Kleinberg, Robert Yuan, Yang Indyk, Piotr Mahabadi, Sepideh Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Indyk, Piotr Mahabadi, Sepideh Motivated by applications in computer vision and databases, we introduce and study the Simultaneous Nearest Neighbor Search (SNN) problem. Given a set of data points, the goal of SNN is to design a data structure that, given a collection of queries, finds a collection of close points that are compatible with each other. Formally, we are given k query points Q=q_1,...,q_k, and a compatibility graph G with vertices in Q, and the goal is to return data points p_1,...,p_k that minimize (i) the weighted sum of the distances from q_i to p_i and (ii) the weighted sum, over all edges (i,j) in the compatibility graph G, of the distances between p_i and p_j. The problem has several applications in computer vision and databases, where one wants to return a set of *consistent* answers to multiple related queries. Furthermore, it generalizes several well-studied computational problems, including Nearest Neighbor Search, Aggregate Nearest Neighbor Search and the 0-extension problem. In this paper we propose and analyze the following general two-step method for designing efficient data structures for SNN. In the first step, for each query point q_i we find its (approximate) nearest neighbor point p'_i; this can be done efficiently using existing approximate nearest neighbor structures. In the second step, we solve an off-line optimization problem over sets q_1,...,q_k and p'_1,...,p'_k; this can be done efficiently given that k is much smaller than n. Even though p'_1,...,p'_k might not constitute the optimal answers to queries q_1,...,q_k, we show that, for the unweighted case, the resulting algorithm satisfies a O(log k/log log k)-approximation guarantee. Furthermore, we show that the approximation factor can be in fact reduced to a constant for compatibility graphs frequently occurring in practice, e.g., 2D grids, 3D grids or planar graphs. Finally, we validate our theoretical results by preliminary experiments. In particular, we show that the empirical approximation factor provided by the above approach is very close to 1. 2017-10-23T19:34:41Z 2017-10-23T19:34:41Z 2016 Article http://purl.org/eprint/type/ConferencePaper 978-3-95977-009-5 1868-8969 http://hdl.handle.net/1721.1/111963 Piotr Indyk et al. "Simultaneous nearest neighbor search." 32nd International Symposium on Computational Geometry (SoCG 2016) June 14-17, 2016 Boston, Massachusetts, USA, edited by Sandor Fekete and Anna Lubiw, Dagstuhl Publishing, June 2016 © Piotr Indyk, Robert Kleinberg, Sepideh Mahabadi, and Yang Yuan https://orcid.org/0000-0002-7983-9524 https://orcid.org/0000-0001-5004-8991 en_US http://dx.doi.org/10.4230/LIPIcs.SoCG.2016.44 32nd International Symposium on Computational Geometry (SoCG 2016) Creative Commons Attribution 4.0 International License http://creativecommons.org/licenses/by/4.0/ application/pdf Dagstuhl Publishing Dagstuhl Publishing
spellingShingle Kleinberg, Robert
Yuan, Yang
Indyk, Piotr
Mahabadi, Sepideh
Simultaneous nearest neighbor search
title Simultaneous nearest neighbor search
title_full Simultaneous nearest neighbor search
title_fullStr Simultaneous nearest neighbor search
title_full_unstemmed Simultaneous nearest neighbor search
title_short Simultaneous nearest neighbor search
title_sort simultaneous nearest neighbor search
url http://hdl.handle.net/1721.1/111963
https://orcid.org/0000-0002-7983-9524
https://orcid.org/0000-0001-5004-8991
work_keys_str_mv AT kleinbergrobert simultaneousnearestneighborsearch
AT yuanyang simultaneousnearestneighborsearch
AT indykpiotr simultaneousnearestneighborsearch
AT mahabadisepideh simultaneousnearestneighborsearch