Practical data-dependent metric compression with provable guarantees
We introduce a new distance-preserving compact representation of multidimensional point-sets. Given n points in a d-dimensional space where each coordinate is represented using B bits (i.e., dB bits per point), it produces a representation of size O(dlog(dB/e) + logn) bits per point from which one c...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
Neural Information Processing Systems Foundation, Inc.
2019
|
Online Access: | https://hdl.handle.net/1721.1/121556 |
_version_ | 1826209509874335744 |
---|---|
author | Indyk, Piotr Razenshteyn, Ilya Wagner, Tal |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Indyk, Piotr Razenshteyn, Ilya Wagner, Tal |
author_sort | Indyk, Piotr |
collection | MIT |
description | We introduce a new distance-preserving compact representation of multidimensional point-sets. Given n points in a d-dimensional space where each coordinate is represented using B bits (i.e., dB bits per point), it produces a representation of size O(dlog(dB/e) + logn) bits per point from which one can approximate the distances up to a factor of 1 ± e. Our algorithm almost matches the recent bound of [6] while being much simpler. We compare our algorithm to Product Quantization (PQ) [7], a state of the art heuristic metric compression method. We evaluate both algorithms on several data sets: SIFT (used in [7]), MNIST [11], New York City taxi time series [4] and a synthetic one-dimensional data set embedded in a high-dimensional space. With appropriately tuned parameters, our algorithm produces representations that are comparable to or better than those produced by PQ, while having provable guarantees on its performance. |
first_indexed | 2024-09-23T14:23:36Z |
format | Article |
id | mit-1721.1/121556 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T14:23:36Z |
publishDate | 2019 |
publisher | Neural Information Processing Systems Foundation, Inc. |
record_format | dspace |
spelling | mit-1721.1/1215562022-09-29T09:13:50Z Practical data-dependent metric compression with provable guarantees Indyk, Piotr Razenshteyn, Ilya Wagner, Tal Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science We introduce a new distance-preserving compact representation of multidimensional point-sets. Given n points in a d-dimensional space where each coordinate is represented using B bits (i.e., dB bits per point), it produces a representation of size O(dlog(dB/e) + logn) bits per point from which one can approximate the distances up to a factor of 1 ± e. Our algorithm almost matches the recent bound of [6] while being much simpler. We compare our algorithm to Product Quantization (PQ) [7], a state of the art heuristic metric compression method. We evaluate both algorithms on several data sets: SIFT (used in [7]), MNIST [11], New York City taxi time series [4] and a synthetic one-dimensional data set embedded in a high-dimensional space. With appropriately tuned parameters, our algorithm produces representations that are comparable to or better than those produced by PQ, while having provable guarantees on its performance. 2019-07-09T20:08:57Z 2019-07-09T20:08:57Z 2017-12 2019-05-31T14:38:15Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/121556 Indyk, Piotr, Ilya Razenshteyn, and Tal Wagner. "Practical Data-Dependent Metric Compression with Provable Guarantees." 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, California, USA, 4-9 December, 2017, NIPS, 2017. © Neural Information Processing Systems Foundation, Inc. en https://papers.nips.cc/paper/6855-practical-data-dependent-metric-compression-with-provable-guarantees 31st Conference on Neural Information Processing Systems (NIPS 2017) Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Neural Information Processing Systems Foundation, Inc. Neural Information Processing Systems (NIPS) |
spellingShingle | Indyk, Piotr Razenshteyn, Ilya Wagner, Tal Practical data-dependent metric compression with provable guarantees |
title | Practical data-dependent metric compression with provable guarantees |
title_full | Practical data-dependent metric compression with provable guarantees |
title_fullStr | Practical data-dependent metric compression with provable guarantees |
title_full_unstemmed | Practical data-dependent metric compression with provable guarantees |
title_short | Practical data-dependent metric compression with provable guarantees |
title_sort | practical data dependent metric compression with provable guarantees |
url | https://hdl.handle.net/1721.1/121556 |
work_keys_str_mv | AT indykpiotr practicaldatadependentmetriccompressionwithprovableguarantees AT razenshteynilya practicaldatadependentmetriccompressionwithprovableguarantees AT wagnertal practicaldatadependentmetriccompressionwithprovableguarantees |