Near-optimal (Euclidean) metric compression

Bibliographic Details
Main Authors: Indyk, Piotr; Wagner, Tal
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Association for Computing Machinery 2018
Online Access: http://hdl.handle.net/1721.1/115312
https://orcid.org/0000-0002-7983-9524
https://orcid.org/0000-0002-9455-6864
Description
Summary: The metric sketching problem is defined as follows. Given a metric on n points, and ϵ > 0, we wish to produce a small size data structure (sketch) that, given any pair of point indices, recovers the distance between the points up to a 1 + ϵ distortion. In this paper we consider metrics induced by ℓ₂ and ℓ₁ norms whose spread (the ratio of the diameter to the closest pair distance) is bounded by Φ > 0. A well-known dimensionality reduction theorem due to Johnson and Lindenstrauss yields a sketch of size O(ϵ⁻² log(Φn) n log n), i.e., O(ϵ⁻² log(Φn) log n) bits per point. We show that this bound is not optimal, and can be substantially improved to O(ϵ⁻² log(1/ϵ) · log n + log log Φ) bits per point. Furthermore, we show that our bound is tight up to a factor of log(1/ϵ). We also consider sketching of general metrics and provide a sketch of size O(n log(1/ϵ) + log log Φ) bits per point, which we show is optimal.
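
To get a feel for the gap between the two Euclidean bounds quoted in the summary, the following is a rough arithmetic sketch, not part of the paper. It assumes the constants hidden by O(·) are 1 and that all logarithms are base 2; the function names and the sample values of n, ϵ and Φ are hypothetical and chosen only for illustration.

```python
import math


def jl_bits_per_point(n, eps, spread):
    # Johnson-Lindenstrauss baseline from the summary:
    # eps^-2 * log(spread * n) * log n, with the hidden constant assumed to be 1.
    return eps ** -2 * math.log2(spread * n) * math.log2(n)


def improved_bits_per_point(n, eps, spread):
    # Improved bound from the summary:
    # eps^-2 * log(1/eps) * log n + log log spread, hidden constant assumed to be 1.
    # Requires spread > 1 so that log log spread is defined.
    return eps ** -2 * math.log2(1 / eps) * math.log2(n) + math.log2(math.log2(spread))


# Hypothetical parameters: about a million points, distortion 1.25, spread 2^16.
n, eps, spread = 2 ** 20, 0.25, 2 ** 16

baseline = jl_bits_per_point(n, eps, spread)
improved = improved_bits_per_point(n, eps, spread)

print(f"JL baseline: {baseline:.0f} bits per point")   # 11520
print(f"Improved:    {improved:.0f} bits per point")   # 644
```

Under these assumptions the saving comes from replacing the multiplicative log(Φn) factor with log(1/ϵ) and paying only an additive log log Φ term. The absolute numbers carry no meaning beyond that comparison, since the true constants are not given in the summary.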