Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching

We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically for a norm ||·||, sparsity parameter k, approximation factor K > 0, and probability of failure P > 0, we ask: what is the minimal value of m so that th...

Full description

Bibliographic Details
Main Authors: Woodruff, David P., Backurs, Arturs, Indyk, Piotr, Razenshteyn, Ilya
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language:en_US
Published: Association for Computing Machinery 2018
Online Access:http://hdl.handle.net/1721.1/113845
https://orcid.org/0000-0001-7546-6313
https://orcid.org/0000-0002-7983-9524
https://orcid.org/0000-0002-3962-721X
_version_ 1826191925307244544
author Woodruff, David P.
Backurs, Arturs
Indyk, Piotr
Razenshteyn, Ilya
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Woodruff, David P.
Backurs, Arturs
Indyk, Piotr
Razenshteyn, Ilya
author_sort Woodruff, David P.
collection MIT
description We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically for a norm ||·||, sparsity parameter k, approximation factor K > 0, and probability of failure P > 0, we ask: what is the minimal value of m so that there is a distribution over m × n matrices A with the property that for any x, given Ax, we can recover a k-sparse approximation to x in the given norm with probability at least 1 -- P? We give a partial answer to this problem, by showing that for norms that admit efficient linear sketches, the optimal number of measurements m is closely related to the doubling dimension of the metric induced by the norm ||·|| on the set of all k-sparse vectors. By applying our result to specific norms, we cast known measurement bounds in our general framework (for the [subscript ℓ]p norms, p ∈ [1, 2]) as well as provide new, measurement-efficient schemes (for the Earth-Mover Distance norm). The latter result directly implies more succinct linear sketches for the well-studied planar k-median clustering problem. Finally our lower bound for the doubling dimension of the EMD norm enables us to resolve the open question of [Frahling-Sohler, STOC'05] about the space complexity of clustering problems in the dynamic streaming model.
first_indexed 2024-09-23T09:03:26Z
format Article
id mit-1721.1/113845
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T09:03:26Z
publishDate 2018
publisher Association for Computing Machinery
record_format dspace
spelling mit-1721.1/1138452022-09-26T10:10:49Z Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching Woodruff, David P. Backurs, Arturs Indyk, Piotr Razenshteyn, Ilya Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Backurs, Arturs Indyk, Piotr Razenshteyn, Ilya We initiate the study of trade-offs between sparsity and the number of measurements in sparse recovery schemes for generic norms. Specifically for a norm ||·||, sparsity parameter k, approximation factor K > 0, and probability of failure P > 0, we ask: what is the minimal value of m so that there is a distribution over m × n matrices A with the property that for any x, given Ax, we can recover a k-sparse approximation to x in the given norm with probability at least 1 -- P? We give a partial answer to this problem, by showing that for norms that admit efficient linear sketches, the optimal number of measurements m is closely related to the doubling dimension of the metric induced by the norm ||·|| on the set of all k-sparse vectors. By applying our result to specific norms, we cast known measurement bounds in our general framework (for the [subscript ℓ]p norms, p ∈ [1, 2]) as well as provide new, measurement-efficient schemes (for the Earth-Mover Distance norm). The latter result directly implies more succinct linear sketches for the well-studied planar k-median clustering problem. Finally our lower bound for the doubling dimension of the EMD norm enables us to resolve the open question of [Frahling-Sohler, STOC'05] about the space complexity of clustering problems in the dynamic streaming model. 2018-02-20T21:06:25Z 2018-02-20T21:06:25Z 2016-01 Article http://purl.org/eprint/type/ConferencePaper 978-1-611974-33-1 http://hdl.handle.net/1721.1/113845 Backurs, Arturs et al. "Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching." SODA '16 Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, 20-12 January, 2016, Philadelphia, Pennsylvania, Association of Computing Machinery, 2016, pp. 318-337. https://orcid.org/0000-0001-7546-6313 https://orcid.org/0000-0002-7983-9524 https://orcid.org/0000-0002-3962-721X en_US http://dl.acm.org/citation.cfm?id=2884459 SODA '16 Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery arXiv
spellingShingle Woodruff, David P.
Backurs, Arturs
Indyk, Piotr
Razenshteyn, Ilya
Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title_full Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title_fullStr Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title_full_unstemmed Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title_short Nearly-optimal bounds for sparse recovery in generic norms, with applications to k-median sketching
title_sort nearly optimal bounds for sparse recovery in generic norms with applications to k median sketching
url http://hdl.handle.net/1721.1/113845
https://orcid.org/0000-0001-7546-6313
https://orcid.org/0000-0002-7983-9524
https://orcid.org/0000-0002-3962-721X
work_keys_str_mv AT woodruffdavidp nearlyoptimalboundsforsparserecoveryingenericnormswithapplicationstokmediansketching
AT backursarturs nearlyoptimalboundsforsparserecoveryingenericnormswithapplicationstokmediansketching
AT indykpiotr nearlyoptimalboundsforsparserecoveryingenericnormswithapplicationstokmediansketching
AT razenshteynilya nearlyoptimalboundsforsparserecoveryingenericnormswithapplicationstokmediansketching