Block Heavy Hitters
e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction o...
Main Authors: | , , |
---|---|
Other Authors: | |
Published: |
2008
|
Online Access: | http://hdl.handle.net/1721.1/41514 |
_version_ | 1826195068883566592 |
---|---|
author | Andoni, Alexandr Ba, Khanh Do Indyk, Piotr |
author2 | Piotr Indyk |
author_facet | Piotr Indyk Andoni, Alexandr Ba, Khanh Do Indyk, Piotr |
author_sort | Andoni, Alexandr |
collection | MIT |
description | e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction of the ell_1 norm ofthe entire matrix $A$. In comparison, in the standard heavy hittersproblem, we are required to report the matrix *entries* that areheavy. As is common in streaming, we solve the problem approximately:we return all rows with weight at least phi, but also possibly someother rows that have weight no less than (1-eps)phi. To solve theblock heavy hitters problem, we show how to construct a linear sketchof A from which we can recover the heavy rows of A.The block heavy hitters problem has already found applications forother streaming problems. In particular, it is a crucial buildingblock in a streaming algorithm that constructs asmall-size sketch for the Ulam metric, a metric on non-repetitivestrings under the edit (Levenshtein) distance. |
first_indexed | 2024-09-23T10:06:23Z |
id | mit-1721.1/41514 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T10:06:23Z |
publishDate | 2008 |
record_format | dspace |
spelling | mit-1721.1/415142019-04-11T06:53:42Z Block Heavy Hitters Andoni, Alexandr Ba, Khanh Do Indyk, Piotr Piotr Indyk Theory of Computation e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction of the ell_1 norm ofthe entire matrix $A$. In comparison, in the standard heavy hittersproblem, we are required to report the matrix *entries* that areheavy. As is common in streaming, we solve the problem approximately:we return all rows with weight at least phi, but also possibly someother rows that have weight no less than (1-eps)phi. To solve theblock heavy hitters problem, we show how to construct a linear sketchof A from which we can recover the heavy rows of A.The block heavy hitters problem has already found applications forother streaming problems. In particular, it is a crucial buildingblock in a streaming algorithm that constructs asmall-size sketch for the Ulam metric, a metric on non-repetitivestrings under the edit (Levenshtein) distance. 2008-05-05T15:45:27Z 2008-05-05T15:45:27Z 2008-05-02 MIT-CSAIL-TR-2008-024 http://hdl.handle.net/1721.1/41514 Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory 3 p. application/pdf application/postscript |
spellingShingle | Andoni, Alexandr Ba, Khanh Do Indyk, Piotr Block Heavy Hitters |
title | Block Heavy Hitters |
title_full | Block Heavy Hitters |
title_fullStr | Block Heavy Hitters |
title_full_unstemmed | Block Heavy Hitters |
title_short | Block Heavy Hitters |
title_sort | block heavy hitters |
url | http://hdl.handle.net/1721.1/41514 |
work_keys_str_mv | AT andonialexandr blockheavyhitters AT bakhanhdo blockheavyhitters AT indykpiotr blockheavyhitters |