Block Heavy Hitters

e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction o...

Full description

Bibliographic Details
Main Authors: Andoni, Alexandr, Ba, Khanh Do, Indyk, Piotr
Other Authors: Piotr Indyk
Published: 2008
Online Access:http://hdl.handle.net/1721.1/41514
_version_ 1826195068883566592
author Andoni, Alexandr
Ba, Khanh Do
Indyk, Piotr
author2 Piotr Indyk
author_facet Piotr Indyk
Andoni, Alexandr
Ba, Khanh Do
Indyk, Piotr
author_sort Andoni, Alexandr
collection MIT
description e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction of the ell_1 norm ofthe entire matrix $A$. In comparison, in the standard heavy hittersproblem, we are required to report the matrix *entries* that areheavy. As is common in streaming, we solve the problem approximately:we return all rows with weight at least phi, but also possibly someother rows that have weight no less than (1-eps)phi. To solve theblock heavy hitters problem, we show how to construct a linear sketchof A from which we can recover the heavy rows of A.The block heavy hitters problem has already found applications forother streaming problems. In particular, it is a crucial buildingblock in a streaming algorithm that constructs asmall-size sketch for the Ulam metric, a metric on non-repetitivestrings under the edit (Levenshtein) distance.
first_indexed 2024-09-23T10:06:23Z
id mit-1721.1/41514
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T10:06:23Z
publishDate 2008
record_format dspace
spelling mit-1721.1/415142019-04-11T06:53:42Z Block Heavy Hitters Andoni, Alexandr Ba, Khanh Do Indyk, Piotr Piotr Indyk Theory of Computation e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$, and report all *rows* that are heavy, where a row is heavy ifits ell_1-norm is at least phi fraction of the ell_1 norm ofthe entire matrix $A$. In comparison, in the standard heavy hittersproblem, we are required to report the matrix *entries* that areheavy. As is common in streaming, we solve the problem approximately:we return all rows with weight at least phi, but also possibly someother rows that have weight no less than (1-eps)phi. To solve theblock heavy hitters problem, we show how to construct a linear sketchof A from which we can recover the heavy rows of A.The block heavy hitters problem has already found applications forother streaming problems. In particular, it is a crucial buildingblock in a streaming algorithm that constructs asmall-size sketch for the Ulam metric, a metric on non-repetitivestrings under the edit (Levenshtein) distance. 2008-05-05T15:45:27Z 2008-05-05T15:45:27Z 2008-05-02 MIT-CSAIL-TR-2008-024 http://hdl.handle.net/1721.1/41514 Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory 3 p. application/pdf application/postscript
spellingShingle Andoni, Alexandr
Ba, Khanh Do
Indyk, Piotr
Block Heavy Hitters
title Block Heavy Hitters
title_full Block Heavy Hitters
title_fullStr Block Heavy Hitters
title_full_unstemmed Block Heavy Hitters
title_short Block Heavy Hitters
title_sort block heavy hitters
url http://hdl.handle.net/1721.1/41514
work_keys_str_mv AT andonialexandr blockheavyhitters
AT bakhanhdo blockheavyhitters
AT indykpiotr blockheavyhitters