Pseudo-deterministic streaming

A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, ℓ2 approximation, finding a non...


Bibliographic Details
Main Authors: Goldwasser, Shafrira; Grossman, Ofer
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/1721.1/129560
author Goldwasser, Shafrira
Grossman, Ofer.
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
collection MIT
description A pseudo-deterministic algorithm is a (randomized) algorithm which, when run multiple times on the same input, with high probability outputs the same result on all executions. Classic streaming algorithms, such as those for finding heavy hitters, approximate counting, ℓ2 approximation, and finding a nonzero entry in a vector (for turnstile algorithms), are not pseudo-deterministic. For example, in the instance of finding a nonzero entry in a vector, for any known low-space algorithm A, there exists a stream x so that running A twice on x (using different randomness) would with high probability result in two different entries as the output. In this work, we study whether it is inherent that these algorithms output different values on different executions. That is, we ask whether these problems have low-memory pseudo-deterministic algorithms. For instance, we show that there is no low-memory pseudo-deterministic algorithm for finding a nonzero entry in a vector (given in a turnstile fashion), and also that there is no low-dimensional pseudo-deterministic sketching algorithm for ℓ2 norm estimation. We also exhibit problems which do have low-memory pseudo-deterministic algorithms but no low-memory deterministic algorithm, such as outputting a nonzero row of a matrix, or outputting a basis for the row-span of a matrix. We also investigate multi-pseudo-deterministic algorithms: algorithms which with high probability output one of a few options. We show the first lower bounds for such algorithms. This implies that there are streaming problems such that every low-space algorithm for the problem must have inputs where there are many valid outputs, all with a significant probability of being output.
first_indexed 2024-09-23T15:15:00Z
format Article
id mit-1721.1/129560
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T15:15:00Z
publishDate 2021
record_format dspace
spelling mit-1721.1/129560 2022-09-29T13:38:46Z
Funding: National Science Foundation (U.S.) (Grant CNS-1413920); United States. Defense Advanced Research Projects Agency (Grant DARPA/NJIT 491512803); Alfred P. Sloan Foundation (Grant 996698); MIT/IBM (Grant W1771646)
Dates: 2021-01-26T13:14:34Z; 2021-01-26T13:14:34Z; 2019-11; 2020-12-15T18:03:42Z
Type: Article; http://purl.org/eprint/type/ConferencePaper
ISSN: 1868-8969
Handle: https://hdl.handle.net/1721.1/129560
Citation: Goldwasser, Shafi et al. “Pseudo-deterministic streaming.” Leibniz International Proceedings in Informatics, LIPIcs, 151 (November 2019): 79:1–79:25 © 2019 The Author(s)
Language: en
DOI: 10.4230/LIPIcs.ITCS.2020.79
Series: Leibniz International Proceedings in Informatics, LIPIcs
License: Creative Commons Attribution 3.0 unported license, https://creativecommons.org/licenses/by/3.0/
File format: application/pdf
Source: DROPS
title Pseudo-deterministic streaming
url https://hdl.handle.net/1721.1/129560