A Characterization of Guesswork on Swiftly Tilting Curves

© 1963-2012 IEEE. Given a collection of strings, each with an associated probability of occurrence, the guesswork of each of them is their position in a list ordered from most likely to least likely, breaking ties arbitrarily. The guesswork is central to several applications in information theory: a...

Full description

Bibliographic Details
Main Authors: Beirami, Ahmad, Calderbank, Robert, Christiansen, Mark M, Duffy, Ken R, Medard, Muriel
Other Authors: Massachusetts Institute of Technology. Research Laboratory of Electronics
Format: Article
Language:English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online Access:https://hdl.handle.net/1721.1/135856
_version_ 1826195022840594432
author Beirami, Ahmad
Calderbank, Robert
Christiansen, Mark M
Duffy, Ken R
Medard, Muriel
author2 Massachusetts Institute of Technology. Research Laboratory of Electronics
author_facet Massachusetts Institute of Technology. Research Laboratory of Electronics
Beirami, Ahmad
Calderbank, Robert
Christiansen, Mark M
Duffy, Ken R
Medard, Muriel
author_sort Beirami, Ahmad
collection MIT
description © 1963-2012 IEEE. Given a collection of strings, each with an associated probability of occurrence, the guesswork of each of them is their position in a list ordered from most likely to least likely, breaking ties arbitrarily. The guesswork is central to several applications in information theory: average guesswork provides a lower bound on the expected computational cost of a sequential decoder to decode successfully the transmitted message; the complementary cumulative distribution function of guesswork gives the error probability in list decoding; the logarithm of guesswork is the number of bits needed in optimal lossless one-to-one source coding; and the guesswork is the number of trials required of an adversary to breach a password protected system in a brute-force attack. In this paper, we consider memoryless string sources that generate strings consisting of independent and identically distributed characters drawn from a finite alphabet, and characterize their corresponding guesswork. Our main tool is the tilt operation on a memoryless string source. We show that the tilt operation on a memoryless string source parametrizes an exponential family of memoryless string sources, which we refer to as the tilted family of the string source. We provide an operational meaning to the tilted families by proving that two memoryless string sources result in the same guesswork on all strings of all lengths if and only if their respective categorical distributions belong to the same tilted family. Establishing some general properties of the tilt operation, we generalize the notions of weakly typical set and asymptotic equipartition property to tilted weakly typical sets of different orders. We use this new definition to characterize the large deviations for all atypical strings and characterize the volume of tilted weakly typical sets of different orders. We subsequently build on this characterization to prove large deviation bounds on guesswork and provide an accurate approximation of its probability mass function.
first_indexed 2024-09-23T10:05:32Z
format Article
id mit-1721.1/135856
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T10:05:32Z
publishDate 2021
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/1358562023-12-13T21:16:49Z A Characterization of Guesswork on Swiftly Tilting Curves Beirami, Ahmad Calderbank, Robert Christiansen, Mark M Duffy, Ken R Medard, Muriel Massachusetts Institute of Technology. Research Laboratory of Electronics © 1963-2012 IEEE. Given a collection of strings, each with an associated probability of occurrence, the guesswork of each of them is their position in a list ordered from most likely to least likely, breaking ties arbitrarily. The guesswork is central to several applications in information theory: average guesswork provides a lower bound on the expected computational cost of a sequential decoder to decode successfully the transmitted message; the complementary cumulative distribution function of guesswork gives the error probability in list decoding; the logarithm of guesswork is the number of bits needed in optimal lossless one-to-one source coding; and the guesswork is the number of trials required of an adversary to breach a password protected system in a brute-force attack. In this paper, we consider memoryless string sources that generate strings consisting of independent and identically distributed characters drawn from a finite alphabet, and characterize their corresponding guesswork. Our main tool is the tilt operation on a memoryless string source. We show that the tilt operation on a memoryless string source parametrizes an exponential family of memoryless string sources, which we refer to as the tilted family of the string source. We provide an operational meaning to the tilted families by proving that two memoryless string sources result in the same guesswork on all strings of all lengths if and only if their respective categorical distributions belong to the same tilted family. Establishing some general properties of the tilt operation, we generalize the notions of weakly typical set and asymptotic equipartition property to tilted weakly typical sets of different orders. We use this new definition to characterize the large deviations for all atypical strings and characterize the volume of tilted weakly typical sets of different orders. We subsequently build on this characterization to prove large deviation bounds on guesswork and provide an accurate approximation of its probability mass function. 2021-10-27T20:29:39Z 2021-10-27T20:29:39Z 2019 2019-06-21T12:25:05Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/135856 en 10.1109/TIT.2018.2879477 IEEE Transactions on Information Theory Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) arXiv
spellingShingle Beirami, Ahmad
Calderbank, Robert
Christiansen, Mark M
Duffy, Ken R
Medard, Muriel
A Characterization of Guesswork on Swiftly Tilting Curves
title A Characterization of Guesswork on Swiftly Tilting Curves
title_full A Characterization of Guesswork on Swiftly Tilting Curves
title_fullStr A Characterization of Guesswork on Swiftly Tilting Curves
title_full_unstemmed A Characterization of Guesswork on Swiftly Tilting Curves
title_short A Characterization of Guesswork on Swiftly Tilting Curves
title_sort characterization of guesswork on swiftly tilting curves
url https://hdl.handle.net/1721.1/135856
work_keys_str_mv AT beiramiahmad acharacterizationofguessworkonswiftlytiltingcurves
AT calderbankrobert acharacterizationofguessworkonswiftlytiltingcurves
AT christiansenmarkm acharacterizationofguessworkonswiftlytiltingcurves
AT duffykenr acharacterizationofguessworkonswiftlytiltingcurves
AT medardmuriel acharacterizationofguessworkonswiftlytiltingcurves
AT beiramiahmad characterizationofguessworkonswiftlytiltingcurves
AT calderbankrobert characterizationofguessworkonswiftlytiltingcurves
AT christiansenmarkm characterizationofguessworkonswiftlytiltingcurves
AT duffykenr characterizationofguessworkonswiftlytiltingcurves
AT medardmuriel characterizationofguessworkonswiftlytiltingcurves