Approximation algorithms for grammar-based data compression

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.

Bibliographic Details
Main Author: Lehman, Eric (Eric Allen), 1970-
Other Authors: Madhu Sudan.
Format: Thesis
Language:eng
Published: Massachusetts Institute of Technology 2014
Subjects:
Online Access:http://hdl.handle.net/1721.1/87172
_version_ 1826217654117990400
author Lehman, Eric (Eric Allen), 1970-
author2 Madhu Sudan.
author_facet Madhu Sudan.
Lehman, Eric (Eric Allen), 1970-
author_sort Lehman, Eric (Eric Allen), 1970-
collection MIT
description Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
first_indexed 2024-09-23T17:07:08Z
format Thesis
id mit-1721.1/87172
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T17:07:08Z
publishDate 2014
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/871722019-04-12T07:17:10Z Approximation algorithms for grammar-based data compression Lehman, Eric (Eric Allen), 1970- Madhu Sudan. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 109-113). This thesis considers the smallest grammar problem: find the smallest context-free grammar that generates exactly one given string. We show that this problem is intractable, and so our objective is to find approximation algorithms. This simple question is connected to many areas of research. Most importantly, there is a link to data compression; instead of storing a long string, one can store a small grammar that generates it. A small grammar for a string also naturally brings out underlying patterns, a fact that is useful, for example, in DNA analysis. Moreover, the size of the smallest context-free grammar generating a string can be regarded as a computable relaxation of Kolmogorov complexity. Finally, work on the smallest grammar problem qualitatively extends the study of approximation algorithms to hierarchically-structured objects. In this thesis, we establish hardness results, evaluate several previously proposed algorithms, and then present new procedures with much stronger approximation guarantees. by Eric Lehman. Ph.D. 2014-05-23T19:03:45Z 2014-05-23T19:03:45Z 2002 2002 Thesis http://hdl.handle.net/1721.1/87172 50504311 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 113 p. application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Lehman, Eric (Eric Allen), 1970-
Approximation algorithms for grammar-based data compression
title Approximation algorithms for grammar-based data compression
title_full Approximation algorithms for grammar-based data compression
title_fullStr Approximation algorithms for grammar-based data compression
title_full_unstemmed Approximation algorithms for grammar-based data compression
title_short Approximation algorithms for grammar-based data compression
title_sort approximation algorithms for grammar based data compression
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/87172
work_keys_str_mv AT lehmanericericallen1970 approximationalgorithmsforgrammarbaseddatacompression