Approximation algorithms for grammar-based data compression
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Language: | eng |
Published: |
Massachusetts Institute of Technology
2014
|
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/87172 |
_version_ | 1826217654117990400 |
---|---|
author | Lehman, Eric (Eric Allen), 1970- |
author2 | Madhu Sudan. |
author_facet | Madhu Sudan. Lehman, Eric (Eric Allen), 1970- |
author_sort | Lehman, Eric (Eric Allen), 1970- |
collection | MIT |
description | Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. |
first_indexed | 2024-09-23T17:07:08Z |
format | Thesis |
id | mit-1721.1/87172 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T17:07:08Z |
publishDate | 2014 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/871722019-04-12T07:17:10Z Approximation algorithms for grammar-based data compression Lehman, Eric (Eric Allen), 1970- Madhu Sudan. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 109-113). This thesis considers the smallest grammar problem: find the smallest context-free grammar that generates exactly one given string. We show that this problem is intractable, and so our objective is to find approximation algorithms. This simple question is connected to many areas of research. Most importantly, there is a link to data compression; instead of storing a long string, one can store a small grammar that generates it. A small grammar for a string also naturally brings out underlying patterns, a fact that is useful, for example, in DNA analysis. Moreover, the size of the smallest context-free grammar generating a string can be regarded as a computable relaxation of Kolmogorov complexity. Finally, work on the smallest grammar problem qualitatively extends the study of approximation algorithms to hierarchically-structured objects. In this thesis, we establish hardness results, evaluate several previously proposed algorithms, and then present new procedures with much stronger approximation guarantees. by Eric Lehman. Ph.D. 2014-05-23T19:03:45Z 2014-05-23T19:03:45Z 2002 2002 Thesis http://hdl.handle.net/1721.1/87172 50504311 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 113 p. application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Lehman, Eric (Eric Allen), 1970- Approximation algorithms for grammar-based data compression |
title | Approximation algorithms for grammar-based data compression |
title_full | Approximation algorithms for grammar-based data compression |
title_fullStr | Approximation algorithms for grammar-based data compression |
title_full_unstemmed | Approximation algorithms for grammar-based data compression |
title_short | Approximation algorithms for grammar-based data compression |
title_sort | approximation algorithms for grammar based data compression |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/87172 |
work_keys_str_mv | AT lehmanericericallen1970 approximationalgorithmsforgrammarbaseddatacompression |