An Online Algorithm for Lightweight Grammar-Based Compression

Grammar-based compression is a well-studied technique to construct a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees O(log2 n)- approximation ratio for the minimum grammar size, where n is...

Full description

Bibliographic Details
Main Authors: Masayuki Takeda, Hiroshi Sakamoto, Shirou Maruyama
Format: Article
Language:English
Published: MDPI AG 2012-04-01
Series:Algorithms
Subjects:
Online Access:http://www.mdpi.com/1999-4893/5/2/214/
_version_ 1818539129042894848
author Masayuki Takeda
Hiroshi Sakamoto
Shirou Maruyama
author_facet Masayuki Takeda
Hiroshi Sakamoto
Shirou Maruyama
author_sort Masayuki Takeda
collection DOAJ
description Grammar-based compression is a well-studied technique to construct a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees O(log2 n)- approximation ratio for the minimum grammar size, where n is an input size, and it runs in input linear time and output linear space. In addition, we propose a practical encoding, which transforms a restricted CFG into a more compact representation. Experimental results by comparison with standard compressors demonstrate that our algorithm is especially effective for highly repetitive text.
first_indexed 2024-12-11T21:37:58Z
format Article
id doaj.art-d47a6ac9f2304afaac31563dc881502e
institution Directory Open Access Journal
issn 1999-4893
language English
last_indexed 2024-12-11T21:37:58Z
publishDate 2012-04-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj.art-d47a6ac9f2304afaac31563dc881502e2022-12-22T00:49:57ZengMDPI AGAlgorithms1999-48932012-04-015221423510.3390/a5020214An Online Algorithm for Lightweight Grammar-Based CompressionMasayuki TakedaHiroshi SakamotoShirou MaruyamaGrammar-based compression is a well-studied technique to construct a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees O(log2 n)- approximation ratio for the minimum grammar size, where n is an input size, and it runs in input linear time and output linear space. In addition, we propose a practical encoding, which transforms a restricted CFG into a more compact representation. Experimental results by comparison with standard compressors demonstrate that our algorithm is especially effective for highly repetitive text.http://www.mdpi.com/1999-4893/5/2/214/lossless compressiongrammar-based compressiononline algorithmapproximation algorithm
spellingShingle Masayuki Takeda
Hiroshi Sakamoto
Shirou Maruyama
An Online Algorithm for Lightweight Grammar-Based Compression
Algorithms
lossless compression
grammar-based compression
online algorithm
approximation algorithm
title An Online Algorithm for Lightweight Grammar-Based Compression
title_full An Online Algorithm for Lightweight Grammar-Based Compression
title_fullStr An Online Algorithm for Lightweight Grammar-Based Compression
title_full_unstemmed An Online Algorithm for Lightweight Grammar-Based Compression
title_short An Online Algorithm for Lightweight Grammar-Based Compression
title_sort online algorithm for lightweight grammar based compression
topic lossless compression
grammar-based compression
online algorithm
approximation algorithm
url http://www.mdpi.com/1999-4893/5/2/214/
work_keys_str_mv AT masayukitakeda anonlinealgorithmforlightweightgrammarbasedcompression
AT hiroshisakamoto anonlinealgorithmforlightweightgrammarbasedcompression
AT shiroumaruyama anonlinealgorithmforlightweightgrammarbasedcompression
AT masayukitakeda onlinealgorithmforlightweightgrammarbasedcompression
AT hiroshisakamoto onlinealgorithmforlightweightgrammarbasedcompression
AT shiroumaruyama onlinealgorithmforlightweightgrammarbasedcompression