LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS
The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash tables, binary search trees, and suffix trees have bee...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Instituto Superior de Engenharia de Lisboa (ISEL)
2013-06-01
|
Series: | ISEL Academic Journal of Electronics, Telecommunications and Computers |
Subjects: | |
Online Access: | http://journals.isel.pt/index.php/i-ETC/article/view/6 |
_version_ | 1819238738544295936 |
---|---|
author | Artur Ferreira Arlindo Oliveira Mario Figueiredo |
author_facet | Artur Ferreira Arlindo Oliveira Mario Figueiredo |
author_sort | Artur Ferreira |
collection | DOAJ |
description | The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash tables, binary search trees, and suffix trees have been used to speedup these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SA) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve over that work by proposing a new efficient algorithm to update the sliding window each time a token is produced at the output. The proposed algorithm toggles between two SA on consecutive tokens. The resulting SA-based encoder requires less memory than the conventional tree-based encoders. In comparing our SA-based technique against tree-based encoders, on a large set of benchmark files, we find that, in some compression settings, our encoder is also faster than tree-based encoders. |
first_indexed | 2024-12-23T13:41:00Z |
format | Article |
id | doaj.art-9ca6c1cce13e4a318486b5931d64bac2 |
institution | Directory Open Access Journal |
issn | 2182-4010 |
language | English |
last_indexed | 2024-12-23T13:41:00Z |
publishDate | 2013-06-01 |
publisher | Instituto Superior de Engenharia de Lisboa (ISEL) |
record_format | Article |
series | ISEL Academic Journal of Electronics, Telecommunications and Computers |
spelling | doaj.art-9ca6c1cce13e4a318486b5931d64bac22022-12-21T17:44:52ZengInstituto Superior de Engenharia de Lisboa (ISEL)ISEL Academic Journal of Electronics, Telecommunications and Computers2182-40102013-06-01216LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYSArtur FerreiraArlindo OliveiraMario FigueiredoThe sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash tables, binary search trees, and suffix trees have been used to speedup these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SA) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve over that work by proposing a new efficient algorithm to update the sliding window each time a token is produced at the output. The proposed algorithm toggles between two SA on consecutive tokens. The resulting SA-based encoder requires less memory than the conventional tree-based encoders. In comparing our SA-based technique against tree-based encoders, on a large set of benchmark files, we find that, in some compression settings, our encoder is also faster than tree-based encoders.http://journals.isel.pt/index.php/i-ETC/article/view/6Lempel-Ziv compressionsuffix arrayssliding window updatesubstring search |
spellingShingle | Artur Ferreira Arlindo Oliveira Mario Figueiredo LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS ISEL Academic Journal of Electronics, Telecommunications and Computers Lempel-Ziv compression suffix arrays sliding window update substring search |
title | LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS |
title_full | LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS |
title_fullStr | LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS |
title_full_unstemmed | LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS |
title_short | LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS |
title_sort | lempel ziv sliding window update with suffix arrays |
topic | Lempel-Ziv compression suffix arrays sliding window update substring search |
url | http://journals.isel.pt/index.php/i-ETC/article/view/6 |
work_keys_str_mv | AT arturferreira lempelzivslidingwindowupdatewithsuffixarrays AT arlindooliveira lempelzivslidingwindowupdatewithsuffixarrays AT mariofigueiredo lempelzivslidingwindowupdatewithsuffixarrays |