Compressed Matching in Dictionaries

The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding and a new varia...

Popoln opis

Bibliografske podrobnosti
Main Authors: Shmuel T. Klein, Dana Shapira
Format: Article
Jezik:English
Izdano: MDPI AG 2011-03-01
Serija:Algorithms
Teme:
Online dostop:http://www.mdpi.com/1999-4893/4/1/61/
_version_ 1828811697396121600
author Shmuel T. Klein
Dana Shapira
author_facet Shmuel T. Klein
Dana Shapira
author_sort Shmuel T. Klein
collection DOAJ
description The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are often preferable to earlier ones, in particular for small files which are typical for dictionaries, since these are usually kept in small chunks.
first_indexed 2024-12-12T09:32:25Z
format Article
id doaj.art-f8eaf52d4a4c43a4a2863bd73e60a5ec
institution Directory Open Access Journal
issn 1999-4893
language English
last_indexed 2024-12-12T09:32:25Z
publishDate 2011-03-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj.art-f8eaf52d4a4c43a4a2863bd73e60a5ec2022-12-22T00:28:50ZengMDPI AGAlgorithms1999-48932011-03-0141617410.3390/a4010061Compressed Matching in DictionariesShmuel T. KleinDana ShapiraThe problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are often preferable to earlier ones, in particular for small files which are typical for dictionaries, since these are usually kept in small chunks.http://www.mdpi.com/1999-4893/4/1/61/dictionariesIR systemspattern matchingcompressed matchingHuffman codesFibonacci codes
spellingShingle Shmuel T. Klein
Dana Shapira
Compressed Matching in Dictionaries
Algorithms
dictionaries
IR systems
pattern matching
compressed matching
Huffman codes
Fibonacci codes
title Compressed Matching in Dictionaries
title_full Compressed Matching in Dictionaries
title_fullStr Compressed Matching in Dictionaries
title_full_unstemmed Compressed Matching in Dictionaries
title_short Compressed Matching in Dictionaries
title_sort compressed matching in dictionaries
topic dictionaries
IR systems
pattern matching
compressed matching
Huffman codes
Fibonacci codes
url http://www.mdpi.com/1999-4893/4/1/61/
work_keys_str_mv AT shmueltklein compressedmatchingindictionaries
AT danashapira compressedmatchingindictionaries