Matching with mismatches and assorted applications

<p>This thesis consists of three parts, each of independent interest, yet tied together by the problem of matching with mismatches. In the first chapter, we present a motivated exposition of a new randomized algorithm for indexed matching with mismatches which, for constant error (substitution...

Full description

Bibliographic Details
Main Author: Percival, C
Other Authors: McColl, W
Format: Thesis
Language:English
Published: 2006
Subjects:
_version_ 1797067861646114816
author Percival, C
author2 McColl, W
author_facet McColl, W
Percival, C
author_sort Percival, C
collection OXFORD
description <p>This thesis consists of three parts, each of independent interest, yet tied together by the problem of matching with mismatches. In the first chapter, we present a motivated exposition of a new randomized algorithm for indexed matching with mismatches which, for constant error (substitution) rates, locates a substring of length m within a string of length n faster than existing algorithms by a factor of O(m/ log(n)).</p><p>The second chapter turns from this theoretical problem to an entirely practical concern: delta compression of executable code. In contrast to earlier work which has either generated very large deltas when applied to executable code, or has generated small deltas by utilizing platform and processor-specific knowledge, we present a naïve approach — that is, one which does not rely upon any external knowledge — which nevertheless constructs deltas of size comparable to those produced by a platformspecific approach. In the course of this construction, we utilize the result from the first chapter, although it is of primary utility only when producing deltas between very similar executables.</p><p> The third chapter lies between the horn and ivory gates, being both highly interesting from a theoretical viewpoint and of great practical value. Using the algorithm for matching with mismatches from the first chapter, combined with error correcting codes, we give a practical algorithm for “universal” delta compression (often called “feedback-free file synchronization”) which can operate in the presence of multiple indels and a large number of substitutions.</p>
first_indexed 2024-03-06T22:02:27Z
format Thesis
id oxford-uuid:4f0d53cc-fb9f-4246-a835-3c8734eba735
institution University of Oxford
language English
last_indexed 2024-03-06T22:02:27Z
publishDate 2006
record_format dspace
spelling oxford-uuid:4f0d53cc-fb9f-4246-a835-3c8734eba7352022-03-26T16:04:48ZMatching with mismatches and assorted applicationsThesishttp://purl.org/coar/resource_type/c_db06uuid:4f0d53cc-fb9f-4246-a835-3c8734eba735Computer science (mathematics)EnglishOxford University Research Archive - Valet2006Percival, CMcColl, WBrent, R<p>This thesis consists of three parts, each of independent interest, yet tied together by the problem of matching with mismatches. In the first chapter, we present a motivated exposition of a new randomized algorithm for indexed matching with mismatches which, for constant error (substitution) rates, locates a substring of length m within a string of length n faster than existing algorithms by a factor of O(m/ log(n)).</p><p>The second chapter turns from this theoretical problem to an entirely practical concern: delta compression of executable code. In contrast to earlier work which has either generated very large deltas when applied to executable code, or has generated small deltas by utilizing platform and processor-specific knowledge, we present a naïve approach — that is, one which does not rely upon any external knowledge — which nevertheless constructs deltas of size comparable to those produced by a platformspecific approach. In the course of this construction, we utilize the result from the first chapter, although it is of primary utility only when producing deltas between very similar executables.</p><p> The third chapter lies between the horn and ivory gates, being both highly interesting from a theoretical viewpoint and of great practical value. Using the algorithm for matching with mismatches from the first chapter, combined with error correcting codes, we give a practical algorithm for “universal” delta compression (often called “feedback-free file synchronization”) which can operate in the presence of multiple indels and a large number of substitutions.</p>
spellingShingle Computer science (mathematics)
Percival, C
Matching with mismatches and assorted applications
title Matching with mismatches and assorted applications
title_full Matching with mismatches and assorted applications
title_fullStr Matching with mismatches and assorted applications
title_full_unstemmed Matching with mismatches and assorted applications
title_short Matching with mismatches and assorted applications
title_sort matching with mismatches and assorted applications
topic Computer science (mathematics)
work_keys_str_mv AT percivalc matchingwithmismatchesandassortedapplications