Approximation to the mean curve in the LCS problem

The problem of sequence comparison via optimal alignments occurs naturally in many areas of applications. The simplest such technique is based on evaluating a score given by the length of a longest common subsequence divided by the average length of the original sequences. In this paper we investiga...

Full description

Bibliographic Details
Main Authors: Durringer, C, Hauser, R, Matzinger, H
Format: Report
Published: Unspecified 2006
_version_ 1826257884274491392
author Durringer, C
Hauser, R
Matzinger, H
author_facet Durringer, C
Hauser, R
Matzinger, H
author_sort Durringer, C
collection OXFORD
description The problem of sequence comparison via optimal alignments occurs naturally in many areas of applications. The simplest such technique is based on evaluating a score given by the length of a longest common subsequence divided by the average length of the original sequences. In this paper we investigate the expected value of this score when the input sequences are random and their length tends to infinity. The corresponding limit exists but is not known precisely. We derive a large-deviation, convex analysis and Montecarlo based method to compute a consistent sequence of upper bounds on the unknown limit. Raphael Hauser was supported through grant NAL/00720/G from the Nuffield Foundation and through grant GR/M30975 from the Engineering and Physical Sciences Research Council of the UK. All three authors also acknowledge the generous support through the Sonderforschungsbereich Grant SFB 701 A3 from the German Research Foundation.
first_indexed 2024-03-06T18:25:14Z
format Report
id oxford-uuid:07b98b3f-6948-4014-9dc3-9552fd050896
institution University of Oxford
last_indexed 2024-03-06T18:25:14Z
publishDate 2006
publisher Unspecified
record_format dspace
spelling oxford-uuid:07b98b3f-6948-4014-9dc3-9552fd0508962022-03-26T09:09:06ZApproximation to the mean curve in the LCS problemReporthttp://purl.org/coar/resource_type/c_93fcuuid:07b98b3f-6948-4014-9dc3-9552fd050896Mathematical Institute - ePrintsUnspecified2006Durringer, CHauser, RMatzinger, HThe problem of sequence comparison via optimal alignments occurs naturally in many areas of applications. The simplest such technique is based on evaluating a score given by the length of a longest common subsequence divided by the average length of the original sequences. In this paper we investigate the expected value of this score when the input sequences are random and their length tends to infinity. The corresponding limit exists but is not known precisely. We derive a large-deviation, convex analysis and Montecarlo based method to compute a consistent sequence of upper bounds on the unknown limit. Raphael Hauser was supported through grant NAL/00720/G from the Nuffield Foundation and through grant GR/M30975 from the Engineering and Physical Sciences Research Council of the UK. All three authors also acknowledge the generous support through the Sonderforschungsbereich Grant SFB 701 A3 from the German Research Foundation.
spellingShingle Durringer, C
Hauser, R
Matzinger, H
Approximation to the mean curve in the LCS problem
title Approximation to the mean curve in the LCS problem
title_full Approximation to the mean curve in the LCS problem
title_fullStr Approximation to the mean curve in the LCS problem
title_full_unstemmed Approximation to the mean curve in the LCS problem
title_short Approximation to the mean curve in the LCS problem
title_sort approximation to the mean curve in the lcs problem
work_keys_str_mv AT durringerc approximationtothemeancurveinthelcsproblem
AT hauserr approximationtothemeancurveinthelcsproblem
AT matzingerh approximationtothemeancurveinthelcsproblem