Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)

Bibliographic Details
Main Authors: Backurs, Arturs; Indyk, Piotr
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Association for Computing Machinery 2018
Online Access: http://hdl.handle.net/1721.1/113874
https://orcid.org/0000-0001-7546-6313
https://orcid.org/0000-0002-7983-9524
Description
Summary: The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming. Unfortunately, all known algorithms for this problem run in nearly quadratic time. In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be tight. Specifically, we show that, if the edit distance can be computed in time $O(n^{2-\delta})$ for some constant $\delta > 0$, then the satisfiability of conjunctive normal form formulas with $N$ variables and $M$ clauses can be solved in time $M^{O(1)} 2^{(1-\varepsilon)N}$ for a constant $\varepsilon > 0$. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist.
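
For readers unfamiliar with the dynamic-programming algorithm the summary refers to, a minimal sketch of the textbook quadratic-time recurrence follows. It is an illustration only and is not taken from the paper; the function name `edit_distance` and the two-row layout are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the textbook O(len(a) * len(b)) dynamic program."""
    # prev[j] = distance between the first i-1 symbols of a and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i symbols of a into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute ca by cb, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

The nested loops fill one table cell per pair of positions, which is where the quadratic running time comes from. The paper's result suggests that improving this to $O(n^{2-\delta})$ would refute the Strong Exponential Time Hypothesis.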