Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)

The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task,...

Full description

Bibliographic Details
Main Authors: Backurs, Arturs, Indyk, Piotr
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language:en_US
Published: Association for Computing Machinery 2018
Online Access:http://hdl.handle.net/1721.1/113874
https://orcid.org/0000-0001-7546-6313
https://orcid.org/0000-0002-7983-9524
_version_ 1811084153081298944
author Backurs, Arturs
Indyk, Piotr
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Backurs, Arturs
Indyk, Piotr
author_sort Backurs, Arturs
collection MIT
description The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming. Unfortunately, all known algorithms for this problem run in nearly quadratic time. In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be {tight}. Specifically, we show that, if the edit distance can be computed in time O(n[superscript 2-δ]) for some constant δ>0, then the satisfiability of conjunctive normal form formulas with N variables and M clauses can be solved in time M[superscript O(1)] 2[superscript (1-ε)N] for a constant ε>0. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist
first_indexed 2024-09-23T12:45:46Z
format Article
id mit-1721.1/113874
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T12:45:46Z
publishDate 2018
publisher Association for Computing Machinery
record_format dspace
spelling mit-1721.1/1138742022-10-01T10:56:37Z Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) Backurs, Arturs Indyk, Piotr Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Backurs, Arturs Indyk, Piotr The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming. Unfortunately, all known algorithms for this problem run in nearly quadratic time. In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be {tight}. Specifically, we show that, if the edit distance can be computed in time O(n[superscript 2-δ]) for some constant δ>0, then the satisfiability of conjunctive normal form formulas with N variables and M clauses can be solved in time M[superscript O(1)] 2[superscript (1-ε)N] for a constant ε>0. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist National Science Foundation (U.S.) IBM (PhD Felllowship) Center for Massive Data Algorithmics (MADALGO) Simons Foundation (Investigator Award) 2018-02-22T19:57:10Z 2018-02-22T19:57:10Z 2015-06 Article http://purl.org/eprint/type/ConferencePaper 978-1-4503-3536-2 http://hdl.handle.net/1721.1/113874 Backurs, Arturs, and Piotr Indyk. “Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH Is False).” Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC ’15 (2015), 14-17 June, 2015, Portland, Oregon, Association of Computing Machinery, 2015, pp. 51-58. https://orcid.org/0000-0001-7546-6313 https://orcid.org/0000-0002-7983-9524 en_US http://dx.doi.org/10.1145/2746539.2746612 Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC '15 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery arXiv
spellingShingle Backurs, Arturs
Indyk, Piotr
Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title_full Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title_fullStr Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title_full_unstemmed Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title_short Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
title_sort edit distance cannot be computed in strongly subquadratic time unless seth is false
url http://hdl.handle.net/1721.1/113874
https://orcid.org/0000-0001-7546-6313
https://orcid.org/0000-0002-7983-9524
work_keys_str_mv AT backursarturs editdistancecannotbecomputedinstronglysubquadratictimeunlesssethisfalse
AT indykpiotr editdistancecannotbecomputedinstronglysubquadratictimeunlesssethisfalse