Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)
The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task,...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Association for Computing Machinery
2018
|
Online Access: | http://hdl.handle.net/1721.1/113874 https://orcid.org/0000-0001-7546-6313 https://orcid.org/0000-0002-7983-9524 |
_version_ | 1811084153081298944 |
---|---|
author | Backurs, Arturs Indyk, Piotr |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Backurs, Arturs Indyk, Piotr |
author_sort | Backurs, Arturs |
collection | MIT |
description | The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming. Unfortunately, all known algorithms for this problem run in nearly quadratic time.
In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be {tight}. Specifically, we show that, if the edit distance can be computed in time O(n[superscript 2-δ]) for some constant δ>0, then the satisfiability of conjunctive normal form formulas with N variables and M clauses can be solved in time M[superscript O(1)] 2[superscript (1-ε)N] for a constant ε>0. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist |
first_indexed | 2024-09-23T12:45:46Z |
format | Article |
id | mit-1721.1/113874 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T12:45:46Z |
publishDate | 2018 |
publisher | Association for Computing Machinery |
record_format | dspace |
spelling | mit-1721.1/1138742022-10-01T10:56:37Z Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) Backurs, Arturs Indyk, Piotr Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Backurs, Arturs Indyk, Piotr The edit distance (a.k.a. the Levenshtein distance) between two strings is defined as the minimum number of insertions, deletions or substitutions of symbols needed to transform one string into another. The problem of computing the edit distance between two strings is a classical computational task, with a well-known algorithm based on dynamic programming. Unfortunately, all known algorithms for this problem run in nearly quadratic time. In this paper we provide evidence that the near-quadratic running time bounds known for the problem of computing edit distance might be {tight}. Specifically, we show that, if the edit distance can be computed in time O(n[superscript 2-δ]) for some constant δ>0, then the satisfiability of conjunctive normal form formulas with N variables and M clauses can be solved in time M[superscript O(1)] 2[superscript (1-ε)N] for a constant ε>0. The latter result would violate the Strong Exponential Time Hypothesis, which postulates that such algorithms do not exist National Science Foundation (U.S.) IBM (PhD Felllowship) Center for Massive Data Algorithmics (MADALGO) Simons Foundation (Investigator Award) 2018-02-22T19:57:10Z 2018-02-22T19:57:10Z 2015-06 Article http://purl.org/eprint/type/ConferencePaper 978-1-4503-3536-2 http://hdl.handle.net/1721.1/113874 Backurs, Arturs, and Piotr Indyk. “Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH Is False).” Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC ’15 (2015), 14-17 June, 2015, Portland, Oregon, Association of Computing Machinery, 2015, pp. 51-58. https://orcid.org/0000-0001-7546-6313 https://orcid.org/0000-0002-7983-9524 en_US http://dx.doi.org/10.1145/2746539.2746612 Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC '15 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery arXiv |
spellingShingle | Backurs, Arturs Indyk, Piotr Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title | Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title_full | Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title_fullStr | Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title_full_unstemmed | Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title_short | Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false) |
title_sort | edit distance cannot be computed in strongly subquadratic time unless seth is false |
url | http://hdl.handle.net/1721.1/113874 https://orcid.org/0000-0001-7546-6313 https://orcid.org/0000-0002-7983-9524 |
work_keys_str_mv | AT backursarturs editdistancecannotbecomputedinstronglysubquadratictimeunlesssethisfalse AT indykpiotr editdistancecannotbecomputedinstronglysubquadratictimeunlesssethisfalse |