English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
  Faster Pattern Matching under Edit Distance

Charalampopoulos, P., Kociumaka, T., & Wellnitz, P. (2022). Faster Pattern Matching under Edit Distance. Retrieved from https://arxiv.org/abs/2204.03087.

Item is

Files

show Files
hide Files
:
arXiv:2204.03087.pdf (Preprint), 2MB
Name:
arXiv:2204.03087.pdf
Description:
File downloaded from arXiv at 2023-01-04 10:46
OA-Status:
Green
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-

Locators

show

Creators

show
hide
 Creators:
Charalampopoulos, Panagiotis1, Author
Kociumaka, Tomasz2, Author           
Wellnitz, Philip2, Author                 
Affiliations:
1External Organizations, ou_persistent22              
2Algorithms and Complexity, MPI for Informatics, Max Planck Society, ou_24019              

Content

show
hide
Free keywords: Computer Science, Data Structures and Algorithms, cs.DS
 Abstract: We consider the approximate pattern matching problem under the edit distance.
Given a text $T$ of length $n$, a pattern $P$ of length $m$, and a threshold
$k$, the task is to find the starting positions of all substrings of $T$ that
can be transformed to $P$ with at most $k$ edits. More than 20 years ago, Cole
and Hariharan [SODA'98, J. Comput.'02] gave an $\mathcal{O}(n+k^4 \cdot n/
m)$-time algorithm for this classic problem, and this runtime has not been
improved since.
Here, we present an algorithm that runs in time $\mathcal{O}(n+k^{3.5}
\sqrt{\log m \log k} \cdot n/m)$, thus breaking through this long-standing
barrier. In the case where $n^{1/4+\varepsilon} \leq k \leq
n^{2/5-\varepsilon}$ for some arbitrarily small positive constant
$\varepsilon$, our algorithm improves over the state-of-the-art by polynomial
factors: it is polynomially faster than both the algorithm of Cole and
Hariharan and the classic $\mathcal{O}(kn)$-time algorithm of Landau and
Vishkin [STOC'86, J. Algorithms'89].
We observe that the bottleneck case of the alternative $\mathcal{O}(n+k^4
\cdot n/m)$-time algorithm of Charalampopoulos, Kociumaka, and Wellnitz
[FOCS'20] is when the text and the pattern are (almost) periodic. Our new
algorithm reduces this case to a new dynamic problem (Dynamic Puzzle Matching),
which we solve by building on tools developed by Tiskin [SODA'10,
Algorithmica'15] for the so-called seaweed monoid of permutation matrices. Our
algorithm relies only on a small set of primitive operations on strings and
thus also applies to the fully-compressed setting (where text and pattern are
given as straight-line programs) and to the dynamic setting (where we maintain
a collection of strings under creation, splitting, and concatenation),
improving over the state of the art.

Details

show
hide
Language(s): eng - English
 Dates: 2022-04-062022
 Publication Status: Published online
 Pages: 94 p.
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: arXiv: 2204.03087
BibTex Citekey: Charalampopoulos2204.03087
URI: https://arxiv.org/abs/2204.03087
 Degree: -

Event

show

Legal Case

show

Project information

show

Source

show