Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements 
over Decompress-And-Solve

Abboud, Amir; Backurs, Arturs; Bringmann, Karl; Künnemann, Marvin

Abboud, A., Backurs, A., Bringmann, K., & Künnemann, M. (2018). Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements over Decompress-And-Solve. Retrieved from http://arxiv.org/abs/1803.00796.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0001-3E38-C
Version Permalink: https://hdl.handle.net/21.11116/0000-0001-3E39-B

Genre: Paper

Files

arXiv:1803.00796.pdf (Preprint), 931KB

File Permalink:
https://hdl.handle.net/21.11116/0000-0001-3E3A-A

Name:
arXiv:1803.00796.pdf

Description:
File downloaded from arXiv at 2018-05-03 10:43. Presented at FOCS'17. Full version. 63 pages.

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/help/license

Creators

Creators:
Abboud, Amir¹, Author
Backurs, Arturs¹, Author
Bringmann, Karl², Author
Künnemann, Marvin², Author

Affiliations:
¹ External Organizations, ou_persistent22
² Algorithms and Complexity, MPI for Informatics, Max Planck Society, ou_24019

Content

Free keywords: Computer Science, Computational Complexity, cs.CC; Computer Science, Data Structures and Algorithms, cs.DS

Abstract: Can we analyze data without decompressing it? As our data keeps growing, understanding the time complexity of problems on compressed inputs, rather than in convenient uncompressed forms, becomes more and more relevant. Suppose we are given a compression of size $n$ of data that originally has size $N$, and we want to solve a problem with time complexity $T(\cdot)$. The naive strategy of "decompress-and-solve" gives time $T(N)$, whereas "the gold standard" is time $T(n)$: to analyze the compression as efficiently as if the original data was small.

We restrict our attention to data in the form of a string (text, files, genomes, etc.) and study the most ubiquitous tasks. While the challenge might seem to depend heavily on the specific compression scheme, most methods of practical relevance (Lempel-Ziv-family, dictionary methods, and others) can be unified under the elegant notion of Grammar Compressions. A vast literature, across many disciplines, established this as an influential notion for Algorithm design.

We introduce a framework for proving (conditional) lower bounds in this field, allowing us to assess whether decompress-and-solve can be improved, and by how much. Our main results are:

- The $O(nN\sqrt{\log{N/n}})$ bound for LCS and the $O(\min\{N \log N, nM\})$ bound for Pattern Matching with Wildcards are optimal up to $N^{o(1)}$ factors, under the Strong Exponential Time Hypothesis. (Here, $M$ denotes the uncompressed length of the compressed pattern.)
- Decompress-and-solve is essentially optimal for Context-Free Grammar Parsing and RNA Folding, under the $k$-Clique conjecture.
- We give an algorithm showing that decompress-and-solve is not optimal for Disjointness.
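As an illustration of the gap between $n$ and $N$ described in the abstract, the following minimal sketch models a grammar compression as a straight-line program, where each rule is either a single character or the concatenation of two earlier rules. It is not taken from the paper: the function names and the Fibonacci-style example grammar are assumptions chosen for illustration, and the task shown (counting a character) is a deliberately easy one for which working directly on the grammar takes time proportional to $n$ rather than $N$.

```python
from typing import Dict, Optional, Tuple, Union

# A rule is a terminal character or a pair of rule names (their concatenation).
Rule = Union[str, Tuple[str, str]]
Grammar = Dict[str, Rule]


def fibonacci_like_grammar(k: int) -> Grammar:
    """Hypothetical example grammar: X0 -> 'b', X1 -> 'a', Xi -> X(i-1) X(i-2).

    It has n = k + 1 rules, but Xk expands to a string whose length N
    grows exponentially in k.
    """
    grammar: Grammar = {"X0": "b", "X1": "a"}
    for i in range(2, k + 1):
        grammar[f"X{i}"] = (f"X{i - 1}", f"X{i - 2}")
    return grammar


def decompress(grammar: Grammar, symbol: str) -> str:
    """Decompress-and-solve, step one: materialise the full string (size N)."""
    rule = grammar[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return decompress(grammar, left) + decompress(grammar, right)


def expanded_length(grammar: Grammar, symbol: str,
                    memo: Optional[Dict[str, int]] = None) -> int:
    """Compute N from the grammar alone, visiting each rule once."""
    if memo is None:
        memo = {}
    if symbol not in memo:
        rule = grammar[symbol]
        if isinstance(rule, str):
            memo[symbol] = len(rule)
        else:
            left, right = rule
            memo[symbol] = (expanded_length(grammar, left, memo)
                            + expanded_length(grammar, right, memo))
    return memo[symbol]


def count_char(grammar: Grammar, symbol: str, ch: str,
               memo: Optional[Dict[str, int]] = None) -> int:
    """Count occurrences of ch in the expansion without decompressing."""
    if memo is None:
        memo = {}
    if symbol not in memo:
        rule = grammar[symbol]
        if isinstance(rule, str):
            memo[symbol] = 1 if rule == ch else 0
        else:
            left, right = rule
            memo[symbol] = (count_char(grammar, left, ch, memo)
                            + count_char(grammar, right, ch, memo))
    return memo[symbol]


if __name__ == "__main__":
    g = fibonacci_like_grammar(20)
    print("rules (n):", len(g))
    print("expanded length (N):", expanded_length(g, "X20"))
    print("count of 'a' via grammar:", count_char(g, "X20", "a"))
    print("count of 'a' via decompress:", decompress(g, "X20").count("a"))
```

Both counting routes return the same answer, but the grammar-based one touches each of the $n$ rules once, while the decompress-based one first builds all $N$ characters. The results listed in the abstract concern harder problems, where the achievable running time lies between these two extremes or cannot beat decompress-and-solve at all.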

Details

Language(s): eng - English

Dates: Created: 2018-03-02; Published Online: 2018

Publication Status: Published online

Pages: 63 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 1803.00796
URI: http://arxiv.org/abs/1803.00796
BibTex Citekey: Abboud_arXiv1803.00796

Degree: -
