hide
Free keywords:
Computer Science, Data Structures and Algorithms, cs.DS
Abstract:
This paper studies the \emph{$\varepsilon$-approximate range emptiness}
problem, where the task is to represent a set $S$ of $n$ points from
$\{0,\ldots,U-1\}$ and answer emptiness queries of the form "$[a ; b]\cap S
\neq \emptyset$ ?" with a probability of \emph{false positives} allowed. This
generalizes the functionality of \emph{Bloom filters} from single point queries
to any interval length $L$. Setting the false positive rate to $\varepsilon/L$
and performing $L$ queries, Bloom filters yield a solution to this problem with
space $O(n \lg(L/\varepsilon))$ bits, false positive probability bounded by
$\varepsilon$ for intervals of length up to $L$, using query time $O(L
\lg(L/\varepsilon))$. Our first contribution is to show that the space/error
trade-off cannot be improved asymptotically: Any data structure for answering
approximate range emptiness queries on intervals of length up to $L$ with false
positive probability $\varepsilon$, must use space $\Omega(n
\lg(L/\varepsilon)) - O(n)$ bits. On the positive side we show that the query
time can be improved greatly, to constant time, while matching our space lower
bound up to a lower order additive term. This result is achieved through a
succinct data structure for (non-approximate 1d) range emptiness/reporting
queries, which may be of independent interest.