hide
Free keywords:
Computer Science, Distributed, Parallel, and Cluster Computing, cs.DC
Abstract:
We study parallel algorithms for the classical balls-into-bins problem, in
which $m$ balls acting in parallel as separate agents are placed into $n$ bins.
Algorithms operate in synchronous rounds, in each of which balls and bins
exchange messages once. The goal is to minimize the maximal load over all bins
using a small number of rounds and few messages.
While the case of $m=n$ balls has been extensively studied, little is known
about the heavily loaded case. In this work, we consider parallel algorithms
for this somewhat neglected regime of $m\gg n$. The naive solution of
allocating each ball to a bin chosen uniformly and independently at random
results in maximal load $m/n+\Theta(\sqrt{m/n\cdot \log n})$ (for $m\geq n \log
n$) w.h.p. In contrast, for the sequential setting Berenbrink et al (SIAM J.
Comput 2006) showed that letting each ball join the least loaded bin of two
randomly selected bins reduces the maximal load to $m/n+O(\log\log m)$ w.h.p.
To date, no parallel variant of such a result is known.
We present a simple parallel threshold algorithm that obtains a maximal load
of $m/n+O(1)$ w.h.p. within $O(\log\log (m/n)+\log^* n)$ rounds. The algorithm
is symmetric (balls and bins all "look the same"), and balls send $O(1)$
messages in expectation per round. The additive term of $O(\log^* n)$ in the
complexity is known to be tight for such algorithms (Lenzen and Wattenhofer
Distributed Computing 2016). We also prove that our analysis is tight, i.e.,
algorithms of the type we provide must run for $\Omega(\min\{\log\log
(m/n),n\})$ rounds w.h.p.
Finally, we give a simple asymmetric algorithm (i.e., balls are aware of a
common labeling of the bins) that achieves a maximal load of $m/n + O(1)$ in a
constant number of rounds w.h.p. Again, balls send only a single message per
round, and bins receive $(1+o(1))m/n+O(\log n)$ messages w.h.p.