# Item

ITEM ACTIONSEXPORT

Released

Conference Paper

#### Online Checkpointing with Improved Worst-case Guarantees

##### External Resource

https://rdcu.be/dx1Pb

(Publisher version)

##### Fulltext (restricted access)

There are currently no full texts shared for your IP range.

##### Fulltext (public)

There are no public fulltexts stored in PuRe

##### Supplementary Material (public)

There is no public supplementary material available

##### Citation

Bringmann, K., Doerr, B., Neumann, A., & Sliacan, J. (2013). Online Checkpointing
with Improved Worst-case Guarantees. In F. V. Fomin, R. Freivalds, M. Kwiatkowska, & D. Peleg (*Automata, Languages, and Programming* (pp. 255-266). Berlin: Springer. doi:10.1007/978-3-642-39206-1_22.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0015-3B60-F

##### Abstract

In the online checkpointing problem, the task is to continuously maintain a

set of k checkpoints that allow to rewind an ongoing computation faster than

by a full restart. The only operation allowed is to replace an old checkpoint

by the current state. Our aim are checkpoint placement strategies that minimize

rewinding cost, i.e., such that at all times T when requested to rewind to

some time t ≤ T the number of computation steps that need to be redone to

get to t from a checkpoint before t is as small as possible. In particular,

we want that the closest checkpoint earlier than t is not further away from

t than q_k times the ideal distance T / (k+1), where q_k is a small

constant.

Improving over earlier work showing 1 + 1/k ≤ q_k ≤ 2, we show that

q_k can be chosen asymptotically less than 2. We present algorithms with

asymptotic discrepancy q_k ≤ 1.59 + o(1) valid for all k and q_k ≤

\ln(4) + o(1) ≤ 1.39 + o(1) valid for k being a power of two. Experiments

indicate the uniform bound p_k ≤ 1.7 for all k. For small k, we show

how to use a linear programming approach to compute good checkpointing

algorithms. This gives discrepancies of less than 1.55 for all k < 60.

We prove the first lower bound that is asymptotically more than one, namely

q_k ≥ 1.30 - o(1). We also show that optimal algorithms (yielding the

infimum discrepancy) exist for all~k.

set of k checkpoints that allow to rewind an ongoing computation faster than

by a full restart. The only operation allowed is to replace an old checkpoint

by the current state. Our aim are checkpoint placement strategies that minimize

rewinding cost, i.e., such that at all times T when requested to rewind to

some time t ≤ T the number of computation steps that need to be redone to

get to t from a checkpoint before t is as small as possible. In particular,

we want that the closest checkpoint earlier than t is not further away from

t than q_k times the ideal distance T / (k+1), where q_k is a small

constant.

Improving over earlier work showing 1 + 1/k ≤ q_k ≤ 2, we show that

q_k can be chosen asymptotically less than 2. We present algorithms with

asymptotic discrepancy q_k ≤ 1.59 + o(1) valid for all k and q_k ≤

\ln(4) + o(1) ≤ 1.39 + o(1) valid for k being a power of two. Experiments

indicate the uniform bound p_k ≤ 1.7 for all k. For small k, we show

how to use a linear programming approach to compute good checkpointing

algorithms. This gives discrepancies of less than 1.55 for all k < 60.

We prove the first lower bound that is asymptotically more than one, namely

q_k ≥ 1.30 - o(1). We also show that optimal algorithms (yielding the

infimum discrepancy) exist for all~k.