Free keywords: -
Abstract:
This paper considers replication strategies for storage systems that aggregate
the disks of many nodes spread over the Internet. Maintaining replication in
such systems can be prohibitively expensive, since every transient network or
host failure could potentially lead to copying a server's worth of data over
the Internet to maintain replication levels.
The following insights into designing an efficient replication algorithm emerge
from the paper's analysis. First, durability can be provided separately from
availability; the former is less expensive to ensure and a more useful goal for
many wide-area applications. Second, the focus of a durability algorithm must
be to create new copies of data objects faster than permanent disk failures
destroy the objects; a careful choice of policies for which nodes should hold
which data can decrease repair time. Third, increasing the number of replicas of each
data object does not help a system tolerate a higher disk failure probability,
but does help tolerate bursts of failures. Finally, ensuring that the system
makes use of replicas that recover after temporary failure is critical to
efficiency.
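
To make the combined effect of these insights concrete, the following Python fragment is a minimal sketch of a maintenance loop under assumed names and parameters (Replica, TARGET_REPLICAS, and choose_new_node are illustrative, not identifiers from the paper): it counts every reachable replica, including those that have returned from temporary failure, and creates a new copy only when the count drops below the target.

```python
# Minimal sketch of a durability-oriented maintenance loop; all names are
# illustrative assumptions, not the paper's API.
from dataclasses import dataclass

TARGET_REPLICAS = 3  # assumed target number of reachable replicas


@dataclass
class Replica:
    node: str
    reachable: bool  # False while the node is in a transient failure


def choose_new_node(all_nodes: list[str], replicas: list[Replica]) -> str:
    """Pick a node that does not yet hold a replica; the placement policy
    affects how quickly repairs can proceed."""
    used = {r.node for r in replicas}
    return next(n for n in all_nodes if n not in used)


def maintain(replicas: list[Replica], all_nodes: list[str]) -> None:
    """Create new copies only when too few replicas are reachable.

    Counting replicas that return after a temporary failure, instead of
    forgetting them, keeps repair traffic low; creating copies faster than
    disks fail permanently is what provides durability.
    """
    reachable = sum(1 for r in replicas if r.reachable)
    for _ in range(TARGET_REPLICAS - reachable):
        node = choose_new_node(all_nodes, replicas)
        # A real system would transfer a copy of the object to `node` here.
        replicas.append(Replica(node=node, reachable=True))
```

In this sketch, a replica that recovers from a transient failure simply becomes reachable again and counts toward the target, so no data is copied on its behalf.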
Based on these insights, the paper proposes the Carbonite replication algorithm
for keeping data durable at a low cost. A simulation of Carbonite storing 1 TB
of data over a 365-day trace of PlanetLab activity shows that Carbonite is able
to keep all data durable and uses 44% more network traffic than a hypothetical
system that only responds to permanent failures. In comparison, Total Recall
and DHash require almost a factor of two more network traffic than this
hypothetical system.