EverLast: A Distributed Architecture for Preserving the Web

Anand, Avishek; Bedathur, Srikanta; Berberich, Klaus; Schenkel, Ralf; Tryfonopoulos, Christos

Anand, A., Bedathur, S., Berberich, K., Schenkel, R., & Tryfonopoulos, C. (2009). EverLast: A Distributed Architecture for Preserving the Web. In Proceedings of the Joint Conference on Digital Libraries (pp. 331-340). New York, NY: ACM.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/11858/00-001M-0000-000F-1910-0 Version Permalink: https://hdl.handle.net/11858/00-001M-0000-0024-3994-1

Genre: Conference Paper

Latex : {EverLast}: A Distributed Architecture for Preserving the Web

Creators

Creators:
Anand, Avishek¹, Author
Bedathur, Srikanta¹, Author
Berberich, Klaus¹, Author
Schenkel, Ralf¹, Author
Tryfonopoulos, Christos¹, Author

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: -

Abstract: The World Wide Web has become a key source of knowledge pertaining to almost every walk of life. Unfortunately, much of the data on the Web is highly ephemeral, with an estimated 50-80% of content changing within a short time. Continuing the pioneering efforts of many national (digital) libraries, organizations such as the International Internet Preservation Consortium (IIPC), the Internet Archive (IA) and the European Archive (EA) have been working tirelessly towards preserving the ever-changing Web. However, while these web archiving efforts have paid significant attention to long-term preservation of Web data, they have paid little attention to developing a global-scale infrastructure for collecting, archiving, and performing historical analyses on the collected data. Based on insights from our recent work on building text analytics for Web archives, we propose EverLast, a scalable distributed framework for next-generation Web archival and temporal text analytics over the archive. Our system is built on a loosely coupled distributed architecture that can be deployed over large-scale peer-to-peer networks. In this way, we allow the integration of many archival efforts undertaken mainly at a national level by national digital libraries. Key features of EverLast include support for time-based text search and analysis and the use of human-assisted archive gathering. In this paper, we outline the overall architecture of EverLast and present some promising preliminary results.

Details

Language(s): eng - English

Dates: Published Online: 2009; Date issued: 2009

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: eDoc: 520413
Other: Local-ID: C1256DBF005F876D-8109AA88947D8367C1257574003AD4DC-AnandBBST09

Degree: -

Event

Title: 2009 Conference on Digital Libraries

Place of Event: Austin, Texas

Start-/End Date: 2009-06-15 - 2009-06-19

Source 1

Title: Proceedings of the Joint Conference on Digital Libraries

Abbreviation : JCDL 2009

Source Genre: Proceedings

Creator(s):

Affiliations:

Publ. Info: New York, NY : ACM

Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 331 - 340
Identifier: ISBN: 978-1-60558-322-8