  Temporal Search in Web Archives

Berberich, K. (2010). Temporal Search in Web Archives. PhD Thesis, Universität des Saarlandes, Saarbrücken. Retrieved from http://scidok.sulb.uni-saarland.de/volltexte/2010/3281/.

Files

phd-thesis-final.pdf (Any fulltext), 4MB
Name:
phd-thesis-final.pdf
Description:
-
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Locators

Description:
-
Locator:
http://scidok.sulb.uni-saarland.de/doku/lic_ohne_pod.php?la=de (Copyright transfer agreement)
Description:
-

Creators

 Creators:
Berberich, Klaus (1, 2), Author
Weikum, Gerhard (1), Advisor
Seeger, Bernhard (3), Referee
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
(2) International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551
(3) External Organizations, ou_persistent22

Content

Free keywords: -
 Abstract: Web archives include both archives of content originally published on the Web (e.g., the Internet Archive) and archives of content published long ago that is now accessible on the Web (e.g., the archive of The Times). Thanks to the increased awareness that web-born content is worth preserving and to improved digitization techniques, web archives have grown in number and size. To realize their full potential, search techniques are needed that take their special characteristics into account. This work addresses three important problems toward this objective and makes the following contributions:
* We present the Time-Travel Inverted indeX (TTIX) as an efficient solution to time-travel text search in web archives, allowing users to search only the parts of the web archive that existed at a user-specified time of interest.
* To counter the negative effects that terminology evolution has on the quality of search results in web archives, we propose a novel query-reformulation technique, so that old but highly relevant documents are retrieved in response to today's queries.
* For temporal information needs, where the user is best satisfied by documents that refer to particular times, we describe a retrieval model that integrates temporal expressions (e.g., "in the 1990s") seamlessly into a language modeling approach.
Experiments for each of the proposed methods demonstrate their efficiency and effectiveness, and show the viability of our approach to search in web archives.
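
To make the idea of time-travel text search concrete, the following is a minimal sketch, not the TTIX data structure from the thesis. It assumes that each posting carries a validity interval [begin, end) and that versions of the same document never overlap in time; the names TimeTravelIndex, Posting, add_version, and search are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Posting:
    doc_id: str
    begin: float  # time at which this document version appeared (inclusive)
    end: float    # time at which it was superseded (exclusive); inf if still current


class TimeTravelIndex:
    """Toy inverted index whose postings carry validity intervals (illustrative only)."""

    def __init__(self):
        self._postings = defaultdict(list)  # term -> list of Posting

    def add_version(self, doc_id, text, begin, end=math.inf):
        # One posting per distinct term of this document version.
        for term in set(text.lower().split()):
            self._postings[term].append(Posting(doc_id, begin, end))

    def search(self, query, at):
        """Return ids of documents whose version valid at time `at` contains all query terms."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = None
        for term in terms:
            # Keep only postings whose validity interval covers the time of interest.
            docs = {p.doc_id for p in self._postings.get(term, ())
                    if p.begin <= at < p.end}
            result = docs if result is None else result & docs
        return result


index = TimeTravelIndex()
index.add_version("page1", "olympic games athens", begin=2004, end=2008)
index.add_version("page1", "olympic games beijing", begin=2008)
index.add_version("page2", "beijing travel guide", begin=2006)

print(index.search("olympic beijing", at=2005))  # no version matched both terms in 2005
print(index.search("olympic beijing", at=2009))  # page1's later version matches
```

Filtering postings by interval at query time is the simplest possible design; it scans every posting of a term regardless of the time of interest, which is the kind of cost an index built for this purpose would aim to avoid.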

Details

Language(s): eng - English
 Dates: 2011-02-04, 2010-07-19, 2010, 2010
 Publication Status: Published in print
 Pages: -
 Publishing info: Saarbrücken : Universität des Saarlandes
 Table of Contents: -
 Rev. Type: -
 Identifiers: eDoc: 536373
URI: http://scidok.sulb.uni-saarland.de/volltexte/2010/3281/
Other: Local-ID: C1256DBF005F876D-05A4D1CFDEB5957FC125776E002452A5-Berberich2010
 Degree: PhD
