  Global Document Frequency Estimation in Peer-to-Peer Web Search

Bender, M., Michel, S., Triantafillou, P., & Weikum, G. (2006). Global Document Frequency Estimation in Peer-to-Peer Web Search. In 9th International Workshop on the Web and Databases (WebDB 2006) @ SIGMOD2006 (pp. 69-74).

Files

WebDB06.pdf (Any fulltext), 216KB
 
Name: WebDB06.pdf
Visibility: Private
MIME-Type: application/pdf

Creators

Bender, Matthias (1), Author
Michel, Sebastian (1), Author
Triantafillou, Peter (1), Author
Weikum, Gerhard (1), Author
Zhou, Dayou, Editor
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

 Abstract: Information retrieval (IR) in peer-to-peer (P2P) networks, where the corpus is spread across many loosely coupled peers, has recently gained importance. In contrast to IR systems on a centralized server or server farm, P2P IR faces the additional challenge of either being oblivious to global corpus statistics or having to compute the global measures from local statistics at the individual peers in an efficient, distributed manner. One specific measure of interest is the global document frequency for different terms, which would be very beneficial as term-specific weights in the scoring and ranking of merged search results that have been obtained from different peers. This paper presents an efficient solution for the problem of estimating global document frequencies in a large-scale P2P network with very high dynamics where peers can join and leave the network on short notice. In particular, the developed method takes into account the fact that the local document collections of autonomous peers may arbitrarily overlap, so that global counting needs to be duplicate-insensitive. The method is based on hash sketches as a technique for compact data synopses. Experimental studies demonstrate the estimator's accuracy, scalability, and ability to cope with high dynamics. Moreover, the benefit for ranking P2P search results is shown by experiments with real-world Web data and queries.
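
The duplicate-insensitive counting that the abstract describes can be illustrated with a small sketch. It assumes Flajolet-Martin style bitmaps with stochastic averaging, which is one common form of hash sketch; the abstract does not spell out the paper's exact construction, and every name and constant below (`make_sketch`, `merge`, `estimate`, `NUM_BITMAPS`, the choice of SHA-1) is hypothetical. Each peer summarizes the documents that contain a given term, and the summaries are combined with a bitwise OR, so a document held by several peers is counted only once.

```python
import hashlib

PHI = 0.77351       # Flajolet-Martin bias-correction constant
NUM_BITMAPS = 64    # assumed number of bitmaps used for averaging
BITMAP_BITS = 32    # assumed width of each bitmap


def _hash(doc_id: str) -> int:
    """Map a document identifier to a 64-bit pseudo-random integer."""
    digest = hashlib.sha1(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def _rho(value: int) -> int:
    """Index of the least-significant 1-bit, capped at the bitmap width."""
    if value == 0:
        return BITMAP_BITS - 1
    return min((value & -value).bit_length() - 1, BITMAP_BITS - 1)


def make_sketch(doc_ids):
    """Build a sketch from the ids of one peer's documents for one term."""
    bitmaps = [0] * NUM_BITMAPS
    for doc_id in doc_ids:
        h = _hash(doc_id)
        bucket = h % NUM_BITMAPS                    # pick one bitmap
        bitmaps[bucket] |= 1 << _rho(h // NUM_BITMAPS)
    return bitmaps


def merge(a, b):
    """Combine two sketches; the result summarizes the union of both sets."""
    return [x | y for x, y in zip(a, b)]


def _lowest_zero(bitmap: int) -> int:
    """Index of the lowest bit that is still 0."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return r


def estimate(bitmaps) -> float:
    """Estimate the number of distinct ids summarized by the sketch."""
    mean_r = sum(_lowest_zero(b) for b in bitmaps) / len(bitmaps)
    return len(bitmaps) / PHI * 2 ** mean_r


# Two peers whose collections overlap in documents 4000..5999.
peer_a = make_sketch(f"doc{i}" for i in range(0, 6000))
peer_b = make_sketch(f"doc{i}" for i in range(4000, 10000))
merged = merge(peer_a, peer_b)
print(round(estimate(merged)))  # approximates the 10000 distinct documents
```

Because merging is a bitwise OR, the merged sketch is identical to the sketch a single node would build from the union of both collections, regardless of how much the peers overlap. The estimate is only meaningful when the number of distinct documents is well above the number of bitmaps; this simple estimator overestimates very small sets.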

Details

Language(s): eng - English
 Dates: 2007-04-27; 2006
 Publication Status: Issued
 Identifiers: eDoc: 314463
Other: Local-ID: C1256DBF005F876D-1074E8517E3FAF65C12571B8004C7FA1-WebDB06

Event

Title: 9th International Workshop on the Web and Databases (WebDB 2006) @ SIGMOD2006
Place of Event: Chicago, USA
Start-/End Date: 2006-05-30

Source 1

Title: 9th International Workshop on the Web and Databases (WebDB 2006) @ SIGMOD2006
Source Genre: Proceedings
Start / End Page: 69 - 74