Abstract:
Today's content providers are naturally distributed and produce large amounts
of information every day, making peer-to-peer data management a promising
approach offering scalability, adaptivity to dynamics, and failure resilience.
In such systems, subscribing with a continuous query is as important as
one-time querying since it allows the user to cope with the high rate of
information production and avoid the cognitive overload of repeated searches.
In the information filtering setting, users specify continuous queries, thus
subscribing to newly appearing documents satisfying the query conditions.
Contrary to existing approaches providing exact information filtering
functionality, this doctoral thesis introduces the concept of approximate
information filtering, where users subscribe to only a few selected sources
most likely to satisfy their information demand. This way, efficiency and
scalability are enhanced by trading a small reduction in recall for lower
message traffic.
This thesis contains the following contributions: (i) the first architecture to
support approximate information filtering in structured peer-to-peer networks,
(ii) novel strategies to select the most appropriate publishers by taking into
account correlations among keywords, (iii) a prototype implementation for
approximate information retrieval and filtering, and (iv) a digital library use
case to demonstrate the integration of retrieval and filtering in a unified
system.