ausblenden:
Schlagwörter:
-
Zusammenfassung:
Distributed top-k query processing is increasingly becoming an essential
functionality in a large number of emerging application classes. This paper
addresses the efficient algebraic optimization of top-k queries in wide-area
distributed data repositories where the index lists for the attribute values
(or text terms) of a query are distributed across a number of data peers and
the computational costs include network latency, bandwidth consumption, and
local peer work. We use a dynamic programming approach to find the optimal
execution plan using compact data synopses for selectivity estimation that is
the basis for our cost model. The optimized query is executed in a hierarchical
way involving a small and fixed number of communication phases. We have
performed experiments on real web data that show the benefits of distributed
top-k query optimization both in network resource consumption and query
response time.