Keywords:
-
Abstract:
This paper presents the results of our INEX 2009 Ad-hoc and Efficiency track
experiments. While our scoring model remained almost unchanged compared to
previous years, we focused on a complete redesign of our XML indexing component
to meet the increased need for scalability that came with the new 2009
INEX Wikipedia collection, which is about 10 times larger than the previous
INEX collection. TopX now supports a CAS-specific distributed index structure,
with a completely {\em parallel} execution of all indexing steps, including
parsing, sampling of term statistics for our element-specific BM25 ranking
model, as well as sorting and compressing the index lists for our final
inverted block-index. Overall, TopX ranked among the top 3 systems in both the
Ad-hoc and Efficiency tracks, with a maximum value of 0.61 for iP[0.01] and
0.29 for MAiP in focused retrieval mode in the Ad-hoc track. In the Efficiency
track, our fastest runs achieved average runtimes of 72 ms per CO query and
235 ms per CAS query.