Efficient Text Proximity Search

Schenkel, Ralf; Broschart, Andreas; Hwang, Seungwon; Theobald, Martin; Weikum, Gerhard

doi:10.1007/978-3-540-75530-2_26

Released

Conference Paper

MPS-Authors

Schenkel, Ralf
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Broschart, Andreas
Databases and Information Systems, MPI for Informatics, Max Planck Society;
International Max Planck Research School, MPI for Informatics, Max Planck Society;

Hwang, Seungwon
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Theobald, Martin
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Citation

Schenkel, R., Broschart, A., Hwang, S., Theobald, M., & Weikum, G. (2007). Efficient Text Proximity Search. In N. Ziviani & R. A. Baeza-Yates (Eds.), String Processing and Information Retrieval: 14th International Symposium, SPIRE 2007 (pp. 287-299). Berlin, Germany: Springer.

Cite as: https://hdl.handle.net/11858/00-001M-0000-000F-1F05-B

Abstract

In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation framework including a proximity scoring function integrated within a top-k query engine for text retrieval. We propose precomputed and materialized index structures that boost performance. The increased retrieval effectiveness and efficiency of our framework are demonstrated through extensive experiments on a very large text benchmark collection. In combination with static index pruning for the proximity lists, our algorithm achieves an improvement of two orders of magnitude compared to a term-based top-k evaluation, with a significantly improved result quality.
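The ideas named in the abstract (a proximity score, precomputed per-pair lists, and static pruning of those lists) can be illustrated with a minimal sketch. This is not the scoring function or index layout of the paper, which the abstract does not specify. The inverse-squared-distance contribution, the function names `pair_proximity_scores`, `build_pair_index` and `top_k`, and the `max_len` pruning parameter are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def pair_proximity_scores(tokens, query_terms):
    """Score each unordered pair of distinct query terms in one document.

    ASSUMPTION: every co-occurrence of the two terms at token distance d
    contributes 1 / d**2. This is a generic proximity heuristic, not
    necessarily the function used in the paper.
    """
    wanted = set(query_terms)
    positions = defaultdict(list)
    for pos, tok in enumerate(tokens):
        if tok in wanted:
            positions[tok].append(pos)

    scores = {}
    for t1, t2 in combinations(sorted(wanted), 2):
        total = sum(
            1.0 / (p1 - p2) ** 2
            for p1 in positions.get(t1, [])
            for p2 in positions.get(t2, [])
        )
        if total > 0.0:
            scores[(t1, t2)] = total
    return scores


def build_pair_index(docs, query_terms, max_len=None):
    """Precompute, per term pair, a list of (doc_id, score) sorted by score.

    ASSUMPTION: `max_len` models static index pruning by keeping only the
    highest-scoring entries of each list.
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for pair, score in pair_proximity_scores(tokens, query_terms).items():
            index[pair].append((doc_id, score))

    for pair in index:
        index[pair].sort(key=lambda entry: entry[1], reverse=True)
        if max_len is not None:
            del index[pair][max_len:]  # drop the low-scoring tail
    return dict(index)


def top_k(index, pair, k):
    """Read the k best documents for a term pair from the precomputed list."""
    return index.get(tuple(sorted(pair)), [])[:k]
```

Because the lists are sorted at build time, answering a two-term query in this sketch reduces to reading a list prefix. A real top-k engine would also combine these lists with per-term content scores, which the sketch leaves out.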