Automatic Methods for Low-Cost Evaluation and Position-Aware Models for Neural Information Retrieval

Hui, Kai

doi:10.22028/D291-26942

Hui, K. (2017). Automatic Methods for Low-Cost Evaluation and Position-Aware Models for Neural Information Retrieval. PhD Thesis, Universität des Saarlandes, Saarbrücken.

Item is public

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-002E-8921-E
Version permalink: https://hdl.handle.net/21.11116/0000-000C-6EFF-B

Genre: Thesis

Creators

Creators:
Hui, Kai (1, 2), Author
Berberich, Klaus (1), Advisor
Weikum, Gerhard (1), Referee
Dietz, Laura (1), Referee

Affiliations:
1. Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
2. International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551

Content

Keywords: -

Abstract: An information retrieval (IR) system helps people consume huge amounts of data, which makes both the evaluation and the construction of such systems important. However, there are two difficulties. First, the overwhelmingly large number of query-document pairs to judge makes IR evaluation a laborious manual task. Second, the relationship between a query and a document is asymmetric and heterogeneous, so the patterns to model are complicated: interaction patterns such as term dependency and proximity have been shown to be useful, yet are non-trivial for a single IR model to encode. In this thesis we attempt to address both difficulties from the perspectives of IR evaluation and of the retrieval model, respectively, by reducing the manual cost with automatic methods, by investigating the use of crowdsourcing to collect preference judgments, and by proposing novel neural retrieval models. In particular, to address the large number of query-document pairs in IR evaluation, a low-cost selective labeling method is proposed that picks a small subset of representative documents for manual judgment, so as to support the subsequent prediction for the remaining query-document pairs; furthermore, a language-model-based cascade measure framework is developed to evaluate novelty and diversity, using the content of the labeled documents to mitigate incomplete labels. In addition, we attempt to make preference judgments practically usable by empirically investigating the properties of such judgments when collected via crowdsourcing, and by proposing a novel judgment mechanism that strikes a compromise between judgment quality and the number of judgments. Finally, to model different complicated patterns in a single retrieval model, and inspired by recent advances in deep learning, we develop novel neural IR models that incorporate patterns such as term dependency, query proximity, density of relevance, and query coverage. We demonstrate their superior performance through evaluations on different datasets.

Details

Language: eng - English

Dates: Accepted: 2017-12-04; Published online: 2017

Publication status: Published online

Pages: xiv, 130 p.

Publishing info: Saarbrücken : Universität des Saarlandes

Table of contents: -

Review method: -

Identifiers (DOI, ISBN, etc.):
BibTeX citekey: HUiphd2017
URN: urn:nbn:de:bsz:291-scidok-ds-269423
DOI: 10.22028/D291-26942
Other: hdl:20.500.11880/26894

Degree: PhD
