Content-Based Weak Supervision for Ad-Hoc Re-Ranking

MacAvaney, Sean; Yates, Andrew; Hui, Kai; Frieder, Ophir

Released

Research Paper

MPG Authors

Yates, Andrew
Databases and Information Systems, MPI for Informatics, Max Planck Society;

External Resources

No external resources are on file

Full Texts (freely accessible)

arXiv:1707.00189.pdf
(Preprint), 138KB

Supplementary Material (freely accessible)

No freely accessible supplementary materials are available

Citation

MacAvaney, S., Yates, A., Hui, K., & Frieder, O. (2019). Content-Based Weak Supervision for Ad-Hoc Re-Ranking. Retrieved from http://arxiv.org/abs/1707.00189.

Citation link: https://hdl.handle.net/21.11116/0000-0005-6B59-0

Abstract

One challenge with neural ranking is the need for a large amount of
manually-labeled relevance judgments for training. In contrast with prior work,
we examine the use of weak supervision sources for training that yield pseudo
query-document pairs that already exhibit relevance (e.g., newswire
headline-content pairs and encyclopedic heading-paragraph pairs). We also
propose two techniques for filtering out training samples that are too far out
of domain: a heuristic-based approach and a novel supervised filter that
re-purposes a neural ranker. Using several leading neural ranking
architectures and multiple weak supervision datasets, we show that these
sources of training pairs are effective on their own (outperforming prior weak
supervision techniques), and that filtering can further improve performance.
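
As an illustration of the idea described in the abstract, the following is a
minimal sketch, not the authors' implementation, of how headline-content pairs
might be turned into pseudo query-document training triples and then filtered.
The function names, the dict keys `headline` and `body`, the random negative
sampling, and the term-overlap threshold are all assumptions made for this
example; the abstract does not specify which heuristic the authors use.

```python
import random
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (pseudo query, positive doc, negative doc)


def make_pseudo_pairs(articles: List[Dict[str, str]], seed: int = 0) -> List[Triple]:
    """Treat each headline as a pseudo query and its own body as the relevant document.

    Assumption: the negative document is the body of a different, randomly
    chosen article. This sampling strategy is hypothetical.
    """
    rng = random.Random(seed)
    triples: List[Triple] = []
    for i, article in enumerate(articles):
        others = [a for j, a in enumerate(articles) if j != i]
        if not others:
            continue  # a negative cannot be sampled from a single article
        negative = rng.choice(others)
        triples.append((article["headline"], article["body"], negative["body"]))
    return triples


def term_overlap(query: str, doc: str) -> float:
    """Fraction of distinct query terms that also occur in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def heuristic_filter(triples: List[Triple], min_overlap: float = 0.5) -> List[Triple]:
    """Keep triples whose pseudo query shares enough terms with its positive document.

    Assumption: term overlap stands in for "not too far out of domain";
    it is a placeholder heuristic chosen only for illustration.
    """
    return [t for t in triples if term_overlap(t[0], t[1]) >= min_overlap]
```

A supervised filter of the kind the abstract mentions would replace
`term_overlap` with a score produced by a trained ranker; that step is omitted
here because the abstract gives no details about it.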