Paper (Released)

CEQE: Contextualized Embeddings for Query Expansion

MPS-Authors

Yates, Andrew
Databases and Information Systems, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2103.05256.pdf (Preprint), 189 KB

Citation

Naseri, S., Dalton, J., Yates, A., & Allan, J. (2021). CEQE: Contextualized Embeddings for Query Expansion. Retrieved from https://arxiv.org/abs/2103.05256.


Cite as: http://hdl.handle.net/21.11116/0000-0009-6779-C
Abstract
In this work we leverage recent advances in context-sensitive language models to improve the task of query expansion. Contextualized word representation models, such as ELMo and BERT, are rapidly replacing static embedding models. We propose a new model, Contextualized Embeddings for Query Expansion (CEQE), that utilizes query-focused contextualized embedding vectors. We study the behavior of contextual representations generated for query expansion in ad-hoc document retrieval. We conduct experiments with probabilistic retrieval models as well as in combination with neural ranking models. We evaluate CEQE on two standard TREC collections: Robust and Deep Learning. We find that CEQE outperforms static embedding-based expansion methods on multiple collections (by up to 18% on Robust and 31% on Deep Learning in average precision) and also improves over proven probabilistic pseudo-relevance feedback (PRF) models. We further find that multiple passes of expansion and reranking yield continued gains in effectiveness, with CEQE-based approaches outperforming the alternatives. The final model, incorporating neural and CEQE-based expansion scores, achieves gains of up to 5% in P@20 and 2% in AP on Robust over the state-of-the-art transformer-based re-ranking model, Birch.
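
To make the core idea concrete, the sketch below scores candidate expansion terms drawn from pseudo-relevant feedback documents by the cosine similarity between their contextualized (in-context) embeddings and a pooled query embedding. This is an illustrative approximation, not the paper's exact formulation: the model choice (bert-base-uncased via Hugging Face Transformers), mean pooling of the query, the subword filtering, and the max-over-mentions aggregation are all assumptions made for brevity; `embed` and `score_expansion_terms` are hypothetical helper names.

```python
# Illustrative sketch of the CEQE idea: rank candidate expansion terms
# from feedback documents by how similar their *contextualized* token
# embeddings are to a query embedding. Assumptions are noted inline;
# this is not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pooled contextualized embedding for a piece of text (assumed pooling)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # (768,)

def score_expansion_terms(query: str, feedback_docs: list[str], top_k: int = 10):
    """Rank terms occurring in feedback docs by cosine similarity between
    their in-context embeddings and the query embedding."""
    q_vec = embed(query)
    scores: dict[str, float] = {}
    for doc in feedback_docs:
        inputs = tokenizer(doc, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state.squeeze(0)  # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        for tok, vec in zip(tokens, hidden):
            # Crude filtering: drop special tokens and subword continuations.
            if tok in ("[CLS]", "[SEP]") or tok.startswith("##") or not tok.isalpha():
                continue
            sim = torch.cosine_similarity(q_vec, vec, dim=0).item()
            # Aggregate a term's mentions by taking the max similarity
            # (an assumption; other aggregations are equally plausible).
            scores[tok] = max(sim, scores.get(tok, float("-inf")))
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]
```

In a full PRF pipeline, the top-ranked terms returned by this sketch would be appended to the original query and re-issued to a probabilistic retrieval model, optionally followed by neural reranking, mirroring the expand-then-rerank loop described in the abstract.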