  Pretrained Transformers for Text Ranking: BERT and Beyond

Lin, J., Nogueira, R., & Yates, A. (2020). Pretrained Transformers for Text Ranking: BERT and Beyond. Retrieved from https://arxiv.org/abs/2010.06467.

Basic data

Genre: Research paper

Files

arXiv:2010.06467.pdf (Preprint), 5MB
Name: arXiv:2010.06467.pdf
Description: File downloaded from arXiv at 2021-02-22 13:38
OA status: -
Visibility: Public
MIME type / checksum: application/pdf / [MD5]
Technical metadata: -
Copyright date: -
Copyright info: -

Creators

Creators:
Lin, Jimmy 1, Author
Nogueira, Rodrigo 1, Author
Yates, Andrew 2, Author
Affiliations:
1 External Organizations, ou_persistent22
2 Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Keywords: Computer Science, Information Retrieval, cs.IR; Computer Science, Computation and Language, cs.CL
Abstract: The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has, without exaggeration, revolutionized the fields of natural
language processing (NLP), information retrieval (IR), and beyond. In this
survey, we provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
ranking architectures and learned dense representations that attempt to perform
ranking directly. There are two themes that pervade our survey: techniques for
handling long documents, beyond the typical sentence-by-sentence processing
approaches used in NLP, and techniques for addressing the tradeoff between
effectiveness (result quality) and efficiency (query latency). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
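
To make the first of the two categories named in the abstract concrete, below is a minimal sketch of transformer-based reranking: each (query, candidate) pair is scored jointly by a cross-encoder and the candidates are reordered by score. It assumes the Hugging Face transformers library and the publicly available cross-encoder/ms-marco-MiniLM-L-6-v2 checkpoint; neither is part of this record, and the survey covers many model variants beyond this one.

# Sketch: cross-encoder reranking of candidate passages for one query.
# Assumes torch and transformers are installed and the public checkpoint
# "cross-encoder/ms-marco-MiniLM-L-6-v2" is used (an illustrative choice,
# not an artifact of this record).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "what is text ranking?"
candidates = [
    "Text ranking produces an ordered list of texts in response to a query.",
    "BERT is a pretrained transformer model for language understanding.",
    "The weather today is sunny with a light breeze.",
]

# Encode each (query, candidate) pair jointly; the classification head
# emits a single relevance logit per pair.
inputs = tokenizer([query] * len(candidates), candidates,
                   padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)

# Rerank: sort candidates by descending relevance score.
for text, score in sorted(zip(candidates, scores.tolist()),
                          key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {text}")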

Details

Language(s): eng - English
Date: 2020-10-13, 2020
Publication status: Published online
Pages: 155 p.
Place, publisher, edition: -
Table of contents: -
Type of review: -
Identifiers: arXiv: 2010.06467
URI: https://arxiv.org/abs/2010.06467
BibTeX cite key: Lin2010.06467
Degree: -
