Free keywords:
Computer Science, Information Retrieval, cs.IR
Abstract:
Deep Learning Hard (DL-HARD) is a new annotated dataset designed to more
effectively evaluate neural ranking models on complex topics. It builds on TREC
Deep Learning (DL) topics by extensively annotating them with question intent
categories, answer types, wikified entities, topic categories, and result type
metadata from a commercial web search engine. Based on this data, we introduce
a framework for identifying challenging queries. DL-HARD contains fifty topics
from the official DL 2019/2020 evaluation benchmark, half of which are newly
and independently assessed. We re-evaluate the runs officially submitted to DL
on DL-HARD and find substantial differences in metrics and in the ranking of
participating systems. Overall, DL-HARD is a new resource that
promotes research on neural ranking methods by focusing on challenging and
complex topics.