Released

Paper

Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases

MPS-Authors
Christmann, Philipp
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Saha Roy, Rishiraj
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Fulltext (public)

arXiv:2108.08597.pdf
(Preprint), 2MB

Citation

Christmann, P., Saha Roy, R., & Weikum, G. (2021). Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases. Retrieved from https://arxiv.org/abs/2108.08597.


Cite as: http://hdl.handle.net/21.11116/0000-0009-6360-B
Abstract
Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique for doing this is to apply named entity disambiguation (NED) systems to the question, and retrieve KB facts for the disambiguated entities. This work presents CLOCQ, an efficient method that prunes irrelevant parts of the search space using KB-aware signals. CLOCQ uses a top-k query processor over score-ordered lists of KB items that combine signals about lexical matching, relevance to the question, coherence among candidate items, and connectivity in the KB graph. Experiments with two recent QA benchmarks for complex questions demonstrate the superiority of CLOCQ over state-of-the-art baselines with respect to answer presence, size of the search space, and runtimes.
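
To illustrate what top-k processing over score-ordered lists can look like, the following is a minimal sketch of the classic threshold algorithm for top-k aggregation. It is not taken from the paper: the abstract does not specify the exact query processor, so the function name `threshold_topk`, the weighted-sum aggregation, the weights, and the item identifiers are all assumptions made for illustration. Each list holds (item, score) pairs for one signal, sorted by descending non-negative score.

```python
import heapq


def threshold_topk(lists, weights, k):
    """Return the k items with the highest weighted-sum score.

    lists:   dict signal name -> list of (item, score), sorted by descending
             score; scores are assumed non-negative, missing items score 0.
    weights: dict signal name -> non-negative weight (hypothetical values).
    """
    # Random-access lookup of every item's score in every list.
    lookup = {name: dict(entries) for name, entries in lists.items()}

    def aggregate(item):
        return sum(weights[n] * lookup[n].get(item, 0.0) for n in lists)

    seen = set()
    top = []  # min-heap of (aggregate score, item), at most k entries
    max_depth = max((len(entries) for entries in lists.values()), default=0)

    for depth in range(max_depth):
        # Upper bound on the aggregate score of any item not yet seen.
        threshold = 0.0
        for name, entries in lists.items():
            if depth >= len(entries):
                continue  # exhausted list: unseen items score 0 here
            item, score = entries[depth]
            threshold += weights[name] * score
            if item not in seen:
                seen.add(item)
                heapq.heappush(top, (aggregate(item), item))
                if len(top) > k:
                    heapq.heappop(top)  # drop the current weakest candidate
        # Stop early once no unseen item can beat the k-th best candidate.
        if len(top) == k and top[0][0] >= threshold:
            break

    return [(item, score) for score, item in sorted(top, key=lambda p: (-p[0], p[1]))]


# Hypothetical score lists for two of the signals named in the abstract.
signal_lists = {
    "lexical": [("item_a", 0.9), ("item_b", 0.8), ("item_c", 0.1)],
    "relevance": [("item_b", 0.9), ("item_c", 0.5), ("item_a", 0.2)],
}
signal_weights = {"lexical": 0.5, "relevance": 0.5}
best = threshold_topk(signal_lists, signal_weights, k=1)
```

In this toy input, the scan can stop after the second position of each list, because the best candidate found so far already scores at least as high as the upper bound for any unseen item. This early termination is what makes sorted-list top-k processing attractive when the lists are long.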