Reinforcement Learning from Reformulations in Conversational Question Answering 
over Knowledge Graphs

Kaiser, Magdalena; Saha Roy, Rishiraj; Weikum, Gerhard

Kaiser, M., Saha Roy, R., & Weikum, G. (2021). Reinforcement Learning from Reformulations in Conversational Question Answering over Knowledge Graphs. Retrieved from https://arxiv.org/abs/2105.04850.

Item is Released


Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0009-67C9-1
Version Permalink: https://hdl.handle.net/21.11116/0000-0009-67CA-0

Genre: Paper

Files

arXiv:2105.04850.pdf (Preprint), 978KB

File Permalink:
https://hdl.handle.net/21.11116/0000-0009-67CB-F

Name:
arXiv:2105.04850.pdf

Description:
File downloaded from arXiv at 2021-10-26 13:42. SIGIR 2021 Long Paper, 11 pages.

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Creators

Creators:
Kaiser, Magdalena¹, Author
Saha Roy, Rishiraj¹, Author
Weikum, Gerhard¹, Author

Affiliations:
¹Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: Computer Science, Information Retrieval, cs.IR, Computer Science, Computation and Language, cs.CL

Abstract: The rise of personal assistants has made conversational question answering
(ConvQA) a very popular mechanism for user-system interaction. State-of-the-art
methods for ConvQA over knowledge graphs (KGs) can only learn from crisp
question-answer pairs found in popular benchmarks. In reality, however, such
training data is hard to come by: users would rarely mark answers explicitly as
correct or wrong. In this work, we take a step towards a more natural learning
paradigm - from noisy and implicit feedback via question reformulations. A
reformulation is likely to be triggered by an incorrect system response,
whereas a new follow-up question could be a positive signal on the previous
turn's answer. We present a reinforcement learning model, termed CONQUER, that
can learn from a conversational stream of questions and reformulations. CONQUER
models the answering process as multiple agents walking in parallel on the KG,
where the walks are determined by actions sampled using a policy network. This
policy network takes the question along with the conversational context as
inputs and is trained via noisy rewards obtained from the reformulation
likelihood. To evaluate CONQUER, we create and release ConvRef, a benchmark
with about 11k natural conversations containing around 205k reformulations.
Experiments show that CONQUER successfully learns to answer conversational
questions from noisy reward signals, significantly improving over a
state-of-the-art baseline.
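
The learning signal described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it collapses the KG walk into a single choice among a few candidate actions, uses one agent rather than several, and assumes a reward of +1 when the user moves on to a new question and -1 when the user reformulates. The function names, the reward values, and the noise rate are all hypothetical. The update rule is plain REINFORCE for a softmax policy.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reward_from_followup(is_reformulation):
    # Assumption: a reformulation signals a wrong answer (-1),
    # a new follow-up question signals a correct one (+1).
    return -1.0 if is_reformulation else 1.0


def reinforce_step(theta, action, reward, lr=0.1):
    """One REINFORCE update for a softmax policy with one logit per action.

    The gradient of log pi(action) w.r.t. theta[k] is 1[k == action] - pi(k).
    """
    probs = softmax(theta)
    return [
        t + lr * reward * ((1.0 if k == action else 0.0) - p)
        for k, (t, p) in enumerate(zip(theta, probs))
    ]


def simulate(num_turns=2000, correct_action=2, num_actions=4, noise=0.1, seed=0):
    """Train on a simulated stream of turns with noisy implicit feedback."""
    rng = random.Random(seed)
    theta = [0.0] * num_actions
    for _ in range(num_turns):
        probs = softmax(theta)
        action = rng.choices(range(num_actions), weights=probs)[0]
        wrong = action != correct_action
        # Hypothetical noisy user: with probability `noise` the signal is
        # flipped (reformulating after a correct answer, or moving on
        # after a wrong one).
        is_reformulation = wrong if rng.random() >= noise else not wrong
        theta = reinforce_step(theta, action, reward_from_followup(is_reformulation))
    return softmax(theta)


if __name__ == "__main__":
    final_probs = simulate()
    print([round(p, 3) for p in final_probs])
```

Even with a fraction of the feedback flipped, the expected reward for the correct action stays positive and the expected reward for the others stays negative, so the policy mass drifts toward the correct action in this toy setting.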

Details

Language(s): eng - English

Dates: Created: 2021-05-11; Modified: 2021-08-20; Published Online: 2021

Publication Status: Published online

Pages: 11 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 2105.04850
URI: https://arxiv.org/abs/2105.04850
BibTex Citekey: Kaiser_2105.04850

Degree: -
