  ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters

Abujabal, A., Roy, R. S., Yahya, M., & Weikum, G. (2018). ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters. Retrieved from http://arxiv.org/abs/1809.09528.

Basic

Genre: Paper
Latex : {ComQA}: {A} Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters

Files

arXiv:1809.09528.pdf (Preprint), 598KB
Name: arXiv:1809.09528.pdf
Description: File downloaded from arXiv at 2018-12-07 09:00
OA-Status:
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

Creators:
Abujabal, Abdalghani [1], Author
Roy, Rishiraj Saha [1], Author
Yahya, Mohamed [2], Author
Weikum, Gerhard [1], Author
Affiliations:
[1] Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
[2] External Organizations, ou_persistent22

Content

Free keywords: Computer Science, Computation and Language, cs.CL
 Abstract: To bridge the gap between the capabilities of the state-of-the-art in factoid
question answering (QA) and what real users ask, we need large datasets of real
user questions that capture the various question phenomena users are interested
in, and the diverse ways in which these questions are formulated. We introduce
ComQA, a large dataset of real user questions that exhibit different
challenging aspects such as temporal reasoning, compositionality, etc. ComQA
questions come from the WikiAnswers community QA platform. Through a large
crowdsourcing effort, we clean the question dataset, group questions into
paraphrase clusters, and annotate clusters with their answers. ComQA contains
11,214 questions grouped into 4,834 paraphrase clusters. We detail the process
of constructing ComQA, including the measures taken to ensure its high quality
while making effective use of crowdsourcing. We also present an extensive
analysis of the dataset and the results achieved by state-of-the-art systems on
ComQA, demonstrating that our dataset can be a driver of future research on QA.

Details

Language(s): eng - English
Dates: 2018-09-25, 2018
 Publication Status: Published online
 Pages: 11 p.
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: arXiv: 1809.09528
URI: http://arxiv.org/abs/1809.09528
BibTex Citekey: Abujabal_arXiv1809.09528
 Degree: -
