  Predicting Document Coverage for Relation Extraction

Singhania, S., Razniewski, S., & Weikum, G. (2021). Predicting Document Coverage for Relation Extraction. Retrieved from https://arxiv.org/abs/2111.13611.

Files

arXiv:2111.13611.pdf (Preprint), 502KB
Name:
arXiv:2111.13611.pdf
Description:
File downloaded from arXiv at 2022-03-24 07:39. To appear in TACL. The arXiv version is a pre-MIT Press publication version.
OA-Status:
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-

Creators

 Creators:
Singhania, Sneha (1), Author
Razniewski, Simon (1), Author
Weikum, Gerhard (1), Author
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: Computer Science, Computation and Language, cs.CL; Computer Science, Artificial Intelligence, cs.AI
 Abstract: This paper presents a new task of predicting the coverage of a text document
for relation extraction (RE): does the document contain many relational tuples
for a given entity? Coverage predictions are useful in selecting the best
documents for knowledge base construction with large input corpora. To study
this problem, we present a dataset of 31,366 diverse documents for 520
entities. We analyze the correlation of document coverage with features like
length, entity mention frequency, Alexa rank, language complexity and
information retrieval scores. Each of these features has only moderate
predictive power. We employ methods combining features with statistical models
like TF-IDF and language models like BERT. The model combining features and
BERT, HERB, achieves an F1 score of up to 46%. We demonstrate the utility of
coverage predictions on two use cases: KB construction and claim refutation.

Details

Language(s): eng - English
 Dates: 2021-11-26, 2021
 Publication Status: Published online
 Pages: 16 p.
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: arXiv: 2111.13611
URI: https://arxiv.org/abs/2111.13611
BibTex Citekey: Singhania2021
 Degree: -
