Enriching Knowledge Bases with Counting Quantifiers

Mirza, Paramita; Razniewski, Simon; Darari, Fariz; Weikum, Gerhard

Mirza, P., Razniewski, S., Darari, F., & Weikum, G. (2018). Enriching Knowledge Bases with Counting Quantifiers. Retrieved from http://arxiv.org/abs/1807.03656.

Item is Released

Basic data

Item permalink: https://hdl.handle.net/21.11116/0000-0001-E16D-7
Version permalink: https://hdl.handle.net/21.11116/0000-0001-E16E-6

Genre: Research paper

Files

arXiv:1807.03656.pdf (Preprint), 387KB

File permalink:
https://hdl.handle.net/21.11116/0000-0001-E16F-5

Name:
arXiv:1807.03656.pdf

Description:
File downloaded from arXiv at 2018-08-06 08:52. The 17th International Semantic Web Conference (ISWC 2018).

OA status:

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]

Copyright date:
-

Copyright Info:
-

License:
http://arxiv.org/help/license

External references

Creators

Creators:
Mirza, Paramita¹, Author
Razniewski, Simon¹, Author
Darari, Fariz², Author
Weikum, Gerhard¹, Author

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
² External Organizations, ou_persistent22

Content

Keywords: Computer Science, Computation and Language, cs.CL

Abstract: Information extraction traditionally focuses on extracting relations between identifiable entities, such as <Monterey, locatedIn, California>. Yet, texts often also contain counting information, stating that a subject is in a specific relation with a number of objects, without mentioning the objects themselves, for example, "California is divided into 58 counties". Such counting quantifiers can help in a variety of tasks such as query answering or knowledge base curation, but are neglected by prior work. This paper develops the first full-fledged system for extracting counting information from text, called CINEX. We employ distant supervision using fact counts from a knowledge base as training seeds, and develop novel techniques for dealing with several challenges: (i) non-maximal training seeds due to the incompleteness of knowledge bases, (ii) sparse and skewed observations in text sources, and (iii) high diversity of linguistic patterns. Experiments with five human-evaluated relations show that CINEX can achieve 60% average precision for extracting counting information. In a large-scale experiment, we demonstrate the potential for knowledge base enrichment by applying CINEX to 2,474 frequent relations in Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct relations, which is 28% more than the existing Wikidata facts for these relations.
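To make the idea of a counting quantifier concrete, the following minimal Python sketch compares a count stated in text with the number of facts a knowledge base already lists. It is an illustration only and not the CINEX system: the regex extractor is a toy stand-in, and the names TOY_KB, hasCounty, extract_count, and facts_asserted_beyond_kb are hypothetical, as are the three sample county entries.

```python
import re
from dataclasses import dataclass

# Hypothetical toy knowledge base: (subject, relation) -> set of known objects.
# The entries are illustrative only and deliberately incomplete.
TOY_KB = {
    ("California", "hasCounty"): {
        "Monterey County",
        "Los Angeles County",
        "Orange County",
    },
}


@dataclass
class CountingStatement:
    subject: str
    relation: str
    count: int


def extract_count(sentence, subject, relation, noun):
    """Toy extractor: find '<integer> <noun>' in a sentence that mentions the subject.

    This is a regex stand-in for illustration, not the extraction method of the paper.
    """
    if subject not in sentence:
        return None
    match = re.search(r"\b(\d[\d,]*)\s+" + re.escape(noun) + r"\b", sentence)
    if match is None:
        return None
    count = int(match.group(1).replace(",", ""))
    return CountingStatement(subject, relation, count)


def facts_asserted_beyond_kb(statement, kb):
    """Number of facts the count implies exist but the KB does not enumerate."""
    known = kb.get((statement.subject, statement.relation), set())
    return max(statement.count - len(known), 0)


stmt = extract_count(
    "California is divided into 58 counties.", "California", "hasCounty", "counties"
)
missing = facts_asserted_beyond_kb(stmt, TOY_KB)
print(stmt.count, missing)
```

With the toy knowledge base above, the sentence from the abstract yields a count of 58, so the sketch reports 55 counties whose existence is asserted without being listed.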

Details

Language(s): eng - English

Date: Created: 2018-07-10; Published online: 2018

Publication status: Published online

Pages: 16 p.

Place, publisher, edition: -

Table of contents: -

Type of review: -

Identifiers: arXiv: 1807.03656
URI: http://arxiv.org/abs/1807.03656
BibTex Citekey: Mirza_arXiv:1807.03656

Degree type: -
