Released

Thesis

Automatic Extraction of Facts, Relations, and Entities for Web-scale Knowledge Base Population

MPS-Authors

Nakashole, Ndapandula
Databases and Information Systems, MPI for Informatics, Max Planck Society;
International Max Planck Research School, MPI for Informatics, Max Planck Society;

Citation

Nakashole, N. (2012). Automatic Extraction of Facts, Relations, and Entities for Web-scale Knowledge Base Population. PhD Thesis, Universität des Saarlandes, Saarbrücken.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0014-627F-A
Abstract
Equipping machines with knowledge, through the construction of machine-readable
knowledge bases, presents a key asset for semantic search, machine
translation, question answering, and other formidable challenges in
artificial intelligence. However, human knowledge predominantly resides
in books and other natural language text forms. This means that knowledge
bases must be extracted and synthesized from natural language text.
When the source of text is the Web, extraction methods must cope with
ambiguity, noise, scale, and updates.

The goal of this dissertation is to develop knowledge base population
methods that address the aforementioned characteristics of Web text. The
dissertation makes three contributions. The first contribution is a method
for mining high-quality facts at scale, through distributed constraint reasoning
and a pattern representation model that is robust against noisy
patterns. The second contribution is a method for mining a large, comprehensive
collection of relation types beyond those commonly found in
existing knowledge bases. The third contribution is a method for extracting
facts from dynamic Web sources such as news articles and social media,
where one of the key challenges is the constant emergence of new entities.
All methods have been evaluated through experiments involving Web-scale
text collections.