URDF: Efficient Reasoning in Uncertain RDF Knowledge Bases with Soft and Hard 
Rules

Theobald, Martin; Sozio, Mauro; Suchanek, Fabian; Nakashole, Ndapandula

Item

ITEM ACTIONSEXPORT

Add to Basket

Local TagsRelease HistoryDetailsSummary

Released

Report

URDF: Efficient Reasoning in Uncertain RDF Knowledge Bases with Soft and Hard Rules

MPS-Authors

/persons/resource/persons45609

Theobald, Martin
Databases and Information Systems, MPI for Informatics, Max Planck Society;

/persons/resource/persons45527

Sozio, Mauro
Databases and Information Systems, MPI for Informatics, Max Planck Society;

/persons/resource/persons45572

Suchanek, Fabian
Databases and Information Systems, MPI for Informatics, Max Planck Society;

/persons/resource/persons45101

Nakashole, Ndapandula
Databases and Information Systems, MPI for Informatics, Max Planck Society;

External Resource

No external resources are shared

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

Fulltext (public)

urdf-report.pdf
(Any fulltext), 504KB

Supplementary Material (public)

There is no public supplementary material available

Citation

Theobald, M., Sozio, M., Suchanek, F., & Nakashole, N.(2010). URDF: Efficient Reasoning in Uncertain RDF Knowledge Bases with Soft and Hard Rules (MPI-I-2010-5-002). Saarbrücken: Max-Planck-Institut für Informatik. Retrieved from http://domino.mpi-inf.mpg.de/internet/reports.nsf/NumberView/2010-5-002.

Cite as: https://hdl.handle.net/11858/00-001M-0000-000F-1556-3

Abstract

We present URDF, an efficient reasoning framework for graph-based, nonschematic RDF knowledge bases and SPARQL-like queries. URDF augments first-order reasoning by a combination of soft rules, with Datalog-style recursive implications, and hard rules, in the shape of mutually exclusive sets of facts. It incorporates the common possible worlds semantics with independent base facts as it is prevalent in most probabilistic database approaches, but also supports semantically more expressive, probabilistic first-order representations such as Markov Logic Networks. As knowledge extraction on theWeb often is an iterative (and inherently noisy) process, URDF explicitly targets the resolution of inconsistencies between the underlying RDF base facts and the inference rules. Core of our approach is a novel and efficient approximation algorithm for a generalized version of the Weighted MAX-SAT problem, allowing us to dynamically resolve such inconsistencies directly at query processing time. Our MAX-SAT algorithm has a worst-case running time of O(jCj jSj), where jCj and jSj denote the number of facts in grounded soft and hard rules, respectively, and it comes with tight approximation guarantees with respect to the shape of the rules and the distribution of confidences of facts they contain. Experiments over various benchmark settings confirm a high robustness and significantly improved runtime of our reasoning framework in comparison to state-of-the-art techniques for MCMC sampling such as MAP inference and MC-SAT. Keywords