



Conference Paper

Scalable Join Processing on Very Large RDF Graphs


Neumann, Thomas
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Neumann, T., & Weikum, G. (2009). Scalable Join Processing on Very Large RDF Graphs. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data (pp. 627-640). New York, NY: ACM.

Cite as: http://hdl.handle.net/11858/00-001M-0000-000F-1948-3
With the proliferation of the RDF data format, engines for RDF query processing are faced with very large graphs that contain hundreds of millions of RDF triples. This paper addresses the resulting scalability problems. Recent prior work along these lines has focused on indexing and other physical-design issues. The current paper focuses on join processing, as the fine-grained and schema-relaxed use of RDF often entails star- and chain-shaped join queries with many input streams from index scans. We present two contributions for scalable join processing. First, we develop very light-weight methods for sideways information passing between separate joins at query run-time, to provide highly effective filters on the input streams of joins. Second, we improve previously proposed algorithms for join-order optimization by more accurate selectivity estimations for very large RDF graphs. Experimental studies with several RDF datasets, including the UniProt collection, demonstrate the performance gains of our approach, outperforming the previously fastest systems by more than an order of magnitude.
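
The idea of sideways information passing mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm, only a simplified illustration under stated assumptions: each input stream is a sorted list of integer subject ids (standing in for an index scan), the query is a star-shaped join on the subject, and all scans share one hint holding the smallest id that can still produce a match. The names `SharedHint`, `SortedScan`, and `star_join` are hypothetical and do not come from the paper.

```python
from bisect import bisect_left


class SharedHint:
    """Hypothetical shared state: smallest subject id that can still join."""

    def __init__(self):
        self.min_next = 0


class SortedScan:
    """Stand-in for an index scan over sorted integer subject ids."""

    def __init__(self, values, hint):
        self.values = values
        self.pos = 0
        self.hint = hint
        self.touched = 0  # number of entries actually returned

    def next(self):
        # Use the hint passed sideways by other operators to skip
        # entries that cannot match anymore.
        self.pos = bisect_left(self.values, self.hint.min_next, self.pos)
        if self.pos >= len(self.values):
            return None
        value = self.values[self.pos]
        self.pos += 1
        self.touched += 1
        return value


def star_join(scans, hint):
    """Intersect all scans on the subject id, updating the shared hint."""
    results = []
    current = [scan.next() for scan in scans]
    while all(value is not None for value in current):
        highest = max(current)
        if all(value == highest for value in current):
            results.append(highest)
            hint.min_next = highest + 1  # everything up to here is done
            current = [scan.next() for scan in scans]
        else:
            hint.min_next = highest  # nothing below this can match
            current = [
                value if value == highest else scan.next()
                for scan, value in zip(scans, current)
            ]
    return results
```

In this sketch a scan that lags behind jumps directly to the first candidate the other streams could still match, so it reads only a fraction of its entries. A real engine would pass such hints between separate join operators in the plan and would work on compressed index pages rather than in-memory lists.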