ABSTRACT

The ever-growing volume of linked data calls for efficient processing of SPARQL, the de facto standard query language for the Semantic Web, over RDF datasets. In this chapter, we review the state of the art in centralized RDF query processing. We describe the challenges raised by large-scale RDF datasets and analyze different attempts to overcome them. The chapter covers a wide array of approaches to storing and indexing RDF data and to processing SPARQL queries. We discuss traditional database problems such as join ordering and selectivity estimation, illustrate them with queries against real-world RDF datasets, and explain ways to solve them in the context of RDF databases. We also describe RDF-specific techniques for efficient join and path traversal processing.