RDFMatView: Indexing RDF Data for SPARQL Queries

The Semantic Web is gaining momentum through its effort to create a universal medium for the exchange of semantically tagged data. Semantic data are represented as directed labelled graphs in RDF and queried with SPARQL, two standards that have been widely accepted by the scientific community. Currently, most implementations of RDF/SPARQL are based on relational database technology. However, executing complex queries in these systems is usually rather slow because of the number of joins that need to be performed. In this article, we describe an indexing method that uses materialized SPARQL queries as indexes on RDF data sets to reduce query processing time. We provide a formal definition of materialized SPARQL queries, a cost model to evaluate their impact on query performance, a storage scheme for the materialization, and an algorithm to find the optimal set of indexes for a given query. We also introduce different approaches to integrating materialized queries into an existing SPARQL query engine.