
SPARQL query processing with Apache Spark

This wiki is a companion to the following publications:

It provides access to the resources related to the evaluation section of SPARQL query processing with Apache Spark.

See also RDFdist concerning RDF distribution approaches using Spark.

Data sets

Query processing

WatDiv queries

Query S1

SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 ?v7 ?v8 ?v9 WHERE {
?v0 gr:includes ?v1 . %v2% gr:offers ?v0 .
?v0 gr:price ?v3 . ?v0 gr:serialNumber ?v4 .
?v0 gr:validFrom ?v5 . ?v0 gr:validThrough ?v6 .
?v0 sorg:eligibleQuantity ?v7 .
?v0 sorg:eligibleRegion ?v8 .
?v0 sorg:priceValidUntil ?v9 . }
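
The `%v2%` token in the query above is a template placeholder that has to be bound to a concrete term before the query can run. A minimal sketch of that substitution step follows, assuming a plain textual replacement; the helper `instantiate` and the term `wsdbm:Retailer0` are hypothetical and only serve as an illustration.

```python
import re

# Query S1 as listed above; %v2% is its only placeholder.
S1_TEMPLATE = """SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 ?v7 ?v8 ?v9 WHERE {
?v0 gr:includes ?v1 . %v2% gr:offers ?v0 .
?v0 gr:price ?v3 . ?v0 gr:serialNumber ?v4 .
?v0 gr:validFrom ?v5 . ?v0 gr:validThrough ?v6 .
?v0 sorg:eligibleQuantity ?v7 .
?v0 sorg:eligibleRegion ?v8 .
?v0 sorg:priceValidUntil ?v9 . }"""


def instantiate(template, bindings):
    """Replace every %name% placeholder with the term bound to `name`."""
    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            raise KeyError("no binding for placeholder %" + name + "%")
        return bindings[name]

    return re.sub(r"%(\w+)%", substitute, template)


# Hypothetical binding, chosen for illustration only.
query = instantiate(S1_TEMPLATE, {"v2": "wsdbm:Retailer0"})
print(query)
```

The same helper applies unchanged to Query F5, which uses the same `%v2%` placeholder.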

See WatDiv Query S1 plans

Query F5

SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 WHERE {
?v0 gr:includes ?v1 . %v2% gr:offers ?v0 .
?v0 gr:price ?v3 . ?v0 gr:validThrough ?v4 .
?v1 og:title ?v5 . ?v1 rdf:type ?v6 . }

See WatDiv Query F5 plans

Execution reports for F5
Plan            Execution report
SPARQL DF       SPARQL DF
SPARQL Hybrid   SPARQL Hybrid
S2RDF           S2RDF
S2RDF+Hybrid    S2RDF+Hybrid

Query C3

SELECT ?v0 WHERE {
?v0 wsdbm:likes ?v1 . ?v0 wsdbm:friendOf ?v2 .
?v0 dc:Location ?v3 . ?v0 foaf:age ?v4 .
?v0 wsdbm:gender ?v5 . ?v0 foaf:givenName ?v6 . }

See WatDiv Query C3 plans

Star queries

Star queries over the DrugBank dataset

Star with 3 branches

SELECT ?x ?a ?b
WHERE {
 ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 ?x  <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b.}

Star with 5 branches

SELECT ?x ?a ?b ?c ?d
WHERE {
 ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 ?x  <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . 
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
}

Star with 10 branches

SELECT ?x ?a ?b ?c ?d ?g ?h ?i
WHERE {
 ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 ?x  <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . 
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i .
}

Star with 15 branches

SELECT ?x ?a ?b ?c ?d ?g ?h ?i ?j ?k ?l
WHERE {
 ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a .
 ?x  <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . 
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/contraindicationInsert> ?j .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/interactionInsert> ?k .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/structure> ?l.
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/state> ?m .
 ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/rxlistLink> <http://www.rxlist.com/cgi/generic/ibup.htm> .}

See Star shape query plans
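
In a star query every triple pattern shares the subject variable `?x`, so evaluation amounts to a join on the subject. The sketch below illustrates this on a small in-memory triple list; it is not one of the Spark plans evaluated in this wiki, and the function `eval_star`, the shortened predicate names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def eval_star(triples, branches):
    """Evaluate a star query whose patterns all share one subject variable.

    triples  : iterable of (subject, predicate, object)
    branches : list of (predicate, constant_or_None); None marks a variable
    Returns one tuple (subject, obj_1, ..., obj_k) per solution, where the
    obj_i are the bindings of the variable branches, in branch order.
    """
    # Group the data by subject, then by predicate.
    by_subject = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        by_subject[s][p].append(o)

    results = []
    for s, props in by_subject.items():
        candidates = []
        for pred, const in branches:
            objs = props.get(pred, [])
            if const is not None:
                if const not in objs:
                    break  # constant branch not satisfied
            elif objs:
                candidates.append(objs)
            else:
                break  # variable branch has no match
        else:
            # Every branch matched: emit all combinations of the bindings.
            for combo in product(*candidates):
                results.append((s,) + combo)
    return results


# Toy data with shortened, hypothetical names (not taken from DrugBank).
triples = [
    ("drug1", "page", "Ibuprofen"),
    ("drug1", "chebiId", "c1"),
    ("drug1", "casRegistryNumber", "r1"),
    ("drug2", "page", "Aspirin"),
    ("drug2", "chebiId", "c2"),
    ("drug2", "casRegistryNumber", "r2"),
    ("drug3", "page", "Ibuprofen"),
    ("drug3", "chebiId", "c3"),
]

# Same shape as the 3-branch star: one constant branch, two variable branches.
rows = eval_star(
    triples,
    [("page", "Ibuprofen"), ("chebiId", None), ("casRegistryNumber", None)],
)
print(rows)  # [('drug1', 'c1', 'r1')]
```

Adding branches only lengthens the `branches` list, which is why the 5, 10 and 15 branch variants differ in cost rather than in structure.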

Chain queries

Chain queries over the DBpedia data set.

Chain4 query

The Chain4 query is

SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 }

with properties

val P1 = 1389363200
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
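
In a chain query the object of each pattern is the subject of the next one, so the query can be evaluated as a sequence of joins that extend partial paths one property at a time. The sketch below shows this on in-memory triples, using the integer property values listed above; the function `eval_chain`, the node identifiers in the toy data and the fixed left-to-right join order are assumptions for illustration and do not reproduce the plans compared in this wiki.

```python
from collections import defaultdict


def eval_chain(triples, properties):
    """Evaluate ?x1 P1 ?x2 . ?x2 P2 ?x3 . ... by joining left to right.

    triples    : iterable of (subject, predicate, object)
    properties : [P1, P2, ...] in chain order
    Returns one tuple (x1, x2, ..., xn+1) per solution.
    """
    # Index: predicate -> subject -> list of objects.
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[p][s].append(o)

    # Seed the partial paths with every triple matching the first property.
    paths = [(s, o) for s, objs in index[properties[0]].items() for o in objs]

    # Each further property extends a path from its last node.
    for prop in properties[1:]:
        step = index[prop]
        paths = [path + (o,) for path in paths for o in step.get(path[-1], [])]
    return paths


# Property identifiers from the Chain4 definition above.
P1, P2, P3, P4 = 1389363200, 52239, 1164541952, 1164156928

# Toy triples with made-up node identifiers.
triples = [
    (1, P1, 2), (2, P2, 3), (3, P3, 4), (4, P4, 5),
    (1, P1, 6),  # dead end: node 6 has no outgoing P2 edge
    (7, P2, 3),  # never reached: no P1 edge leads to node 7
]

paths = eval_chain(triples, [P1, P2, P3, P4])
print(paths)  # [(1, 2, 3, 4, 5)]
```

Passing a six-element property list evaluates a Chain6-shaped query in the same way.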

See Chain4 query plans

Chain6 query

SELECT ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 . ?x5 P5 ?x6 . ?x6 P6 ?x7 }

with properties

val P1 = 18843
val P2 = 5540
val P3 = 1179222016
val P4 = 1446076416
val P5 = 1446244352
val P6 = 36363

See Chain6 query plans

Snowflake queries

SPARQL for Q8 from the LUBM test suite

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#>
SELECT ?X ?Y ?Z
WHERE
{?X rdf:type ub:Student .
  ?Y rdf:type ub:Department .
  ?X ub:memberOf ?Y .
  ?Y ub:subOrganizationOf <http://www.University0.edu> .
  ?X ub:emailAddress ?Z}

See SnowFlake query Q8 plans

Misc

Utility tools