Below are the differences between two revisions of the page.
site:recherche:logiciels:sparqlwithspark [13/09/2016 17:53] hubert [Data sets] |
site:recherche:logiciels:sparqlwithspark [17/11/2023 18:39] amann |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== SPARQL query processing with Apache Spark ====== | ====== SPARQL query processing with Apache Spark ====== | ||
- | This web page is a companion to the "SPARQL query processing with Apache Spark" paper submitted at EDBT 2017. | + | This wiki is a companion to the following publications: |
+ | * [[https://hal.archives-ouvertes.fr/hal-01502519|SPARQL Graph Pattern Processing with Apache Spark]] | ||
+ | * [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]] | ||
+ | * [[https://hal.archives-ouvertes.fr/hal-01214900|HAQWA: a Hash-based and Query Workload Aware Distributed RDF Store]] | ||
+ | * [[https://hal.archives-ouvertes.fr/hal-01214902|On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark]] | ||
- | It provides access to some resources related to the evaluation section. | + | It provides access to the resources related to the evaluation section of [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]]. |
+ | |||
+ | See also [[en:site:recherche:logiciels:rdfdist]] concerning RDF distribution approaches using Spark. | ||
===== Data sets ===== | ===== Data sets ===== | ||
* DrugBank | * DrugBank | ||
* DBpedia | * DBpedia | ||
- | * LUBM | + | * LUBM: LU100M, LU1B |
- | * WatDiv | + | * WatDiv: see [[en:site:recherche:logiciels:sparqlwithspark:datasetWatdiv]] |
===== Query processing ===== | ===== Query processing ===== | ||
+ | |||
+ | ==== WatDiv queries ==== | ||
+ | |||
+ | |||
+ | === Query S1 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 ?v7 ?v8 ?v9 WHERE { | ||
+ | ?v0 gr:includes ?v1 . %v2% gr:offers ?v0 . | ||
+ | ?v0 gr:price ?v3 . ?v0 gr:serialNumber ?v4 . | ||
+ | ?v0 gr:validFrom ?v5 . ?v0 gr:validThrough ?v6 . | ||
+ | ?v0 sorg:eligibleQuantity ?v7 . | ||
+ | ?v0 sorg:eligibleRegion ?v8 . | ||
+ | ?v0 sorg:priceValidUntil ?v9 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivS1]] | ||
+ | |||
+ | === Query F5 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 WHERE { | ||
+ | ?v0 gr:includes ?v1 . %v2% gr:offers ?v0 . | ||
+ | ?v0 gr:price ?v3 . ?v0 gr:validThrough ?v4 . | ||
+ | ?v1 og:title ?v5 . ?v1 rdf:type ?v6 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivF5]] | ||
+ | == Execution reports for F5 == | ||
+ | |||
+ | ^Plan ^ Execution report ^ | ||
+ | |SPARQL DF | {{:en:site:recherche:logiciels:f5_plan_sparql_df.png?300|SPARQL DF}} | | ||
+ | |SPARQL Hybrid| {{:en:site:recherche:logiciels:f5_plan_sparql_hybrid.png?300|SPARQL Hybrid}} | | ||
+ | |S2RDF | {{:en:site:recherche:logiciels:f5_plan_s2rdf.png?300|S2RDF}} | | ||
+ | |S2RDF+Hybrid | {{:en:site:recherche:logiciels:f5_plan_s2rdf_hybrid.png?300|S2RDF+Hybrid}} | | ||
+ | |||
+ | |||
+ | === Query C3 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 WHERE { | ||
+ | ?v0 wsdbm:likes ?v1 . ?v0 wsdbm:friendOf ?v2 . | ||
+ | ?v0 dc:Location ?v3 . ?v0 foaf:age ?v4 . | ||
+ | ?v0 wsdbm:gender ?v5 . ?v0 foaf:givenName ?v6 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivC3]] | ||
+ | |||
==== Star queries ==== | ==== Star queries ==== | ||
+ | Star queries over the DrugBank dataset | ||
+ | Star with 3 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b.} | ||
+ | </code> | ||
+ | |||
+ | Star with 5 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | Star with 10 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d ?g ?h ?i | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i . | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | Star with 15 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d ?g ?h ?i ?j ?k ?l | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/contraindicationInsert> ?j . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/interactionInsert> ?k . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/structure> ?l. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/state> ?m . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/rxlistLink> <http://www.rxlist.com/cgi/generic/ibup.htm> .} | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:star]] | ||
==== Chain queries ==== | ==== Chain queries ==== | ||
+ | |||
Chain queries over the DBpedia data set. | Chain queries over the DBpedia data set. | ||
+ | |||
=== Chain4 query === | === Chain4 query === | ||
The Chain4 query is: | The Chain4 query is: | ||
- | <code> | + | <code sparql> |
SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 } | SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 } | ||
</code> | </code> | ||
with properties | with properties | ||
- | <code> | + | <code scala> |
val P1 = 1389363200 | val P1 = 1389363200 | ||
val P2 = 52239 | val P2 = 52239 | ||
Line 30: | Line 144: | ||
val P4 = 1164156928 | val P4 = 1164156928 | ||
</code> | </code> | ||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:chain4| Chain4 query plans]] | ||
- | The plans produced by each method are: | + | === Chain6 query === |
- | + | <code sparql> | |
- | * SPARQL DF: | + | SELECT ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 . ?x5 P5 ?x6 . ?x6 P6 ?x7 } |
- | <code> | + | |
- | val t1 = df.where(s"p=$P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2") | + | |
- | val t2 = df.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3") | + | |
- | val t3 = df.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4") | + | |
- | val t4 = df.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5") | + | |
- | val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4")) | + | |
</code> | </code> | ||
- | + | with properties | |
- | * SPARQL Hybrid DF: | + | <code scala> |
- | <code> | + | val P1 = 18843 |
- | val P2 = 52239 | + | val P2 = 5540 |
- | val P3 = 1164541952 | + | val P3 = 1179222016 |
- | val P4 = 1164156928 | + | val P4 = 1446076416 |
- | val subg = df.where(s"p in ($P2, $P3, $P4)") | + | val P5 = 1446244352 |
- | subg.persist | + | val P6 = 36363 |
- | subg.count | + | |
- | val st2 = subg.where(s"p= $P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3") | + | |
- | val st3 = subg.where(s"p= $P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4") | + | |
- | val st4 = subg.where(s"p= $P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5") | + | |
- | val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4")) | + | |
- | res.count | + | |
</code> | </code> | ||
- | === Chain6 query === | + | See [[en:site:recherche:logiciels:sparqlwithspark:chain6| Chain6 query plans]] |
==== Snowflake queries ==== | ==== Snowflake queries ==== | ||
- | ==== WatDiv queries ==== | + | SPARQL for query Q8 of the LUBM test suite: |
+ | <code sparql> | ||
+ | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> | ||
+ | PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#> | ||
+ | SELECT ?X ?Y ?Z | ||
+ | WHERE | ||
+ | {?X rdf:type ub:Student . | ||
+ | ?Y rdf:type ub:Department . | ||
+ | ?X ub:memberOf ?Y . | ||
+ | ?Y ub:subOrganizationOf <http://www.University0.edu> . | ||
+ | ?X ub:emailAddress ?Z} | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:snowflakeQ8]] | ||
+ | |||
+ | |||
+ | ===== Misc ===== | ||
+ | [[en:site:recherche:logiciels:sparqlwithspark:utility| Utility tools]] | ||
+ | |||
+ | |||
+ | |||
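
The Chain4 and Chain6 patterns in this diff are linear chains: each triple pattern shares exactly one variable with the next one, so they can be evaluated as a sequence of joins over the triple table, which is what the removed SPARQL DF snippet did with DataFrames. The following is only a minimal sketch of that idea in plain Scala collections, without Spark. The `Triple` case class, the `ChainJoinSketch` object, the `chain` function and the toy identifiers are all assumptions made for illustration; they are not part of the project's code or of the DBpedia data.

```scala
object ChainJoinSketch {
  // One row of the (subject, property, object) triple table, encoded as numbers.
  // Hypothetical in-memory stand-in for the DataFrame `df` used on the wiki.
  final case class Triple(s: Long, p: Long, o: Long)

  // Evaluates ?x1 P1 ?x2 . ?x2 P2 ?x3 . ... as successive joins.
  // Each partial result is a binding [x1, ..., xk]; the next triple pattern
  // is joined on the last bound variable of the partial result.
  def chain(triples: Seq[Triple], props: Seq[Long]): Seq[Vector[Long]] = {
    // Selection on the property, then an index subject -> objects for the join.
    def bySubject(p: Long): Map[Long, Seq[Long]] =
      triples.filter(_.p == p).groupBy(_.s).map { case (s, ts) => s -> ts.map(_.o) }

    props match {
      case Seq() => Seq.empty
      case first +: rest =>
        // Bindings [x1, x2] produced by the first triple pattern.
        val start = triples.filter(_.p == first).map(t => Vector(t.s, t.o))
        rest.foldLeft(start) { (partial, p) =>
          val index = bySubject(p)
          for {
            row <- partial
            o   <- index.getOrElse(row.last, Seq.empty)
          } yield row :+ o
        }
    }
  }

  def main(args: Array[String]): Unit = {
    // Toy data with made-up identifiers (assumption, not taken from DBpedia).
    val data = Seq(
      Triple(1, 10, 2), Triple(2, 20, 3), Triple(3, 30, 4), Triple(4, 40, 5),
      Triple(2, 20, 6), // dead end: subject 6 has no property 30
      Triple(7, 10, 8)  // chain that stops after the first step
    )
    // prints "1 2 3 4 5"
    chain(data, Seq(10L, 20L, 30L, 40L)).foreach(row => println(row.mkString(" ")))
  }
}
```

The sketch joins strictly left to right. It does not model join ordering, partitioning or the persisted subgraph of the hybrid plan, which the linked query plan pages cover.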