Below are the differences between two revisions of the page.
site:recherche:logiciels:sparqlwithspark [13/09/2016 18:32] hubert |
site:recherche:logiciels:sparqlwithspark [17/11/2023 18:39] (current version) amann |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== SPARQL query processing with Apache Spark ====== | ====== SPARQL query processing with Apache Spark ====== | ||
- | Go to [[en:site:recherche:logiciels:sparqlwithspark| SPARQL Query Processing with Apache Spark]] (in English) | + | This wiki is a companion to the following publications: |
+ | * [[https://hal.archives-ouvertes.fr/hal-01502519|SPARQL Graph Pattern Processing with Apache Spark]] | ||
+ | * [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]] | ||
+ | * [[https://hal.archives-ouvertes.fr/hal-01214900|HAQWA: a Hash-based and Query Workload Aware Distributed RDF Store]] | ||
+ | * [[https://hal.archives-ouvertes.fr/hal-01214902|On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark]] | ||
+ | |||
+ | It provides access to the resources related to the evaluation section of [[https://arxiv.org/abs/1604.08903|SPARQL query processing with Apache Spark]]. | ||
+ | |||
+ | See also [[en:site:recherche:logiciels:rdfdist]] concerning RDF distribution approaches using Spark. | ||
+ | |||
+ | ===== Data sets ===== | ||
+ | * DrugBank | ||
+ | * DBpedia | ||
+ | * LUBM: LU100M, LU1B | ||
+ | * WatDiv: see [[en:site:recherche:logiciels:sparqlwithspark:datasetWatdiv]] | ||
+ | |||
+ | |||
+ | ===== Query processing ===== | ||
+ | |||
+ | ==== WatDiv queries ==== | ||
+ | |||
+ | |||
+ | === Query S1 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 ?v7 ?v8 ?v9 WHERE { | ||
+ | ?v0 gr:includes ?v1 . %v2% gr:offers ?v0 . | ||
+ | ?v0 gr:price ?v3 . ?v0 gr:serialNumber ?v4 . | ||
+ | ?v0 gr:validFrom ?v5 . ?v0 gr:validThrough ?v6 . | ||
+ | ?v0 sorg:eligibleQuantity ?v7 . | ||
+ | ?v0 sorg:eligibleRegion ?v8 . | ||
+ | ?v0 sorg:priceValidUntil ?v9 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivS1]] | ||
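Query S1 above contains ''%v2%'', which reads as a template placeholder rather than a SPARQL variable. Assuming such placeholders must be replaced by a concrete term before the query is run, the following minimal sketch shows one way to do the substitution. The helper ''instantiate'' and the value ''wsdbm:Retailer3'' are hypothetical and only serve as an illustration; they do not come from this wiki.

```scala
// Hypothetical helper (not part of the original wiki): replaces
// %name% placeholders in a query template with concrete terms.
object TemplateSketch {
  def instantiate(template: String, bindings: Map[String, String]): String =
    bindings.foldLeft(template) { case (query, (name, term)) =>
      query.replace(s"%${name}%", term)
    }

  // First two triple patterns of query S1.
  val s1Fragment: String = "?v0 gr:includes ?v1 . %v2% gr:offers ?v0 ."

  // "wsdbm:Retailer3" is an illustrative value, not taken from the data set.
  val s1Bound: String = instantiate(s1Fragment, Map("v2" -> "wsdbm:Retailer3"))
}
```

The same call applies unchanged to query F5, which uses the same ''%v2%'' placeholder.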
+ | |||
+ | === Query F5 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 ?v1 ?v3 ?v4 ?v5 ?v6 WHERE { | ||
+ | ?v0 gr:includes ?v1 . %v2% gr:offers ?v0 . | ||
+ | ?v0 gr:price ?v3 . ?v0 gr:validThrough ?v4 . | ||
+ | ?v1 og:title ?v5 . ?v1 rdf:type ?v6 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivF5]] | ||
+ | == Execution reports for F5 == | ||
+ | |||
+ | ^Plan ^ Execution report ^ | ||
+ | |SPARQL DF | {{:en:site:recherche:logiciels:f5_plan_sparql_df.png?300|SPARQL DF}} | | ||
+ | |SPARQL Hybrid| {{:en:site:recherche:logiciels:f5_plan_sparql_hybrid.png?300|SPARQL Hybrid}} | | ||
+ | |S2RDF | {{:en:site:recherche:logiciels:f5_plan_s2rdf.png?300|S2RDF}} | | ||
+ | |S2RDF+Hybrid | {{:en:site:recherche:logiciels:f5_plan_s2rdf_hybrid.png?300|S2RDF+Hybrid}} | | ||
+ | |||
+ | |||
+ | === Query C3 === | ||
+ | <code sparql> | ||
+ | SELECT ?v0 WHERE { | ||
+ | ?v0 wsdbm:likes ?v1 . ?v0 wsdbm:friendOf ?v2 . | ||
+ | ?v0 dc:Location ?v3 . ?v0 foaf:age ?v4 . | ||
+ | ?v0 wsdbm:gender ?v5 . ?v0 foaf:givenName ?v6 . } | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:watDivC3]] | ||
+ | |||
+ | |||
+ | ==== Star queries ==== | ||
+ | Star queries over the DrugBank dataset | ||
+ | |||
+ | Star with 3 branches | ||
+ | |||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b.} | ||
+ | </code> | ||
+ | |||
+ | Star with 5 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | Star with 10 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d ?g ?h ?i | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i . | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | Star with 15 branches | ||
+ | <code sparql> | ||
+ | SELECT ?x ?a ?b ?c ?d ?g ?h ?i ?j ?k ?l | ||
+ | WHERE { | ||
+ | ?x <http://xmlns.com/foaf/0.1/page> <http://dbpedia.org/page/Ibuprofen>. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/chebiId> ?a . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/casRegistryNumber> ?b . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggDrugId> ?c . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/keggCompoundId> ?d . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/pharmacology> ?e. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/mechanismOfAction> ?f . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/predictedLogs> ?g . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/halfLife> ?h . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/dpdDrugIdNumber> ?i . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/contraindicationInsert> ?j . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/interactionInsert> ?k . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/structure> ?l. | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/state> ?m . | ||
+ | ?x <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/rxlistLink> <http://www.rxlist.com/cgi/generic/ibup.htm> .} | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:star]] | ||
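All triple patterns of a star query share the same subject variable (''?x'' above). As a rough in-memory illustration of what such a query asks for, the sketch below groups triples by subject and keeps the subjects that match every branch. It is a simplification under stated assumptions: it uses plain Scala collections instead of Spark, returns only the subjects rather than the full variable bindings, and the names ''Triple'', ''Branch'' and ''starSubjects'' are hypothetical. It is not the evaluation strategy studied in the publications.

```scala
// Minimal in-memory sketch of star-query matching (no Spark involved).
object StarSketch {
  final case class Triple(s: String, p: String, o: String)

  // A branch is a predicate plus an optional constant object.
  final case class Branch(p: String, o: Option[String] = None)

  // Subjects for which every branch is matched by at least one triple.
  def starSubjects(data: Seq[Triple], branches: Seq[Branch]): Set[String] =
    data.groupBy(_.s).filter { case (_, triples) =>
      branches.forall { b =>
        triples.exists(t => t.p == b.p && b.o.forall(_ == t.o))
      }
    }.keySet
}
```

For the star with 3 branches, the branches would be the ''foaf:page'' predicate with the Ibuprofen page as constant object, plus ''chebiId'' and ''casRegistryNumber'' without a constant.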
+ | ==== Chain queries ==== | ||
+ | |||
+ | Chain queries over the DBpedia data set. | ||
+ | |||
+ | === Chain4 query === | ||
+ | The Chain4 query is: | ||
+ | <code sparql> | ||
+ | SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 } | ||
+ | </code> | ||
+ | where the placeholders P1 to P4 stand for the following integer property identifiers: | ||
+ | <code scala> | ||
+ | val P1 = 1389363200 | ||
+ | val P2 = 52239 | ||
+ | val P3 = 1164541952 | ||
+ | val P4 = 1164156928 | ||
+ | </code> | ||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:chain4| Chain4 query plans]] | ||
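In a chain query, the object of each triple pattern is the subject of the next one. The sketch below evaluates such a chain over integer-encoded triples as a left-to-right sequence of joins, one per property. It is only an in-memory illustration with plain Scala collections, assuming triples are stored as (subject, property, object) identifiers; the names ''Triple'' and ''chain'' are hypothetical, and the actual Spark plans are on the linked page.

```scala
// Minimal in-memory sketch of chain-query evaluation (no Spark involved).
object ChainSketch {
  final case class Triple(s: Long, p: Long, o: Long)

  // Returns every path x1, x2, ..., x(n+1) such that consecutive nodes
  // are linked by the properties in `props`, in that order.
  def chain(data: Seq[Triple], props: Seq[Long]): Seq[Vector[Long]] =
    if (props.isEmpty) Seq.empty
    else {
      // Paths of length one: all triples carrying the first property.
      val start = data.filter(_.p == props.head).map(t => Vector(t.s, t.o))
      // Extend each partial path by one hop per remaining property.
      props.tail.foldLeft(start) { (paths, p) =>
        val bySubject = data.filter(_.p == p).groupBy(_.s)
        for {
          path <- paths
          t <- bySubject.getOrElse(path.last, Seq.empty)
        } yield path :+ t.o
      }
    }
}
```

For Chain4, the property list would be ''Seq(1389363200L, 52239L, 1164541952L, 1164156928L)'', taken from the values listed above.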
+ | |||
+ | === Chain6 query === | ||
+ | <code sparql> | ||
+ | SELECT ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 . ?x5 P5 ?x6 . ?x6 P6 ?x7 } | ||
+ | </code> | ||
+ | where the placeholders P1 to P6 stand for the following integer property identifiers: | ||
+ | <code scala> | ||
+ | val P1 = 18843 | ||
+ | val P2 = 5540 | ||
+ | val P3 = 1179222016 | ||
+ | val P4 = 1446076416 | ||
+ | val P5 = 1446244352 | ||
+ | val P6 = 36363 | ||
+ | </code> | ||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:chain6| Chain6 query plans]] | ||
+ | |||
+ | ==== Snowflake queries ==== | ||
+ | |||
+ | SPARQL query Q8 from the LUBM benchmark: | ||
+ | <code sparql> | ||
+ | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> | ||
+ | PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#> | ||
+ | SELECT ?X ?Y ?Z | ||
+ | WHERE | ||
+ | {?X rdf:type ub:Student . | ||
+ | ?Y rdf:type ub:Department . | ||
+ | ?X ub:memberOf ?Y . | ||
+ | ?Y ub:subOrganizationOf <http://www.University0.edu> . | ||
+ | ?X ub:emailAddress ?Z} | ||
+ | </code> | ||
+ | |||
+ | See [[en:site:recherche:logiciels:sparqlwithspark:snowflakeQ8]] | ||
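Q8 can be read as two stars, one centred on ''?X'' (student, e-mail address) and one centred on ''?Y'' (department, sub-organization), connected by the ''ub:memberOf'' pattern. The sketch below follows that reading on an in-memory triple list. It is an illustration only: it uses prefixed names as plain strings, the names ''Triple'' and ''q8'' are hypothetical, and it does not reproduce the Spark plans on the linked page.

```scala
// Minimal in-memory sketch of LUBM Q8 as two stars joined by ub:memberOf.
object SnowflakeSketch {
  final case class Triple(s: String, p: String, o: String)

  // Returns the (?X, ?Y, ?Z) bindings of Q8 for the given university.
  def q8(data: Seq[Triple], university: String): Set[(String, String, String)] = {
    def subjectsOf(p: String, o: String): Set[String] =
      data.collect { case t if t.p == p && t.o == o => t.s }.toSet

    // Star centred on ?Y: departments of the given university.
    val departments = subjectsOf("rdf:type", "ub:Department")
      .intersect(subjectsOf("ub:subOrganizationOf", university))

    // Star centred on ?X: students and their e-mail addresses.
    val students = subjectsOf("rdf:type", "ub:Student")
    val emails = data.filter(_.p == "ub:emailAddress").groupBy(_.s)

    // Join the two stars through ?X ub:memberOf ?Y.
    (for {
      m <- data if m.p == "ub:memberOf" && students(m.s) && departments(m.o)
      e <- emails.getOrElse(m.s, Seq.empty)
    } yield (m.s, m.o, e.o)).toSet
  }
}
```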
+ | |||
+ | |||
+ | ===== Misc ===== | ||
+ | [[en:site:recherche:logiciels:sparqlwithspark:utility| Utility tools]] | ||
+ | |||
+ | |||
+ | |||
+ |