Databases

LIP6 DB Web Site


This is an old revision of the document!


SPARQL query processing with Apache Spark

This web page is a companion to the “SPARQL query processing with Apache Spark” paper submitted to EDBT 2017.

It provides access to resources related to the evaluation section of the paper.

Data sets

Query processing

Star queries

Chain queries

Chain queries over the DBPedia data set.

Chain4 query

The Chain4 query is:

SELECT ?x1 ?x2 ?x3 ?x4 ?x5
WHERE { ?x1 p1 ?x2 . ?x2 p2 ?x3 . ?x3 p3 ?x4 . ?x4 p4 ?x5 }

The plans below refer to the predicates p1 to p4 through the following integer identifiers:

val P1 = 1389363200
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928

The SPARQL DF method produces the plan:

// One selection per triple pattern: filter df on the predicate p, then rename s and o to the pattern's variables
val t1 = df.where(s"p = $P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2")
val t2 = df.where(s"p = $P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val t3 = df.where(s"p = $P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val t4 = df.where(s"p = $P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// Left-deep sequence of joins on the shared variables x2, x3 and x4
val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4"))
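
To try the DF plan without the DBPedia data, the following minimal sketch runs the same selections and joins on a toy triple table. The local SparkSession, the toy predicate identifiers 1 to 4, the toy triples, and the helper name pattern are assumptions made for illustration; only the (s, p, o) column names and the join structure come from the plan above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Assumption: a local session; inside spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder().master("local[*]").appName("chain4-toy").getOrCreate()
import spark.implicits._

// Toy triples (hypothetical data): predicates are encoded as 1..4, not as the DBPedia identifiers.
val toy = Seq(
  (10L, 1L, 20L), (20L, 2L, 30L), (30L, 3L, 40L), (40L, 4L, 50L), // one complete chain
  (11L, 1L, 21L), (21L, 2L, 31L)                                  // a chain that stops after p2
).toDF("s", "p", "o")

// One triple pattern "?subj p ?obj": select on the predicate, then rename s and o.
def pattern(p: Long, subj: String, obj: String): DataFrame =
  toy.where(s"p = $p").select("s", "o")
    .withColumnRenamed("s", subj).withColumnRenamed("o", obj)

// Same left-deep join order as the plan above.
val toyRes = pattern(1, "x1", "x2")
  .join(pattern(2, "x2", "x3"), Seq("x2"))
  .join(pattern(3, "x3", "x4"), Seq("x3"))
  .join(pattern(4, "x4", "x5"), Seq("x4"))

// Joins on Seq(...) place the join column first, so reorder to match the SELECT clause.
val ordered = toyRes.select("x1", "x2", "x3", "x4", "x5")
ordered.show()
```

Only the complete chain survives the three joins, so ordered holds a single row. The explicit select at the end matters when the column order of the result has to match the SELECT clause of the query.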

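The SPARQL Hybrid DF method first persists the subgraph of the p2, p3 and p4 triples, then evaluates the selections for these predicates over it: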

val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
// Keep only the triples whose predicate is p2, p3 or p4, and cache them
val subg = df.where(s"p in ($P2, $P3, $P4)")
subg.persist()
subg.count()  // action that forces the persisted subgraph to be materialized
val st2 = subg.where(s"p = $P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val st3 = subg.where(s"p = $P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val st4 = subg.where(s"p = $P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// t1 is the p1 selection over df defined in the DF plan above
val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4"))
res.count()
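
As a sanity check, both plans should return the same number of results. The sketch below compares them on a toy triple table. The local session, the toy data, and the helper names pattern and chain are illustrative assumptions, and the final unpersist() is an addition that does not appear in the plan above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Assumption: a local session; inside spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder().master("local[*]").appName("hybrid-toy").getOrCreate()
import spark.implicits._

// Toy triples (hypothetical data) with predicates encoded as 1..4.
val toy = Seq(
  (10L, 1L, 20L), (20L, 2L, 30L), (30L, 3L, 40L), (40L, 4L, 50L),
  (11L, 1L, 21L), (21L, 2L, 31L)
).toDF("s", "p", "o")

// Selection for one triple pattern over a given source table.
def pattern(src: DataFrame, p: Long, subj: String, obj: String): DataFrame =
  src.where(s"p = $p").select("s", "o")
    .withColumnRenamed("s", subj).withColumnRenamed("o", obj)

// Chain of four patterns: p1 is read from `first`, p2..p4 are read from `rest`.
def chain(first: DataFrame, rest: DataFrame): DataFrame =
  pattern(first, 1, "x1", "x2")
    .join(pattern(rest, 2, "x2", "x3"), Seq("x2"))
    .join(pattern(rest, 3, "x3", "x4"), Seq("x3"))
    .join(pattern(rest, 4, "x4", "x5"), Seq("x4"))

// DF plan: every selection reads the full triple table.
val dfCount = chain(toy, toy).count()

// Hybrid DF plan: p2..p4 selections read a persisted subgraph.
val sub = toy.where("p in (2, 3, 4)").persist()
sub.count() // forces the cache to be filled
val hybridCount = chain(toy, sub).count()
sub.unpersist() // release the cached subgraph once the query is done

println(s"DF: $dfCount, Hybrid DF: $hybridCount")
```

The two counts are equal because the persisted subgraph contains every triple that the p2, p3 and p4 selections can match; only the data source of these selections changes, not the query result.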

Chain6 query

Snowflake queries

WatDiv queries

site/recherche/logiciels/sparqlwithspark.1473781766.txt.gz · Last modified: 13/09/2016 17:49 by hubert