This web page is a companion to the paper “SPARQL query processing with Apache Spark”, submitted to EDBT 2017.
It provides access to some resources related to the evaluation section.
Chain queries over the DBpedia data set.
Chain4 is
SELECT ?x1 ?x2 ?x3 ?x4 ?x5
WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 }
with properties
val P1 = 1389363200
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
The plans produced by each method are:
val t1 = df.where(s"p=$P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2")
val t2 = df.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val t3 = df.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val t4 = df.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4"))

val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
// Select the triples of P2, P3 and P4 in one pass, then cache that subgraph.
val subg = df.where(s"p in ($P2, $P3, $P4)")
subg.persist
subg.count  // forces evaluation so the cached subgraph is materialized
val st2 = subg.where(s"p= $P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val st3 = subg.where(s"p= $P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val st4 = subg.where(s"p= $P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// t1 (the P1 selection) is not redefined here; it is the one from the previous plan.
val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4"))
res.count
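
To make concrete what both plans compute, here is a minimal plain-Scala sketch that evaluates the same chain pattern over a handful of in-memory triples, without Spark. It is only an illustration, not the method evaluated in the paper: the names ChainQuerySketch, Triple, select, extend, chain and sample are hypothetical, and the subject and object identifiers in sample are made up. Only the four property identifiers come from this page.

```scala
// Toy illustration of the Chain4 pattern on in-memory data (no Spark needed).
// All names and the sample triples are assumptions made for this sketch.
object ChainQuerySketch {
  final case class Triple(s: Long, p: Long, o: Long)

  // Property identifiers taken from the page.
  val P1 = 1389363200L
  val P2 = 52239L
  val P3 = 1164541952L
  val P4 = 1164156928L

  // Made-up triples: one complete chain with two endings, plus two distractors.
  val sample: Seq[Triple] = Seq(
    Triple(1, P1, 2), Triple(2, P2, 3), Triple(3, P3, 4),
    Triple(4, P4, 5), Triple(4, P4, 6),
    Triple(7, P1, 8), // dead end: no P2 triple starts at 8
    Triple(2, P3, 9)  // right subject, wrong property for the second hop
  )

  // Counterpart of df.where(s"p=$P").select("s","o"): (subject, object) pairs of one property.
  def select(triples: Seq[Triple], p: Long): Seq[(Long, Long)] =
    triples.filter(_.p == p).map(t => (t.s, t.o))

  // Counterpart of one join step: extend each partial binding with the pairs
  // whose subject equals the last value bound so far.
  def extend(partial: Seq[Vector[Long]], pairs: Seq[(Long, Long)]): Seq[Vector[Long]] = {
    val bySubject = pairs.groupBy(_._1)
    for {
      row  <- partial
      pair <- bySubject.getOrElse(row.last, Seq.empty)
    } yield row :+ pair._2
  }

  // Evaluate a chain of properties from left to right, like t1.join(t2).join(t3).join(t4).
  def chain(triples: Seq[Triple], props: Seq[Long]): Seq[Vector[Long]] =
    if (props.isEmpty) Seq.empty
    else {
      val start = select(triples, props.head).map(so => Vector(so._1, so._2))
      props.tail.foldLeft(start)((partial, p) => extend(partial, select(triples, p)))
    }
}
```

Calling ChainQuerySketch.chain(ChainQuerySketch.sample, Seq(P1, P2, P3, P4)) yields one vector of bindings (x1, ..., x5) per match. The two plans above produce the same kind of result and differ only in where the per-property selections read from: the full df, or the persisted subgraph subg.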