Databases

LIP6 DB Web Site


This is an old revision of the document!


SPARQL query processing with Apache Spark

This web page is a companion to the paper “SPARQL query processing with Apache Spark”, submitted to EDBT 2017.

It provides access to resources related to the paper's evaluation section.

Data sets

  • DrugBank
  • DBpedia
  • LUBM
  • WatDiv

Query processing

Star queries

Chain queries

Chain queries over the DBpedia data set.

Chain4 query

The Chain4 query is:

SELECT ?x1 ?x2 ?x3 ?x4 ?x5 WHERE { ?x1 P1 ?x2 . ?x2 P2 ?x3 . ?x3 P3 ?x4 . ?x4 P4 ?x5 }

where the properties P1 to P4 are given by their integer identifiers:

val P1 = 1389363200
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
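To make the meaning of the chain pattern concrete, here is a minimal sketch in plain Scala (no Spark) that evaluates such a chain over a small in-memory set of encoded triples. The `Triple` case class, the `chain` helper and the toy data are hypothetical and only serve as illustration; they are not part of the paper's code.

```scala
object ChainSemantics {
  // An encoded triple: subject, property and object are integer identifiers.
  final case class Triple(s: Long, p: Long, o: Long)

  // Returns one binding list (x1, ..., x_{n+1}) per path that follows `props` in order.
  def chain(triples: Seq[Triple], props: Seq[Long]): Seq[List[Long]] = {
    // Index the objects by (property, subject) for the lookups below.
    val byPS: Map[(Long, Long), Seq[Long]] =
      triples.groupBy(t => (t.p, t.s)).map { case (k, ts) => k -> ts.map(_.o) }
    // Bindings of the first pattern: ?x1 P1 ?x2.
    val start = triples.filter(_.p == props.head).map(t => List(t.s, t.o))
    // Extend every partial path by one hop per remaining property.
    props.tail.foldLeft(start) { (paths, p) =>
      for {
        path <- paths
        next <- byPS.getOrElse((p, path.last), Seq.empty)
      } yield path :+ next
    }
  }

  def main(args: Array[String]): Unit = {
    // Toy data (assumption): properties 10, 20, 30, 40 stand in for P1..P4.
    val toy = Seq(
      Triple(1, 10, 2), Triple(2, 20, 3), Triple(3, 30, 4), Triple(4, 40, 5),
      Triple(9, 10, 8) // matches the first pattern only, so it yields no result
    )
    println(chain(toy, Seq(10L, 20L, 30L, 40L)))
  }
}
```

Each hop corresponds to one join on a shared variable (x2, x3, x4), which is the work the Spark plans distribute.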

The plans produced by each method are:

  • SPARQL DF:
// df is assumed to hold the encoded triples in columns s, p and o.
// One selection per triple pattern; s and o are renamed to the pattern's variables.
val t1 = df.where(s"p=$P1").select("s","o").withColumnRenamed("s", "x1").withColumnRenamed("o", "x2")
val t2 = df.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val t3 = df.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val t4 = df.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// Join the selections along the chain on their shared variables.
val res = t1.join(t2,Seq("x2")).join(t3,Seq("x3")).join(t4,Seq("x4"))
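  • SPARQL Hybrid DF:
val P2 = 52239
val P3 = 1164541952
val P4 = 1164156928
// Extract the subgraph holding only the triples of P2, P3 and P4, and cache it.
val subg = df.where(s"p in ($P2, $P3, $P4)")
subg.persist()
subg.count()  // action that forces the cached subgraph to be materialized
// The remaining selections read the cached subgraph instead of the full df.
val st2 = subg.where(s"p=$P2").select("s","o").withColumnRenamed("s", "x2").withColumnRenamed("o", "x3")
val st3 = subg.where(s"p=$P3").select("s","o").withColumnRenamed("s", "x3").withColumnRenamed("o", "x4")
val st4 = subg.where(s"p=$P4").select("s","o").withColumnRenamed("s", "x4").withColumnRenamed("o", "x5")
// t1 is the P1 selection over the full df, as defined in the SPARQL DF plan above.
val res = t1.join(st2,Seq("x2")).join(st3,Seq("x3")).join(st4,Seq("x4"))
res.count()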
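
Both plans have the same shape: one selection per triple pattern, followed by a left-deep sequence of joins on the shared variables. The hybrid variant additionally caches the subgraph restricted to P2, P3 and P4 before selecting from it. As a sketch only, and not the paper's implementation, the same shape can be written for a chain of any length. The helper names `pattern`, `chainPlan` and `hybridChainPlan`, the toy triples and the local `SparkSession` are assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ChainPlanSketch {
  // Selection for the i-th triple pattern (?x_i p ?x_{i+1}), with i starting at 1.
  def pattern(triples: DataFrame, p: Long, i: Int): DataFrame =
    triples.where(s"p = $p").select("s", "o").toDF(s"x$i", s"x${i + 1}")

  // DF-style plan: every selection reads the full triple table.
  def chainPlan(triples: DataFrame, props: Seq[Long]): DataFrame = {
    val patterns = props.zipWithIndex.map { case (p, i) => pattern(triples, p, i + 1) }
    // The k-th join (k = 0, 1, ...) is on the shared variable x_{k+2}.
    patterns.tail.zipWithIndex.foldLeft(patterns.head) {
      case (acc, (t, k)) => acc.join(t, Seq(s"x${k + 2}"))
    }
  }

  // Hybrid-style plan: the triples of the remaining properties are cached once,
  // and only the first selection reads the full triple table.
  def hybridChainPlan(triples: DataFrame, first: Long, rest: Seq[Long]): DataFrame = {
    val subg = triples.where(s"p in (${rest.mkString(", ")})").persist()
    subg.count() // action that forces the cache to be filled
    rest.zipWithIndex.foldLeft(pattern(triples, first, 1)) {
      case (acc, (p, k)) => acc.join(pattern(subg, p, k + 2), Seq(s"x${k + 2}"))
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("chain-sketch").getOrCreate()
    import spark.implicits._
    // Toy encoded triples (assumption): properties 10, 20, 30, 40 stand in for P1..P4.
    val df = Seq((1L, 10L, 2L), (2L, 20L, 3L), (3L, 30L, 4L), (4L, 40L, 5L), (9L, 10L, 8L))
      .toDF("s", "p", "o")
    println(chainPlan(df, Seq(10L, 20L, 30L, 40L)).count())
    println(hybridChainPlan(df, 10L, Seq(20L, 30L, 40L)).count())
    spark.stop()
  }
}
```

On the same input both functions should return the same bindings; they differ only in where the selections for the later patterns read their data from.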

Chain6 query

Snowflake queries

WatDiv queries

site/recherche/logiciels/sparqlwithspark.1473782027.txt.gz · Last modified: 13/09/2016 17:53 by hubert