Transitivity in SPARQL
Virtuoso SPARQL allows access to Virtuoso’s SQL transitivity extension (read the SQL Transitivity Note for details about this feature).
The syntax used in SPARQL is slightly different from the syntax used in SQL, although the option names and meanings are the same.
In SPARQL, the transitive options occur after a subquery enclosed in braces.
The query below produces all the IRI’s that are the same as <http://dbpedia.org/resource/New_York>
:
SPARQL
SELECT ?syn
WHERE
{
{
SELECT ?x
?syn
WHERE
{
{ ?x owl:sameAs ?syn }
UNION
{ ?syn owl:sameAs ?x }
}
}
OPTION ( TRANSITIVE ,
T_IN ( ?x ) ,
T_OUT ( ?syn) ,
T_DISTINCT ,
T_MIN ( 0)
)
FILTER ( ?x = <http://dbpedia.org/resource/New_York> ) .
}
In this case, we provide a binding for ?x
in the FILTER
outside of the transitive subquery. The subquery therefore is made to run from in to out. The same effect would be accomplished if we bound ?syn
and SELECT ?x
. The designations of in and out are arbitrary, and for transitive steps that can be evaluated equally well in both directions, this makes no difference.
The transitive subquery in the query above is shown below:
SELECT ?syn
WHERE
{
{ SELECT ?x
?syn
WHERE
{
{ ?x owl:sameAs ?syn }
UNION
{ ?syn owl:sameAs ?x}
}
}
OPTION ( TRANSITIVE ,
T_IN ( ?x ) ,
T_OUT ( ?syn ) ,
T_DISTINCT ,
T_MIN ( 0 ) )
}
}
If the OPTION
were left out, it would just look for one step of owl:sameAs
. Making it TRANSITIVE
will apply the subquery to all bindings produced until all are visited at least once (the T_DISTINCT
modifier).
If the TRANSITIVE
step consists of a single triple pattern, there is a shorthand:
<alice> foaf:knows ?friend
OPTION ( TRANSITIVE
T_MIN (1)
)
This will bind ?friend
to all directly and indirectly found foaf:known
individuals. If T_MIN
had been 0
, <alice>
would have also been in the generated bindings.
The syntax is:
OPTION (TRANSITIVE transitivity_option [,...] )
transitivity_option ::= T_IN ( <variable_list> )
| T_OUT ( <variable_list> )
| T_DISTINCT
| T_SHORTEST_ONLY
| T_NO_CYCLES
| T_CYCLES_ONLY
| T_MIN ( intnum )
| T_MAX ( intnum )
| T_END_FLAG ( <variable> )
| T_STEP ( <variable_or_step> )
| T_DIRECTION intnum
variable_list ::= <variable> [,...]
variable_or_step ::= <variable> | 'path_id' | 'step_no'
Where column numbers are used in SQL, variable names are used in SPARQL. Otherwise all the options have the same meaning.
Some examples of the use of transitivity follow.
Collection of Transitivity Option Demo Queries for SPARQL
Example 1: owl:sameAs
for <http://dbpedia.org/resource/New_York>
This query finds all graphs that contain owl:sameAs
for <http://dbpedia.org/resource/New_York>
.
Here we select all paths that start with the initial URI and pass through one or more owl:sameAs
statements. Each step produces a result of the transitive subquery. The graph where the owl:sameAs
triple was found is returned and used as the grouping column. In this way, we see how many times each graph is used. Note that graphs are counted many times since the graphs containing immediate owl:sameAs
statements are counted for paths of length 1, then again as steps on paths that reach to their aliases and so on.
SELECT ?g
?x
( COUNT (*) AS ?count )
WHERE
{
{
SELECT ?x
?alias
?g
WHERE
{
{
GRAPH ?g
{ ?x owl:sameAs ?alias }
}
UNION
{
GRAPH ?g
{?alias owl:sameAs ?x}
}
}
}
OPTION ( TRANSITIVE,
T_IN ( ?x ),
T_OUT ( ?alias ),
T_DISTINCT,
T_MIN ( 1 )
) .
FILTER ( ?x = <http://dbpedia.org/resource/New_York> ) .
}
Example 2: people known by Tim Berners-Lee
This query takes all the people known by Tim Berners-Lee, to a depth between 1 and 4 applications of the subquery. It then sorts them by the distance and the descending count of connections of each found connection. This is equivalent to the default connections list shown by LinkedIn.
SPARQL
SELECT ?o
?dist
( ( SELECT ( COUNT (*) AS ?cnt )
WHERE { ?o foaf:knows ?xx }
) )
WHERE
{
{
SELECT ?s
?o
WHERE
{
?s foaf:knows ?o
}
}
OPTION ( TRANSITIVE ,
T_DISTINCT ,
T_IN ( ?s ) ,
T_OUT ( ?o ) ,
T_MIN ( 1 ) ,
T_MAX ( 4 ) ,
T_STEP ( 'step_no' ) AS ?dist ) .
FILTER ( ?s = <http://www.w3.org/People/Berners-Lee/card#i> )
}
ORDER BY ?dist DESC 3
LIMIT 50
Example 3: people known by Tim Berners-Lee (modified)
The following query takes all the people known by <http://www.w3.org/People/Berners-Lee/card#i>
, to a depth between 2 and 4 applications of the subquery. It then sorts them by the distance and the descending count of connections of each found connection. This is equivalent to the default connections list shown by LinkedIn.
SPARQL
SELECT ?o
?dist
( ( SELECT ( COUNT (*) as ?cnt )
WHERE { ?o foaf:knows ?xx }
) )
WHERE
{
{
SELECT ?s ?o
WHERE
{
?s foaf:knows ?o
}
}
OPTION ( TRANSITIVE ,
T_DISTINCT ,
T_IN ( ?s ) ,
T_OUT ( ?o ) ,
T_MIN ( 2 ) ,
T_MAX ( 4 ) ,
T_STEP ( 'step_no' ) AS ?dist) .
FILTER ( ?s = <http://www.w3.org/People/Berners-Lee/card#i> )
}
ORDER BY ?dist DESC 3
LIMIT 50
Example 4: How do Tim Berners-Lee and Miguel Paraz Know Each Other?
The next query finds how two people know each other and what graphs are involved in the connection.
This query binds both the T_IN
and T_OUT
variables. The ?g
is left as a free variable. Also, specifying ?s
and the system defined constants step_no
and path_id
as with t_step
, we get for each transitive step a row of results with the intermediate binding of ?s
, the count of steps from the initial ?s
and a distinct identifier for the individual path, since there can be many distinct paths that link the ?s
and ?o
specified in the filter.
SPARQL
SELECT ?link
?g
?step
?path
WHERE
{
{
SELECT ?s
?o
?g
WHERE
{
GRAPH ?g
{ ?s foaf:knows ?o }
}
}
OPTION ( TRANSITIVE ,
T_DISTINCT ,
T_IN ( ?s ) ,
T_OUT ( ?o ) ,
T_NO_CYCLES ,
T_SHORTEST_ONLY ,
T_STEP ( ?s ) AS ?link ,
T_STEP ( 'path_id' ) AS ?path ,
T_STEP ( 'step_no' ) AS ?step ,
T_DIRECTION 3) .
FILTER ( ?s = <http://www.w3.org/People/Berners-Lee/card#i>
&& ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>
)
}
LIMIT 20
Related
- Community Forum Note about the Transitivity Extension for SQL
-
SQL and Transitivity Documentation – this is where
step_no
andpath_id
are explained - SPARQL and Transitivity Documentation