I’m starting to use Virtuoso’s linked data for running SPARQL queries over relational DB data, very cool!
Saw something I don’t understand. I’m running this SPARQL query (namespaces replaced to protect the innocent):
prefix gi: <http://my/ns/>
with <http://my/graph#>
select ?c {
?ol gi:order <http://my/graph/order_header/Order_ID/1#this>;
gi:productQuantityCount ?c
}
And besides the values I expect for ?c, I get an extra NULL.
The query gets translated to this SQL:
SELECT "s_0_10"."c" AS "c"
FROM (
SELECT NULL AS "c" ANY
FROM
(SELECT TOP 1 1 AS __stub FROM DB.DBA.SYS_IDONLY_ONE TABLE OPTION (ISOLATION READ UNCOMMITTED, NO __LOCK)) AS "stub-s_0_8"
UNION ALL SELECT "s_1_0_t1-qm1"."4aca556dfa4c523" AS "c"
FROM (SELECT "s_1_0_t0-c2-int~tbl6"."Order_Line_ID" AS "3b0afff137aff5c2", "s_1_0_t0-c2-int~tbl5"."Order_ID" AS "73b75e395d1b26b6" FROM "sample"."AzureSQL"."order_header" AS "s_1_0_t0-c2-int~tbl5", "sample"."AzureSQL"."order_line" AS "s_1_0_t0-c2-int~tbl6"
WHERE
("s_1_0_t0-c2-int~tbl6"."Order_ID" = "s_1_0_t0-c2-int~tbl5"."Order_ID")
AND
("s_1_0_t0-c2-int~tbl6"."Order_Line_ID" is not null)
AND
"s_1_0_t0-c2-int~tbl5"."Order_ID" = 1) AS "s_1_0_t0-c2"
INNER JOIN (SELECT "s_1_0_t1-qm1-int~tbl6"."Order_Line_ID" AS "3b0afff137aff5c2", "s_1_0_t1-qm1-int~tbl6"."Product_Quantity_Count" AS "4aca556dfa4c523" FROM "sample"."AzureSQL"."order_line" AS "s_1_0_t1-qm1-int~tbl6"
WHERE
("s_1_0_t1-qm1-int~tbl6"."Order_Line_ID" is not null)
AND
("s_1_0_t1-qm1-int~tbl6"."Product_Quantity_Count" is not null)) AS "s_1_0_t1-qm1"
ON (
"s_1_0_t0-c2"."3b0afff137aff5c2" = "s_1_0_t1-qm1"."3b0afff137aff5c2")) AS "s_0_10"
OPTION (QUIETCAST)
and the extra NULL is coming out of that first SELECT NULL AS "c"
clause up top.
If I copy the relevant triples to a regular graph, and run the same SPARQL query, I don’t get the NULL (the SQL translation is, of course, different, much shorter.)
Any ideas what might be going on? Many thanks in advance! I’m running Virtuoso Version 08.03.3332-pthreads for Linux as of Sep 12 2024 (363f8fdcf3).
I presume the SPARQL to SQL text you provide is from the Generate SPARQL compilation report
check box of the SPARQL endpoint form page with the Explain query
button ?
Using the SQL Server Northwind
sample database I Linked the Orders
and Customers
tables into a Virtuoso 8.3 instance and created transient/virtual RDF Views of them, and running the following query similar in construct to yours:
with <http://hfw.openlinksw.com:8332/Northwind#>
select ?c {
?ol <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hfw.openlinksw.com:8332/schemas/Northwind/Orders>;
<http://hfw.openlinksw.com:8332/schemas/Northwind/customerid> ?c
}
The query executes returning a list of customer id’s none of which are NULL, that despite generating a similar SPARQL to SQL query translation to yours with the SELECT NULL AS "c" ANY
construct in it:
SELECT "s_0_10"."c" AS "c"
FROM (
SELECT NULL AS "c" ANY
FROM
(SELECT TOP 1 1 AS __stub FROM DB.DBA.SYS_IDONLY_ONE TABLE OPTION (ISOLATION READ UNCOMMITTED, NO __LOCK)) AS "stub-s_0_8"
UNION ALL SELECT "s_5_0_t1-qm1"."aa3214a4adb504f" AS "c"
FROM (SELECT "s_5_0_t0-c2-int~orders_s"."OrderID" AS "6850c5a95fe682d6" FROM "Northwind"."sqllu_uda9"."Orders" AS "s_5_0_t0-c2-int~orders_s") AS "s_5_0_t0-c2"
INNER JOIN (SELECT "s_5_0_t1-qm1-int~orders_s"."OrderID" AS "6850c5a95fe682d6", "s_5_0_t1-qm1-int~orders_s"."CustomerID" AS "aa3214a4adb504f" FROM "Northwind"."sqllu_uda9"."Orders" AS "s_5_0_t1-qm1-int~orders_s"
WHERE
("s_5_0_t1-qm1-int~orders_s"."CustomerID" is not null)) AS "s_5_0_t1-qm1"
ON (
"s_5_0_t0-c2"."6850c5a95fe682d6" = "s_5_0_t1-qm1"."6850c5a95fe682d6")) AS "s_0_10"
OPTION (QUIETCAST)
So that is not what is causing the NULL
value you are seeing I would think.
Could there be a NULL
value in the actual remote SQL data being queried some where ?
If you are able to provide a sample test case against the SQL Server Northwind
database then we could attempt to recreate in-house ?
I got the SQL from running string_to_file ('debug.sql', sparql_to_sql_text (<query), -2);
.
I concluded that the NULL was coming from the first clause because when I run a partial SQL query that only contains that part of the UNION, I get a single NULL result (the partial query is below.)
FROM (
SELECT NULL AS "c" ANY
FROM
(SELECT TOP 1 1 AS __stub FROM DB.DBA.SYS_IDONLY_ONE
TABLE OPTION (ISOLATION READ UNCOMMITTED, NO __LOCK)) AS "stub-s_0_8"
) AS "s_0_10" OPTION (QUIETCAST);
Forgot to say: once again, thanks for a super quick reply (on the weekend even)!
A loaded
question, via which Virtuoso interface are you running the SPARQL query when seeing this NULL
value, as you seem to like doing things via the SQL interface ???
As I have been running the SPARQL query from the the Virtuoso SPARQL endpoint (ie http:hostname:portno.sparql) and a NULL
value is not returned.
But when I run it from the Virtuoso isql
command line tool the NULL
value is returned:
SQL> SPARQL with <http://hfw.openlinksw.com:8332/Northwind#> select distinct ?c { ?ol <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hfw.openlinksw.com:8332/schemas/Northwind/Orders>; <http://hfw.openlinksw.com:8332/schemas/Northwind/customerid> ?c };
c
LONG VARCHAR
_______________________________________________________________________________
NULL
ALFKI
ANATR
ANTON
...
WHITC
WILMK
WOLZA
90 Rows. -- 1415 msec.
SQL>
So, I can recreate the problem with via the SQL interface, but it does not occur when executed via the SPARQL endpoint.
I am going to check this behaviour with development …
You’re right, I was using isql! I guess that makes it less of a practical issue, since production access is going to be over http.
The issue will be fixed regardless, as executing SPARQL via SQL (ie SPASQL) is an equally important means of accessing RDF data in the Virtuoso RDF Quad store as the RestFul SPARQL endpoint.
Confirmed: not getting the extra NULL via the HTTP Sparql interface. Thanks again for your help!