Hello,
I am trying to analyze the query plan of the following query:
SELECT (COUNT(*) AS ?total) FROM <http://www.wikidata.org/> WHERE{
?x2 <http://www.wikidata.org/prop/direct/P106> <http://www.wikidata.org/entity/Q81096> .
?x2 <http://www.wikidata.org/prop/direct/P19> ?x3 . } GROUP BY ?x2
Firstly, I activate the query profile by executing the command “prof_enable (1);” on the isql.
Then I run the query directly to isql and I also try the same query using java JDBC.
String urlDB = "jdbc:virtuoso://" +host+ ":" + port;
Connection conn = DriverManager.getConnection(urlDB, user, psw);
String command = "sparql " + query;
st = conn.createStatement();
ResultSet result = st.executeQuery(command);
ok = result.next();
st.close();
Finally, I am running the following command in the isql:
SELECT ql_id, ql_text, ql_plan FROM sys_query_log WHERE qrl_file = 'virtuoso.qrl' AND qrl_start_dt = cast ('2023-01-18' as datetime);
In the first case with isql I get this:
34 #line 17 "(console)"
sparql SELECT (COUNT(*) AS ?total) FROM <http://www.wikidata.org/> WHERE{ ?x2 <http://www.wikidata.org/prop/direct/P106> <http://www.wikidata.org/entity/Q81096> . ?x2 <http://www.wikidata.org/prop/direct/P19> ?x3 . } GROUP BY ?x2 {
time 0.017% fanout 1 input 1 rows
Subquery 27
{
time 0.016% fanout 1 input 1 rows
fork {
time 5% fanout 39783 input 1 rows
RDF_QUAD_POGS 4e+04 rows(s_1_2_t0.S$31)
P = IRI_ID"...direct/P106" , O = IRI_ID"...Q81096" G = IRI_ID"..."
time 46% fanout 0.612875 input 39783 rows
RDF_QUAD 1 rows()
inlined P = IRI_ID"...direct/P19" , S = s_1_2_t0.S$31 G = IRI_ID"..."
time 21% fanout 0 input 24382 rows
Sort (s_1_2_t0.S$31) -> (inc$39)
}
time 6% fanout 24256 input 1 rows
group by read node
(s_1_2_t0.S$31, aggregate$37)
After code:
0: total$28 := := artm aggregate$37
4: BReturn 0
time 0.029% fanout 0 input 24256 rows
Subquery Select(total$28)
}
After code:
0: total$69 := Call __ro2sq (total$28)
5: BReturn 0
time 22% fanout 0 input 24256 rows
Select (total$69)
}
While in the second case through java I get this:
25 sparql SELECT (COUNT(*) AS ?total) FROM <http://www.wikidata.org/> WHERE{ ?x2 <http://www.wikidata.org/prop/direct/P106> <http://www.wikidata.org/entity/Q81096> . ?x2 <http://www.wikidata.org/prop/direct/P19> ?x3 . } GROUP BY ?x2
{
time 0.024% fanout 1 input 1 rows
Subquery 27
{
time 0.02% fanout 1 input 1 rows
fork {
time 7% fanout 39783 input 1 rows
RDF_QUAD_POGS 4e+04 rows(s_2_2_t0.S$31)
P = IRI_ID"...direct/P106" , O = IRI_ID"...Q81096" G = IRI_ID"..."
time 63% fanout 0.612875 input 39783 rows
RDF_QUAD 1 rows()
inlined P = IRI_ID"...direct/P19" , S = s_2_2_t0.S$31 G = IRI_ID"..."
time 29% fanout 0 input 24382 rows
Sort (s_2_2_t0.S$31) -> (inc$39)
}
time 0.55% fanout 2000 input 1 rows
group by read node
(s_2_2_t0.S$31, aggregate$37)
After code:
0: total$28 := := artm aggregate$37
4: BReturn 0
time 0.011% fanout 0 input 2000 rows
Subquery Select(total$28)
}
After code:
0: total$69 := Call __ro2sq (total$28)
5: BReturn 0
time 0.15% fanout 0 input 2000 rows
Select (total$69)
I am trying to understand why these two query plans are different. For example, there is this line regarding the returned results in the isql " time 22% fanout 0 input 24256 rows" while running through java the query plan contains “time 0.15% fanout 0 input 2000 rows”.