Faceted Browser (FCT) Aggregates and Filters

Hi,

This thread is a continuation of this mis-posted issue on GitHub.

I want the number of distinct occurrences of ?s1 as the object of dbpedia:supports, where ?s1 has the keyword 'batman' as the value of one or more of its properties. Am I correct that both of the following queries express these semantics?

There are 9000 records that use ?s1 as dbpedia:supports:
SPARQL
Facets (see the “supports” item under the Used As column)

When these records are the subject, the number is 34:
SPARQL
Facets

If they are different, I would be grateful to know how they differ. I notice that the triple ?s1o ?s1ip ?s1 was added to the first query, but doesn’t filter the URI value of ?s1ip to dbpedia:supports.
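To make the suspected difference concrete, here is a small Python sketch over made-up triples (the `ex:` names and the data are hypothetical, not taken from DBpedia). It only illustrates how leaving the predicate unbound, as in `?s1o ?s1ip ?s1`, can count more subjects than restricting it to dbpedia:supports; it is not what FCT itself executes.

```python
# Toy triples (hypothetical data, not taken from DBpedia).
SUPPORTS = "dbpedia:supports"
triples = [
    ("ex:a", SUPPORTS, "ex:batman_film"),
    ("ex:b", "ex:related", "ex:batman_film"),
    ("ex:c", "ex:related", "ex:batman_comic"),
    ("ex:d", SUPPORTS, "ex:superman_film"),
]

# Resources assumed to have matched the text search for 'batman'.
text_matches = {"ex:batman_film", "ex:batman_comic"}


def count_as_object(triples, candidates, predicate=None):
    """Count distinct candidates that occur as the object of some triple.

    predicate=None accepts any predicate, mirroring a pattern such as
    `?s1o ?s1ip ?s1` where ?s1ip is left unbound.
    """
    return len({
        o for (_s, p, o) in triples
        if o in candidates and (predicate is None or p == predicate)
    })


unfiltered = count_as_object(triples, text_matches)          # any predicate
filtered = count_as_object(triples, text_matches, SUPPORTS)  # only dbpedia:supports
print(unfiltered, filtered)
```

If the generated query behaves like the unfiltered call, the larger count would be expected rather than the count for dbpedia:supports alone.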

(Note: both of the above SPARQL queries were generated by the FCT service; just proxy the traffic on the Facets links above if you wish to inspect them.)

-sherman

Likewise, when I filter on the class, I expect to see 34 Works (the count next to Records and the count next to Works should match), but there are instead 171 Works. It is the same pattern: the filter on the subject is missing from the SPARQL query.

The Virtuoso Universal Server -Enterprise Edition- BYOL AMI-8-2-0-ami appears to be affected by this issue: the FCT results are not filtered, and the filter clause appears to be left out of the query and is instead prepended to the XML document returned in the FCT response. E.g.:

HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Wed, 27 Feb 2019 14:58:15 GMT
Content-Type: text/xml; charset=UTF-8
Connection: close
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
accept-ranges: bytes
Content-Length: 8067

 filter (?s1 = <http://xmlns.com/foaf/0.1/Person>) .<fct:facets xmlns:fct="http://openlinksw.com/services/facets/1.0/">
<fct:sparql>     select ?s1 as ?c1 count (*) as ?c2 where { select distinct ?s1  {?s1 a &lt;http://www.w3.org/2002/07/owl#Class&gt; . } } group by ?s1 order by desc 2 limit 30  offset 0 </fct:sparql>
<fct:time>42</fct:time>
<fct:complete>yes</fct:complete>
<fct:timeout>15520</fct:timeout>
<fct:db-activity> 2.771K rnd  10.37K seq   1.74K same seg     120 same pg      0 same par      0 disk      0 spec disk      0B /      0 messages      0 fork</fct:db-activity>
 <fct:result type="list-count">
  <fct:row>

Please share links to the following to simplify understanding of your problem:

  1. Faceted Browser (FCT) Page – click on “permalink” to get the canonical page URL
  2. SPARQL Query produced by the FCT page – click on “SPARQL” to get that, then share the URL

My suspicion is that the UI is cloaking the effects of multiple named graphs that are used to construct the default graph used to produce the solution.

/cc @imitko @hwilliams

Kingsley,

Here is the FCT query, and the SPARQL query in Virtuoso. The results there are valid. Here is the query on the live PoC. Here is the SPARQL text from Virtuoso:

 select ?s1 as ?c1, ( bif:search_excerpt ( bif:vector ( 'Person' ) , ?o1 ) ) as ?c2, ?sc, ?rank, ?g where 
  { 
    { 
      { 
        select ?s1, ( ?sc * 3e-1 ) as ?sc, ?o1, ( sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) ) as ?rank, ?g where 
        { 
          quad map virtrdf:DefaultQuadMap 
          { 
            graph ?g 
            { 
              ?s1 ?s1textp ?o1 .
              ?o1 bif:contains '"Person"' option ( score ?sc ) .
              
            }
          }
          ?s1 a <http://www.w3.org/2002/07/owl#Class> .
          filter ( ?s1 = <http://xmlns.com/foaf/0.1/Person> ) .
        }
        order by desc ( ?sc * 3e-1 + sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) ) limit 20 offset 0 
      }
    }
  }

And here is the SPARQL text received from FCT by the live PoC:

HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Wed, 27 Feb 2019 17:27:47 GMT
Content-Type: text/xml; charset=UTF-8
Connection: close
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
accept-ranges: bytes
Content-Length: 8703

 filter (?s1 = <http://xmlns.com/foaf/0.1/Person>) .<fct:facets xmlns:fct="http://openlinksw.com/services/facets/1.0/">
<fct:sparql>     select ?s1 as ?c1 count (*) as ?c2 where { select distinct ?s1 ?g  {?s1 a &lt;http://www.w3.org/2002/07/owl#Class&gt; . quad map virtrdf:DefaultQuadMap { graph ?g {  ?s1 ?s1textp ?o1 . ?o1 bif:contains  &#39;&quot;Person&quot;&#39;  . } }  } } group by ?s1 order by desc 2 limit 30  offset 0 </fct:sparql>
<fct:time>1022</fct:time>
<fct:complete>yes</fct:complete>
<fct:timeout>14260</fct:timeout>
...

Ignoring the malformed XML (which I think points to the problem, i.e. the filter is not being included in or applied to the query), the two SPARQL queries are not syntactically equal. But the criteria are the same for both: search all fields for the text 'Person', rdf:type == owl:Class, and subject IRI == foaf:Person.
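A quick way to check a captured response for both symptoms (text in front of the root element, and a filter missing from the embedded SPARQL) might look like the sketch below. The helper name `inspect_fct_response` is made up for illustration, and the sample body is trimmed from the response above with a closing `</fct:facets>` tag added so that it parses.

```python
import xml.etree.ElementTree as ET

FCT_NS = "http://openlinksw.com/services/facets/1.0/"


def inspect_fct_response(body, expected_fragment):
    """Return (text found before <fct:facets>, whether the SPARQL contains expected_fragment)."""
    start = body.find("<fct:facets")
    if start < 0:
        raise ValueError("no <fct:facets> element found")
    stray_prefix = body[:start].strip()
    # Parse only from the root element onward; the stray prefix is not XML.
    root = ET.fromstring(body[start:])
    sparql = root.findtext(f"{{{FCT_NS}}}sparql") or ""
    return stray_prefix, expected_fragment in sparql


# Trimmed sample based on the response above (closing tag added as an assumption).
sample = (
    ' filter (?s1 = <http://xmlns.com/foaf/0.1/Person>) .'
    '<fct:facets xmlns:fct="http://openlinksw.com/services/facets/1.0/">'
    '<fct:sparql>select ?s1 as ?c1 count (*) as ?c2 where { select distinct ?s1 '
    '{?s1 a &lt;http://www.w3.org/2002/07/owl#Class&gt; . } } '
    'group by ?s1 order by desc 2 limit 30 offset 0</fct:sparql>'
    '</fct:facets>'
)

prefix, has_filter = inspect_fct_response(sample, "filter")
print(repr(prefix))   # the clause that leaked in front of the XML
print(has_filter)     # whether the generated SPARQL mentions a filter at all
```

For the sample, the prefix is the stray filter clause and the SPARQL text contains no filter, which matches what I am seeing.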

Hi, could we have an HTTP recording of the request?
That way we can reproduce the bad XML.

Are you saying that this doesn’t happen when using the LOD or DBpedia instances?

/cc @hwilliams @imitko

@Sherman_Monroe: Note here are the details on creating a Virtuoso HTTP Recording … /cc @imitko @kidehen

Hi Mitko,

Here is the request/response.

-sherman

Hi Hugh,

Okay, I will generate a recording using this method also.

-sherman

Hi Kingsley,

This is correct. It only happens when FCT is accessed from clients other than the /fct web interface.

-sherman

Okay, what is the version of the FCT package installed? You more than likely need a later VAD, assuming you obtained yours from our download service.

The footer section of any FCT page reveals the version in use. Thus, if yours differs from a system that doesn’t exhibit these issues then you will need to upgrade.

/cc @hwilliams

LOD does not exhibit this issue, and its version is v1.13.71. Our server does, and its version is v1.13.70. I’ll update to the latest VAD…

I upgraded to FCT VAD version 1.17_git13, and the issue persists.

URI Burner has the same issue.

curl -X POST http://poc.vios.network/proxy/-start-http://data.vios.network-end-/fct/service --data '<query timeout="16810"><class iri="http://www.w3.org/2002/07/owl#Class"></class><text property="http://www.w3.org/2000/01/rdf-schema#label">Person</text><value datatype="uri">http://xmlns.com/foaf/0.1/Person</value><view limit="15" type="list-count" offset="0"></view></query>' -HContent-Type:text/xml -v

Does anybody know why this replies with a 400 error?

The proxy URL has changed from:

http://poc.vios.network/proxy/-start-http://data.vios.network-end-/fct/service

to:

http://poc.vios.network/proxy/http/data.vios.network/fct/service

Also, sometimes the Virtuoso instance at data.vios.network goes down and needs to be restarted. If you receive a bad response, please check if data.vios.network is up. I can restart it if it is down.
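For anyone replaying the request without hand-escaping XML in a shell, the same request body can be built programmatically. This is only a sketch: the function name `build_fct_query` and its parameters are made up for illustration, the element and attribute names are copied from the curl command above, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def build_fct_query(class_iri, label_text, value_iri, timeout=16810, limit=15):
    """Build the <query> document used in the curl command as a string."""
    query = ET.Element("query", timeout=str(timeout))
    ET.SubElement(query, "class", iri=class_iri)
    text = ET.SubElement(query, "text", property=RDFS_LABEL)
    text.text = label_text
    value = ET.SubElement(query, "value", datatype="uri")
    value.text = value_iri
    ET.SubElement(query, "view", limit=str(limit), type="list-count", offset="0")
    return ET.tostring(query, encoding="unicode")


payload = build_fct_query(
    "http://www.w3.org/2002/07/owl#Class",
    "Person",
    "http://xmlns.com/foaf/0.1/Person",
)
print(payload)
```

The resulting string can then be POSTed with Content-Type: text/xml to the /fct/service URL given above.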

Do we know why the Virtuoso server goes down?

No, I don’t know why. Virtuoso spontaneously gives the message “The site is currently down for maintenance”. Sometimes restarting the Virtuoso instance solves it, and sometimes the shutdown ends with an error and the physical server has to be restarted.