Virtuoso 22023 Error RDFXX: The HTTPS retrieval is not supported via proxy

I’m working with a virtuoso-opensource build cloned from GitHub and compiled last Friday. The software will be used in a course.

The university proxy handles any request (HTTP or HTTPS) correctly.

Any query that references DBpedia, for example, always returns:

Virtuoso 22023 Error RDFXX: The HTTPS retrieval is not supported via proxy

This is the test query:

define get:proxy "proxy.fing.edu.uy:3128"
define get:soft "soft"
SELECT *
FROM  <http://dbpedia.org/resource/Uruguay>
WHERE { ?s ?p ?o } 

Any idea how this problem can be resolved?
Thanks.
FDO
PS: Excuse my English.

(Also asked on Stack Overflow.)

Can you please try setting the following in the [HTTPServer] section of the virtuoso.ini config file:

HTTPProxyServer  = proxy.fing.edu.uy:3128

as detailed in the http_get documentation, then restart your Virtuoso instance and retry the query?

Hello. Yes, that setting is already in virtuoso.ini. Also, the Sponger works fine over HTTP; it only fails over HTTPS.

FDO.

[HTTPServer]
ServerPort                  = 8890
ServerRoot                  = /var/lib/virtuoso/vsp
MaxClientConnections        = 10
DavRoot                     = DAV
EnabledDavVSP               = 0
HTTPProxyEnabled            = 1
TempASPXDir                 = 0
DefaultMailServer           = localhost:25
ServerThreads               = 10
MaxKeepAlives               = 10
KeepAliveTimeout            = 10
MaxCachedProxyConnections   = 10
ProxyConnectionCacheTimeout = 15
HTTPThreadSize              = 280000
HttpPrintWarningsInOutput   = 0
Charset                     = UTF-8
HTTPLogFile                 = /var/log/virtuoso/http09082021.log
MaintenancePage             = atomic.html
EnabledGzipContent          = 1
HTTPProxyServer             = proxy.fing.edu.uy:3128
HTTPProxyExceptions         = localhost,127.0.0.1
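
To double-check that the proxy keys really sit in the [HTTPServer] section the server reads, a small script along these lines may help. It is only a sketch: the function name read_proxy_settings and the shortened SAMPLE_INI are made up for illustration, and it assumes virtuoso.ini is close enough to standard INI syntax for Python's configparser to parse it.

```python
import configparser

# Shortened copy of the [HTTPServer] section posted above (illustration only).
SAMPLE_INI = """
[HTTPServer]
ServerPort          = 8890
HTTPProxyEnabled    = 1
HTTPProxyServer     = proxy.fing.edu.uy:3128
HTTPProxyExceptions = localhost,127.0.0.1
"""


def read_proxy_settings(ini_text):
    """Return the proxy-related keys of [HTTPServer], or None if the section is missing."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names case-sensitive, as written in the file
    parser.read_string(ini_text)
    if not parser.has_section("HTTPServer"):
        return None
    section = parser["HTTPServer"]
    raw_exceptions = section.get("HTTPProxyExceptions", "")
    return {
        "server": section.get("HTTPProxyServer"),
        "exceptions": [h.strip() for h in raw_exceptions.split(",") if h.strip()],
    }


if __name__ == "__main__":
    print(read_proxy_settings(SAMPLE_INI))
```

For a real check, the file contents would be read from disk and passed in as ini_text. The result only shows what is written in the file, not what the running server has actually loaded.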

A patch has been committed to the open source develop/7 branch that should resolve this issue. Can you please rebuild with that commit and try again …

I’ll try that.
Thanks
FDO.

Hello again.
Sorry for the delay.

I have tested with the new commit.

The error is still there.

Some data about the server, the compilation and the config:

  • The HTTP Server config is the same as I posted previously.
  • Virtuoso was installed by building an RPM with the following configure line:
./configure --prefix=/usr --localstatedir=/var --enable-maintainer-mode --disable-bpel-vad --enable-xml --enable-openssl --enable-openldap --enable-conductor-vad --enable-fct-vad --enable-rdfmappers-vad --enable-rdb2rdf-vad --disable-dbpedia-vad --disable-demo-vad --enable-isparql-vad --enable-ods-vad --disable-sparqldemo-vad --disable-syncml-vad --disable-tutorial-vad --sysconfdir=/etc --with-readline --program-transform-name="s/isql/isql-v/"
  • The proxy address is a Squid instance configured with load balancing, so it can resolve to a different address each time.

  • The following error shows the query:

Virtuoso HTCLI Error HC001: Connection Error in HTTP Client

SPARQL query:
define sql:big-data-const 0
#output-format:text/html
define sql:signal-void-variables 1
define sql:gs-app-callback "ODS"
select *
where { service <http://dbpedia.org/sparql> {
dbpedia:Uruguay ?p ?o
}
}
define get:proxy "proxy.fing.edu.uy:3128"
  • The original query with the pragma above gets the 22023 RDFXX…

All of this makes me suspect either the OpenSSL library, which is openssl-1.0.2k-21.el7_9.x86_64, or some redirect-following capability that might be disabled.

Thanks.
FDO.

Can you please confirm the version and gitid for the new binary being used with the queries:

select sys_stat ('git_head');
select sys_stat ('st_dbms_ver');
select sys_stat ('st_build_date');

For the Virtuoso HTCLI Error HC001: Connection Error in HTTP Client error can you please enable the following params in the [Parameters] section of the virtuoso.ini file:

PLDebug = 2
CallstackOnException = 2

to obtain a more detailed stack trace of the calls leading up to the error, either in virtuoso.log or on the client side (i.e. the browser), as detailed in the documentation and config documentation.

Are you saying that when the define get:proxy "proxy.fing.edu.uy:3128" pragma is set in the query, the Virtuoso 22023 Error RDFXX: The HTTPS retrieval is not supported via proxy error still occurs?

Hello.

Here are the results of the version queries; please tell me if they are what you expect:

SQL> select sys_stat ('git_head');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

385790a

1 Rows. -- 1 msec.
SQL> select sys_stat ('st_dbms_ver');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

07.20.3233

1 Rows. -- 1 msec.
SQL> select sys_stat ('st_build_date');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

Aug 23 2021

1 Rows. -- 1 msec.

Concerning the error, yes, that is correct. With proxy pragma:

SQL> sparql define get:proxy "proxy.fing.edu.uy:3128" define get:soft "soft" SELECT * FROM  <http://dbpedia.org/resource/Uruguay> WHERE { ?s ?p ?o } ;

*** Error 22023: [Virtuoso Driver][Virtuoso Server]RDFXX: The HTTPS retrieval is not supported via proxy
at line 12 of Top-Level:
sparql define get:proxy "proxy.fing.edu.uy:3128" define get:soft "soft" SELECT * FROM  <http://dbpedia.org/resource/Uruguay> WHERE { ?s ?p ?o } 
SQL> 

The same query without proxy pragma:

SQL> sparql define get:soft "soft" SELECT * FROM  <http://dbpedia.org/resource/Uruguay> WHERE { ?s ?p ?o } ;

*** Error HTCLI: [Virtuoso Driver][Virtuoso Server]HC001: Connection Error in HTTP Client
at line 13 of Top-Level:
sparql define get:soft "soft" SELECT * FROM  <http://dbpedia.org/resource/Uruguay> WHERE { ?s ?p ?o } 
SQL> 

As additional information, I tested dbpedia.org using curl, and it only works with the ssl1 protocol.

I’ll try the new parameters and send the results as soon as I get them.

Thanks for the support!
FDO.

When you added those new params in the INI file did you restart the Virtuoso Server for the new settings to take effect ? Also, is anything extra written in the Virtuoso log file ?

Does the proxy possibly require authentication ?
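
For context on why authentication matters here: an HTTP proxy such as Squid normally carries HTTPS traffic by opening a tunnel in response to a CONNECT request, and answers 407 when it wants credentials first. The sketch below only illustrates that generic mechanism. It is not Virtuoso's implementation, the function names are made up, and the user/password arguments are hypothetical.

```python
import base64


def build_connect_request(host, port=443, user=None, password=None):
    """Build the raw CONNECT request a client sends to an HTTP proxy to tunnel HTTPS."""
    target = f"{host}:{port}"
    lines = [f"CONNECT {target} HTTP/1.1", f"Host: {target}"]
    if user is not None:
        # Basic proxy credentials are "user:password", base64-encoded.
        raw = f"{user}:{password or ''}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        lines.append(f"Proxy-Authorization: Basic {token}")
    # The header block ends with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def tunnel_established(status_line):
    """Return True if the proxy's status line is a 2xx answer, i.e. the tunnel is open."""
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[0].startswith("HTTP/")
        and parts[1].startswith("2")
    )


if __name__ == "__main__":
    print(build_connect_request("dbpedia.org").decode("ascii"))
    print(tunnel_established("HTTP/1.1 407 Proxy Authentication Required"))
```

If the proxy answered the CONNECT with a 407 status, that would point to missing credentials rather than to a TLS problem.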

Hello.
With the new parameters, I did some tests with the following queries:

  1. Failed (logs at the end):
select * where { service <http://dbpedia.org/sparql> { dbpedia:Uruguay ?p ?o } }   
  2. Works fine:
select * where { service <http://lod.openlinksw.com/sparql> { dbpedia:Uruguay ?p ?o } }   ;
  3. Failed (logs at the end):
define get:soft "soft" select * from <http://dbpedia.org> where {  dbpedia:Uruguay ?p ?o }  
  4. Works fine:
define get:soft "soft" SELECT * FROM  <http://lod.openlinksw.com/c/9BW76GGL> WHERE { ?s ?p ?o } ;

As I said in a previous message, this makes me wonder how the connection is made when the server answers with a 303. What is the meaning of n_redirects = 0 in http_client_ext?

Thanks.
FDO.

LOGS

1.

SQL> sparql select * where { service <http://dbpedia.org/sparql> { dbpedia:Uruguay ?p ?o } }   
Type the rest of statement, end with a semicolon (;)> ;

*** Error HTCLI: [Virtuoso Driver][Virtuoso Server]HC001: Connection Error in HTTP Client
in
http_get:(BIF),
        __01 => 'http://dbpedia.org/sparql?query=%20SELECT%20%3Fo%20%3Fp%0A%20WHERE%20%7B%20%20%3Chttp%3A%2F%2Fdbpedi' (truncated),
        __02 => 0,
        __03 => 'GET',
        __04 => 'Accept: application/sparql-results+xml, text/rdf+n3, text/rdf+ttl, text/rdf+turtle, text/turtle, app' (truncated),
        __05 => '',
        __06 => NULL,
        __07 => 15,
DB.DBA.SPARQL_REXEC_INT([executable]/sparql_io.sql:300),
    res_mode => 1,
  res_make_obj => 1,
     service => 'http://dbpedia.org/sparql',
       query => ' SELECT ?o ?p
 WHERE {  <http://dbpedia.org/resource/Uruguay> ?p ?o . }',
  dflt_graph => NULL,
  named_graphs => NULL,
     req_hdr => 'Accept: application/sparql-results+xml, text/rdf+n3, text/rdf+ttl, text/rdf+turtle, text/turtle, app' (truncated),
     maxrows => 10000000,
       metas => 0,
  bnode_dict => NULL,
  expected_var_list => (ARRAY_OF_POINTER value, tag 193),
     options => NULL,
DB.DBA.SPARQL_REXEC_TO_ARRAY_OF_OBJ([executable]/sparql_io.sql:578),
     service => 'http://dbpedia.org/sparql',
       query => ' SELECT ?o ?p
 WHERE {  <http://dbpedia.org/resource/Uruguay> ?p ?o . }',
  dflt_graph => NULL,
  named_graphs => NULL,
     req_hdr => 'Accept: application/sparql-results+xml, text/rdf+n3, text/rdf+ttl, text/rdf+turtle, text/turtle, app' (truncated),
     maxrows => 10000000,
  bnode_dict => NULL,
  expected_var_list => (ARRAY_OF_POINTER value, tag 193),
     options => NULL,
DB.DBA.SPARQL_SINV_IMP([executable]/sparql_io.sql:936),
  ws_endpoint => 'http://dbpedia.org/sparql',
   ws_params => (ARRAY_OF_POINTER value, tag 193),
  qtext_template => ' SELECT ?o ?p
 WHERE {  <http://dbpedia.org/resource/Uruguay> ?p ?o . }',
  qtext_posmap => (NVARCHAR value, tag 225),
   param_row => (ARRAY_OF_POINTER value, tag 193),
  expected_vars => (ARRAY_OF_POINTER value, tag 193)
in lines 4-5 of Top-Level:
#line 4 "(console)"
sparql select * where { service <http://dbpedia.org/sparql> { dbpedia:Uruguay ?p ?o } }

3.

SQL> sparql define get:soft "soft" select * from <http://dbpedia.org> where {  dbpedia:Uruguay ?p ?o }   ;

*** Error HTCLI: [Virtuoso Driver][Virtuoso Server]HC001: Connection Error in HTTP Client
in
http_client_internal:(BIF),
        __01 => 'https://www.dbpedia.org/',
        __02 => NULL,
        __03 => NULL,
        __04 => 'GET',
        __05 => 'User-Agent: OpenLink Virtuoso RDF crawler
Accept: application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, a' (truncated),
        __06 => NULL,
        __07 => '1',
        __08 => NULL,
        __09 => (ARRAY_OF_POINTER value, tag 193),
        __10 => NULL,
        __11 => NULL,
        __12 => NULL,
        __13 => 0,
        __14 => 0,
DB.DBA.HTTP_CLIENT_EXT([executable]/system.sql:3223),
         url => 'https://www.dbpedia.org/',
         uid => NULL,
         pwd => NULL,
  http_method => 'GET',
  http_headers => 'User-Agent: OpenLink Virtuoso RDF crawler
Accept: application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, a' (truncated),
        body => NULL,
   cert_file => '1',
    cert_pwd => NULL,
     headers => (ARRAY_OF_POINTER value, tag 193),
     timeout => NULL,
       proxy => NULL,
    ca_certs => NULL,
    insecure => 0,
  n_redirects => 0,
DB.DBA.RDF_HTTP_URL_GET([executable]/rdf_sponge.sql:1400),
         url => 'https://www.dbpedia.org/',
        base => '',
         hdr => (ARRAY_OF_POINTER value, tag 193),
        meth => 'GET',
     req_hdr => 'User-Agent: OpenLink Virtuoso RDF crawler
Accept: application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, a' (truncated),
         cnt => NULL,
       proxy => NULL,
         sig => 0,
signal:(BIF),
        __01 => 'HTCLI',
        __02 => 'HC001: Connection Error in HTTP Client
in
http_client_internal:(BIF),
        __01 => 'https://www.d' (truncated),
DB.DBA.SYS_HTTP_SPONGE_UP([executable]/rdf_sponge.sql:993),
   local_iri => 'http://dbpedia.org',
     get_uri => 'http://dbpedia.org',
      parser => 'DB.DBA.RDF_LOAD_HTTP_RESPONSE',
      eraser => 'DB.DBA.RDF_FORGET_HTTP_RESPONSE',
     options => (ARRAY_OF_POINTER value, tag 193),
DB.DBA.RDF_SPONGE_UP_1([executable]/rdf_sponge.sql:2195),
   graph_iri => 'http://dbpedia.org',
     options => (ARRAY_OF_POINTER value, tag 193),
         uid => 0,
DB.DBA.RDF_SPONGE_UP([executable]/rdf_sponge.sql:2055),
   graph_iri => 'http://dbpedia.org',
     options => (ARRAY_OF_POINTER value, tag 193),
         uid => 0
at line 17 of Top-Level:
sparql define get:soft "soft" select * from <http://dbpedia.org> where {  dbpedia:Uruguay ?p ?o }   
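
In the trace above, get_uri is http://dbpedia.org while the URL handed to HTTP_CLIENT_EXT is https://www.dbpedia.org/, which suggests the plain-HTTP request was redirected to HTTPS before the failure. A tiny helper like the following (a sketch; describe_redirect is a made-up name) makes that kind of comparison explicit when reading such traces.

```python
from urllib.parse import urlsplit


def describe_redirect(requested, fetched):
    """Compare the URL that was asked for with the URL that was finally fetched."""
    before, after = urlsplit(requested), urlsplit(fetched)
    return {
        "scheme_changed": before.scheme != after.scheme,
        "host_changed": before.hostname != after.hostname,
        "now_https": after.scheme == "https",
    }


if __name__ == "__main__":
    # Values taken from the trace: get_uri versus the url argument of HTTP_CLIENT_EXT.
    print(describe_redirect("http://dbpedia.org", "https://www.dbpedia.org/"))
```

When now_https is true, the fetch needs the HTTPS-through-proxy path even though the graph IRI in the query was plain HTTP.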

Is any extra error output returned when running the query with the Sponger proxy pragma, i.e.:

sparql define get:proxy "proxy.fing.edu.uy:3128" define get:soft "soft" SELECT * FROM  <http://dbpedia.org/resource/Uruguay> WHERE { ?s ?p ?o } ;
SQL> sparql define get:proxy "proxy.fing.edu.uy:3128" define get:soft "soft" select * from <http://dbpedia.org> where {  dbpedia:Uruguay ?p ?o };

*** Error 22023: [Virtuoso Driver][Virtuoso Server]RDFXX: The HTTPS retrieval is not supported via proxy
in
signal:(BIF),
        __01 => '22023',
        __02 => 'The HTTPS retrieval is not supported via proxy',
        __03 => 'RDFXX',
DB.DBA.RDF_HTTP_URL_GET([executable]/rdf_sponge.sql:1394),
         url => 'https://www.dbpedia.org/',
        base => '',
         hdr => (ARRAY_OF_POINTER value, tag 193),
        meth => 'GET',
     req_hdr => 'User-Agent: OpenLink Virtuoso RDF crawler
Accept: application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, a' (truncated),
         cnt => NULL,
       proxy => 'proxy.fing.edu.uy:3128',
         sig => 0,
signal:(BIF),
        __01 => '22023',
        __02 => 'RDFXX: The HTTPS retrieval is not supported via proxy
in
signal:(BIF),
        __01 => '22023',
    ' (truncated),
DB.DBA.SYS_HTTP_SPONGE_UP([executable]/rdf_sponge.sql:993),
   local_iri => 'http://dbpedia.org',
     get_uri => 'http://dbpedia.org',
      parser => 'DB.DBA.RDF_LOAD_HTTP_RESPONSE',
      eraser => 'DB.DBA.RDF_FORGET_HTTP_RESPONSE',
     options => (ARRAY_OF_POINTER value, tag 193),
DB.DBA.RDF_SPONGE_UP_1([executable]/rdf_sponge.sql:2195),
   graph_iri => 'http://dbpedia.org',
     options => (ARRAY_OF_POINTER value, tag 193),
         uid => 0,
DB.DBA.RDF_SPONGE_UP([executable]/rdf_sponge.sql:2055),
   graph_iri => 'http://dbpedia.org',
     options => (ARRAY_OF_POINTER value, tag 193),
         uid => 0
at line 2 of Top-Level:
sparql define get:proxy "proxy.fing.edu.uy:3128" define get:soft "soft" select * from <http://dbpedia.org> where {  dbpedia:Uruguay ?p ?o }

FDO

@fcarpani /CC @hwilliams

I fixed the proxy handling in the Virtuoso Open Source Edition that caused the “The HTTPS retrieval is not supported…” error in this commit on GitHub.

Please pull the latest fixes from the develop/7 branch and rebuild your binary.

Let us know if this resolves your issue.

Hello again !
I have the new version.
The configuration seems to be ok.
Now, with the get:proxy pragma set, the query returns HC001 (Connection Error in HTTP Client).

However, I compiled the same version with the same spec file on my personal notebook (Fedora).
My notebook does not need the proxy to connect to the internet.

The following query works fine without a proxy but fails with HC001 behind the proxy on the server.

define get:soft "soft"
select *
from <https://dbpedia.org/resource/Uruguay>
where { ?s ?p ?o }
limit 100

The same behaviour occurs with a similar query that uses SERVICE instead of sponging.

FDO.

====== RUNNING CONFIG =====

====== VERSION TEST =======

SQL> select sys_stat ('git_head');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

385790a

1 Rows. -- 0 msec.
SQL> select sys_stat ('st_dbms_ver');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

07.20.3233

1 Rows. -- 0 msec.
SQL> select sys_stat ('st_build_date');
sys_stat
LONG VARCHAR
_______________________________________________________________________________

Aug 30 2021

1 Rows. -- 1 msec.
SQL> 

I have reconfigured the network on my development machine and have finally been able to recreate this second error. I will let you know as soon as the patch is ready on GitHub.