Hi guys!
It looks like the turtle output from the LDP server isn’t (always) correct.
curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/gofair/ | less
You’ll see that there are a few @prefix lines at the beginning, then a line of data, then another @prefix line, followed by the rest of the data. This crashes my Turtle parser. This just started happening this morning - everything was working fine on Friday.
Any ideas?
Mark
Hi guys! Ummm… I’m a bit desperate on this issue. I have to give a workshop using LDP in a few days, and right now, I cannot even parse the messages coming back from the server.
Any advice appreciated… URGENTLY!
thanks!
Mark
Ummmmm… hello? Anybody home?
I wiped the folder and started from scratch. I’ll let you know if I see this again.
feel free to close this for now.
M
Hi guys,
So yes, I can confirm now that this is a real problem. It happens once the LDP container reaches ~800-1000 records, and it happens consistently (I have created fresh “folders” 3 times, and each time the bug has appeared after I put about a thousand records into that container). It is a problem with the prefix headers: if I take only the first 50 lines of Turtle and pass them to my parser (raptor, via Ruby), it segfaults.
Any advice is greatly appreciated.
(By the way, if I request application/ld+json instead of text/turtle, it crashes Virtuoso entirely and I have to restart.)
@markwilkinson: Is the query in your first post, i.e.
curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/gofair/
supposed to exhibit the problem currently? It does not appear to, i.e. I don't see any interleaved @prefix headers.
Are you able to provide steps such that the problem can be recreated locally?
They come out-of-order when I make that call…??
$ curl -L -H 'Accept: text/turtle' http://training.fairdata.solutions/DAV/home/LDP/gofair/ > t.ttl
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 82520 100 82520 0 0 6078 0 0:00:13 0:00:13 --:--:-- 18922
markw@markw ~/Documents/CODE $ head -20 t.ttl
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
rdfs:Resource rdf:type rdfs:Class .
@prefix ns2: <http://training.fairdata.solutions/DAV/home/LDP/gofair/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
ns2:obs_2147365908 rdf:type ldp:Resource .
@prefix ns4: <http://semanticscience.org/resource/> .
ns2:obs_2147365908 rdf:type ns4:measuring ,
rdfs:Resource .
ns2:species_290307346 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290308565 rdf:type ns4:pathogen ,
rdfs:Resource ,
ldp:Resource .
ns2:species_290307396 rdf:type ns4:pathogen ,
rdfs:Resource ,
ldp:Resource .
ns2:species_290310811 rdf:type ns4:pathogen ,
rdfs:Resource ,
ldp:Resource .
ns2:species_290307202 rdf:type ns4:pathogen ,
rdfs:Resource ,
ldp:Resource .
ns2:species_290310128 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290309376 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290310064 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290307373 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290309635 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290309803 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290307801 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290307721 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
ns2:species_290307422 rdf:type ldp:Resource ,
ns4:pathogen ,
rdfs:Resource .
@prefix ns5: <http://purl.org/dc/dcmitype/> .
ns2: rdf:type ns5:Dataset ,
ldp:Container ,
ldp:BasicContainer .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@markwilkinson: Can you please confirm the version of the Virtuoso binary (virtuoso-t -?) and VAD applications (vad_list_packages()) you have installed, such that I can attempt to set up a test case locally?
Also I presume the files in your http://training.fairdata.solutions/DAV/home/LDP/gofair/ folder are publicly available for download? What method do you use for loading them into the Virtuoso WebDAV folder?
Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.2.6-rc1.3230-pthreads as of Nov 2 2018 (4d226f4)
Compiled for Linux (x86_64-generic_glibc25-linux-gnu)
Copyright (C) 1998-2018 OpenLink Software
(using the latest openlink Docker image)
Connected to OpenLink Virtuoso
Driver: 07.20.3230 OpenLink Virtuoso ODBC Driver
name title version build_date install_date
VARCHAR VARCHAR VARCHAR VARCHAR VARCHAR
_______________________________________________________________________________
Briefcase ODS Briefcase 1.21.68 2018-08-16 12:08 2018-12-11 02:11
Framework ODS Framework 1.89.47 2018-08-16 12:06 2018-12-11 02:10
conductor Virtuoso Conductor 1.00.8785 2018-11-02 11:55 2018-12-11 02:09
Yes, the data is publicly available for download.
I use an LDP POST of an RDF resource, with a Slug header for the desired filename, and Accept: text/turtle, Content-type: text/turtle, to the /DAV/home/LDP/gofair/ ldp:Container.
For the first few hundred there is no problem (or at least, I have never seen a problem). It’s only when I get into the higher numbers - using exactly the same script, just breaking out of it at different times.
Hope that helps! Cheers!
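That POST can be sketched with Ruby's standard Net::HTTP; the container URL, credentials, and slug below are placeholders, not values from this thread:

```ruby
require 'net/http'
require 'uri'

# Build the LDP POST described above: a Turtle body, a Slug header for
# the desired filename, and Turtle content negotiation headers.
# URL, credentials, and slug are placeholders - substitute your own.
uri = URI('http://localhost:8890/DAV/home/LDP/gofair/')
req = Net::HTTP::Post.new(uri)
req.basic_auth('user', 'password')
req['Slug']         = 'obs_example'   # desired filename in the container
req['Content-Type'] = 'text/turtle'
req['Accept']       = 'text/turtle'
req.body = "<#{uri}obs_example> a <http://www.w3.org/ns/ldp#Resource> ."

# To actually send it:
# res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
```

Sending the request should return 201 Created on success, matching the HTML responses shown later in this thread.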
@markwilkinson: Are you able to provide the script you run and a set of files to upload to a test LDP container, so the problem can be recreated locally?
not easily, unfortunately. It uses libraries that are not publicly available.
Re-confirming: same script, same dataset, same order of record loading (HTTP POST to an LDP container). It “breaks” at arbitrary times (today after just ~500 records!). Loading just 100 records was fine.
Very odd! (and a real blocker for me, unfortunately…)
OK, I now have some (Ruby) code that reproduces this problem that you can run.
If you do just 10-20 iterations of this script, the PREFIX headers representing the Turtle of the Container seem to be fine. If you bump it up to 1000, they invariably come out of order (curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/ | less).
require 'rdf'
require 'rdf/turtle'

def triplify(s, p, o, repo)
  s = s.strip if s.class == String
  p = p.strip if p.class == String
  o = o.strip if o.class == String

  unless s.respond_to?('uri')
    if s.to_s =~ /^\w+:\/?\/?[^\s]+/
      s = RDF::URI.new(s.to_s)
    else
      abort "Subject #{s.to_s} must be a URI-compatible thingy"
    end
  end

  unless p.respond_to?('uri')
    if p.to_s =~ /^\w+:\/?\/?[^\s]+/
      p = RDF::URI.new(p.to_s)
    else
      abort "Predicate #{p.to_s} must be a URI-compatible thingy"
    end
  end

  unless o.respond_to?('uri')
    if o.to_s =~ /^\w+:\/?\/?[^\s]+/
      o = RDF::URI.new(o.to_s)
    elsif o.to_s =~ /^\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d/
      o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.date)
    elsif o.to_s =~ /^[+-]?\d+\.\d+/
      o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.float)
    elsif o.to_s =~ /^[+-]?[0-9]+$/
      o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.int)
    else
      o = RDF::Literal.new(o.to_s, :language => :en)
    end
  end

  triple = RDF::Statement(s, p, o)
  repo.insert(triple)
  return true
end

rdf = RDF::Vocabulary.new("http://www.w3.org/1999/02/22-rdf-syntax-ns#")
sio = RDF::Vocabulary.new("http://semanticscience.org/resource/")
my  = RDF::Vocabulary.new("http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/")

(1..1000).each do |count|          # this is the iterator
  observation = "obs_#{count}"     # obs_123
  g = RDF::Graph.new
  triplify(my["#{observation}#observation"], rdf.type, sio.measuring, g)
  RDF::Turtle::Writer.open("./test.ttl") do |writer|
    writer << g                    # write the little piece of turtle to ./test.ttl
  end
  system "curl -u gofair:gofair -L -H 'Accept: text/turtle' -H 'Content-type: text/turtle' -H 'Slug: #{observation}' --data-binary @test.ttl http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/"
  puts $?                          # check for error messages
end
So, when you run that, do the curl to retrieve the container Turtle and you'll see the prefix headers coming out-of-order.
Hope there’s a quick fix for this… it really is killing me! LOL!
@markwilkinson: I have a base ruby installation, which I have used previously for customer test cases, but how exactly should your ruby script be run?
If I save a copy of it and try to run as is, the following error occurs:
De-iMac-2396:~ hwilliams$ ruby mark.rb
/System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- rdf (LoadError)
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from mark.rb:1:in `<main>'
De-iMac-2396:~ hwilliams$
Do Ruby modules/packages/gems for require 'rdf' and require 'rdf/turtle' need to be installed on my local Ruby installation?
yes
gem install rdf rdf-turtle
I installed the gems and now the ruby program seems to run:
De-iMac-2396:~ hwilliams$ sudo gem install rdf rdf-turtle
Password:
Fetching: link_header-0.0.8.gem (100%)
Successfully installed link_header-0.0.8
Fetching: concurrent-ruby-1.1.5.gem (100%)
Successfully installed concurrent-ruby-1.1.5
Fetching: hamster-3.0.0.gem (100%)
Successfully installed hamster-3.0.0
Fetching: rdf-3.0.12.gem (100%)
Successfully installed rdf-3.0.12
Parsing documentation for link_header-0.0.8
Installing ri documentation for link_header-0.0.8
Parsing documentation for concurrent-ruby-1.1.5
Installing ri documentation for concurrent-ruby-1.1.5
Parsing documentation for hamster-3.0.0
Installing ri documentation for hamster-3.0.0
Parsing documentation for rdf-3.0.12
Installing ri documentation for rdf-3.0.12
Done installing documentation for link_header, concurrent-ruby, hamster, rdf after 10 seconds
Fetching: sxp-1.0.2.gem (100%)
Successfully installed sxp-1.0.2
Fetching: ebnf-1.1.3.gem (100%)
Successfully installed ebnf-1.1.3
Fetching: rdf-turtle-3.0.6.gem (100%)
Successfully installed rdf-turtle-3.0.6
Parsing documentation for sxp-1.0.2
Installing ri documentation for sxp-1.0.2
Parsing documentation for ebnf-1.1.3
Installing ri documentation for ebnf-1.1.3
Parsing documentation for rdf-turtle-3.0.6
Installing ri documentation for rdf-turtle-3.0.6
Done installing documentation for sxp, ebnf, rdf-turtle after 1 seconds
7 gems installed
De-iMac-2396:~ hwilliams$ ruby mark.rb
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_1 has been created.</BODY></HTML>pid 63255 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_2 has been created.</BODY></HTML>pid 63256 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_3 has been created.</BODY></HTML>pid 63257 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_4 has been created.</BODY></HTML>pid 63258 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_5 has been created.</BODY></HTML>pid 63259 exit 0
.
.
.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_998 has been created.</BODY></HTML>pid 64253 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_999 has been created.</BODY></HTML>pid 64254 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_1000 has been created.</BODY></HTML>pid 64255 exit 0
De-iMac-2396:~ hwilliams$
Then running the curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/ | less command, I see:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ns1: <http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ns1:obs_1 rdf:type rdfs:Resource .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
ns1:obs_1 rdf:type ldp:Resource .
@prefix ns4: <http://www.w3.org/ns/posix/stat#> .
ns1:obs_1 ns4:mtime 1559923421 ;
ns4:size 135 .
ns1:obs_10 rdf:type rdfs:Resource ,
ldp:Resource ;
ns4:mtime 1559923421 ;
ns4:size 136 .
.
.
.
and I presume the
ns1:obs_1 rdf:type rdfs:Resource .
and
ns1:obs_1 rdf:type ldp:Resource .
triples in-between the prefix values are the problem?
Attached is the complete output of the curl command …
curl.zip (6.7 KB)
Correct, that is the problem
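For reference, the symptom is easy to flag mechanically. Here is a small plain-Ruby detector (no gems; the sample lines are made up, modeled on the dump above) that reports any @prefix directive appearing after a data line:

```ruby
# Quick check: report the line numbers of any @prefix directive that
# appears after the first data line in a Turtle dump - the interleaving
# symptom described in this thread.
def interleaved_prefixes(lines)
  seen_triple = false
  flagged = []
  lines.each_with_index do |line, i|
    if line.strip.start_with?('@prefix')
      flagged << i + 1 if seen_triple   # a prefix after data: flag it
    elsif !line.strip.empty?
      seen_triple = true                # first non-prefix, non-blank line
    end
  end
  flagged
end

sample = [
  '@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .',
  'rdfs:Resource rdf:type rdfs:Class .',
  '@prefix ldp: <http://www.w3.org/ns/ldp#> .',
]
puts interleaved_prefixes(sample).inspect  # => [3]
```

Running it over File.readlines of a container dump gives a quick yes/no on whether that dump exhibits the problem.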
@markwilkinson: I amended your Ruby application, replacing your LDP container folder references, i.e. http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/, with my local LDP container folder, i.e. http://localhost:8890/DAV/ldp/ … but when I run the program it repeatedly gives the error:
pid 89544 exit 0
pid 89545 exit 0
pid 89546 exit 0
and I am not sure why. Are there any other changes required to run against another instance?
Nothing should need to be changed. I made that code as generic as possible so that you could recreate it on any server.
You should check the system curl command at the command line… My best guess is that you're using my server's username/password, but it could be other things - you'll see them if you run the exact same curl command from a prompt:
curl -u yourusername:yourpassword -L -H 'Accept: text/turtle' -H 'Content-type: text/turtle' -H 'Slug: put_a_name_here' --data-binary @test.ttl 'http://localhost:8890/DAV/ldp/'