@prefix headers coming out-of-order

Hi guys!

It looks like the turtle output from the LDP server isn’t (always) correct.

curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/gofair/ | less

You’ll see that there are a few @prefix lines at the beginning, then a line of data, then another @prefix line, followed by the rest of the data. This crashes my turtle parser. This just started happening this morning - everything was working fine on Friday.
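
For anyone who wants to confirm the symptom without running a Turtle parser, here is a minimal sketch that scans the response line by line and reports every @prefix line that follows a data line. It assumes the line-oriented layout seen in the server's output (one directive per line); `late_prefixes` is a made-up helper name, not part of any library.

```ruby
# Report @prefix directives that appear after the first non-directive line.
# Assumption: the input is line-oriented Turtle (one directive per line);
# this is a plain text scan, not a Turtle parser.
def late_prefixes(turtle)
  seen_data = false
  late = []
  turtle.each_line.with_index(1) do |line, lineno|
    text = line.strip
    next if text.empty? || text.start_with?('#')
    if text.start_with?('@prefix')
      late << [lineno, text] if seen_data
    else
      seen_data = true
    end
  end
  late
end

# Tiny hand-written sample in the same shape as the server output.
sample = <<~TTL
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  rdfs:Resource rdf:type rdfs:Class .
  @prefix ldp: <http://www.w3.org/ns/ldp#> .
TTL

late_prefixes(sample).each { |lineno, text| puts "line #{lineno}: #{text}" }
```

An empty result means every @prefix line came before the first triple.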

Any ideas?

Mark

Hi guys! Ummm… I’m a bit desperate on this issue. I have to give a workshop using LDP in a few days, and right now, I cannot even parse the messages coming back from the server.

Any advice appreciated… URGENTLY!

thanks!
Mark

Ummmmm… hello? Anybody home?

I wiped the folder and started from scratch. I’ll let you know if I see this again.

feel free to close this for now.

M

Hi guys,

So yes, I can confirm now that this is a real problem. It happens once the LDP container reaches ~800-1000 records. It happens consistently (I have created fresh “folders” 3 times, and each time this bug has appeared after I put about a thousand records into that container). It is a problem with the headers - if I take only the first 50 lines of turtle and pass it to my parser (Ruby raptor), it segfaults.
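
As a possible stopgap while the server side is investigated, the response could be rewritten so that all @prefix lines come first before it is handed to a parser that chokes on the interleaving. The sketch below is only that: it assumes one @prefix directive per line and refuses to proceed if a prefix label is bound to two different IRIs, since moving such lines would change the meaning of the data. `hoist_prefixes` is a hypothetical helper name.

```ruby
# Move every @prefix line to the top of the document, keeping all other
# lines in their original order.
# Assumption: each @prefix directive sits on its own line.
def hoist_prefixes(turtle)
  bindings = {}   # prefix label => IRI, in first-seen order
  body = []
  turtle.each_line do |line|
    if (m = line.match(/\A\s*@prefix\s+([^\s:]*:)\s*<([^>]*)>\s*\.\s*\z/))
      label = m[1]
      iri = m[2]
      if bindings.key?(label) && bindings[label] != iri
        # Hoisting a rebound prefix would silently change the triples.
        raise ArgumentError, "prefix #{label} is rebound; cannot hoist safely"
      end
      bindings[label] = iri
    else
      body << line
    end
  end
  header = bindings.map { |label, iri| "@prefix #{label} <#{iri}> .\n" }
  (header + body).join
end
```

Usage would be something like `File.write('fixed.ttl', hoist_prefixes(File.read('container.ttl')))`, where both filenames are placeholders.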

Any advice is greatly appreciated.

(By the way, if I request application/ld+json instead of text/turtle, it crashes Virtuoso entirely and I have to restart.)

@markwilkinson: Is the query in your first post, i.e.

curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/gofair/

supposed to exhibit the problem currently? It does not appear to, i.e. I don't see any @prefix headers.

Are you able to provide steps so that the problem can be recreated locally?

They come out-of-order when I make that call…??

$ curl -L -H 'Accept: text/turtle' http://training.fairdata.solutions/DAV/home/LDP/gofair/ > t.ttl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 82520  100 82520    0     0   6078      0  0:00:13  0:00:13 --:--:-- 18922
markw@markw ~/Documents/CODE $ head -20 t.ttl 
@prefix rdf:	<http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:	<http://www.w3.org/2000/01/rdf-schema#> .
rdfs:Resource	rdf:type	rdfs:Class .
@prefix ns2:	<http://training.fairdata.solutions/DAV/home/LDP/gofair/> .
@prefix ldp:	<http://www.w3.org/ns/ldp#> .
ns2:obs_2147365908	rdf:type	ldp:Resource .
@prefix ns4:	<http://semanticscience.org/resource/> .
ns2:obs_2147365908	rdf:type	ns4:measuring ,
		rdfs:Resource .
ns2:species_290307346	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290308565	rdf:type	ns4:pathogen ,
		rdfs:Resource ,
		ldp:Resource .
ns2:species_290307396	rdf:type	ns4:pathogen ,
		rdfs:Resource ,
		ldp:Resource .
ns2:species_290310811	rdf:type	ns4:pathogen ,
		rdfs:Resource ,
		ldp:Resource .
ns2:species_290307202	rdf:type	ns4:pathogen ,
		rdfs:Resource ,
		ldp:Resource .
ns2:species_290310128	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290309376	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290310064	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290307373	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290309635	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290309803	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290307801	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290307721	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
ns2:species_290307422	rdf:type	ldp:Resource ,
		ns4:pathogen ,
		rdfs:Resource .
@prefix ns5:	<http://purl.org/dc/dcmitype/> .
ns2:	rdf:type	ns5:Dataset ,
		ldp:Container ,
		ldp:BasicContainer .
@prefix dc:	<http://purl.org/dc/elements/1.1/> .

@markwilkinson: Can you please confirm the version of the Virtuoso binary (virtuoso-t -?) and VAD applications (vad_list_packages()) you have installed, so that I can attempt to set up a test case locally?

Also I presume the files in your http://training.fairdata.solutions/DAV/home/LDP/gofair/ folder are publicly available for download? What method do you use for loading them into the Virtuoso WebDAV folder?

Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.2.6-rc1.3230-pthreads as of Nov  2 2018 (4d226f4)
Compiled for Linux (x86_64-generic_glibc25-linux-gnu)
Copyright (C) 1998-2018 OpenLink Software

(using the latest OpenLink Docker image)

Connected to OpenLink Virtuoso
Driver: 07.20.3230 OpenLink Virtuoso ODBC Driver
name       title               version    build_date        install_date
VARCHAR    VARCHAR             VARCHAR    VARCHAR           VARCHAR
_______________________________________________________________________________

Briefcase  ODS Briefcase       1.21.68    2018-08-16 12:08  2018-12-11 02:11
Framework  ODS Framework       1.89.47    2018-08-16 12:06  2018-12-11 02:10
conductor  Virtuoso Conductor  1.00.8785  2018-11-02 11:55  2018-12-11 02:09

Yes, the data is publicly available for download.

I use an LDP POST of an RDF resource to the /DAV/home/LDP/gofair/ ldp:Container, with a Slug header for the desired filename and the headers Accept: text/turtle and Content-type: text/turtle.
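
To make the shape of that request concrete, here is a minimal sketch using Ruby's standard Net::HTTP. It only builds the request; the sending step is left as a comment. The container URL, the slug and the one-triple body are placeholders chosen for illustration, and `build_ldp_post` is a hypothetical helper name.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) an LDP POST carrying a Turtle body.
# The Slug header is a hint to the server for the new resource's name.
def build_ldp_post(container_url, slug, turtle, user: nil, password: nil)
  uri = URI.parse(container_url)
  request = Net::HTTP::Post.new(uri)
  request['Accept'] = 'text/turtle'
  request['Content-Type'] = 'text/turtle'
  request['Slug'] = slug
  request.basic_auth(user, password) if user
  request.body = turtle
  [uri, request]
end

uri, request = build_ldp_post(
  'http://example.org/DAV/home/LDP/gofair/',   # placeholder container URL
  'obs_1',                                     # placeholder slug
  "<http://example.org/obs_1> a <http://semanticscience.org/resource/measuring> .\n"
)

# To actually send it (not executed here):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.code
```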

For the first few hundred there is no problem (or at least, I have never seen a problem). It only appears when I get into the higher numbers, using exactly the same script and just breaking out of it at different times.

Hope that helps! Cheers!

@markwilkinson: Are you able to provide the script you run and a set of files to upload to a test LDP container to recreate the problem locally ?

Not easily, unfortunately. It uses libraries that are not publicly available.

Re-confirming: Same script, same dataset, same order of record loading (http POST to an LDP container). It “breaks” at arbitrary times (today after just ~500 records!). Loading just 100 records was fine.

Very odd! (and a real blocker for me, unfortunately…)

OK, I now have some (Ruby) code that reproduces this problem that you can run.

If you do just 10-20 iterations of this script, the PREFIX headers representing the Turtle of the Container seem to be fine. If you bump it up to 1000, they invariably come out of order (curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/ | less)

    require 'rdf'
    require 'rdf/turtle'

    # Coerce s, p and o into RDF terms, then insert the triple into repo.
    def triplify(s, p, o, repo)
      if s.class == String
        s = s.strip
      end
      if p.class == String
        p = p.strip
      end
      if o.class == String
        o = o.strip
      end

      unless s.respond_to?('uri')
        if s.to_s =~ /^\w+:\/?\/?[^\s]+/
          s = RDF::URI.new(s.to_s)
        else
          abort "Subject #{s.to_s} must be a URI-compatible thingy"
        end
      end
      unless p.respond_to?('uri')
        if p.to_s =~ /^\w+:\/?\/?[^\s]+/
          p = RDF::URI.new(p.to_s)
        else
          abort "Predicate #{p.to_s} must be a URI-compatible thingy"
        end
      end
      unless o.respond_to?('uri')
        if o.to_s =~ /^\w+:\/?\/?[^\s]+/
          o = RDF::URI.new(o.to_s)
        elsif o.to_s =~ /^\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d/
          # the pattern includes a time part, so xsd:dateTime rather than xsd:date
          o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.dateTime)
        elsif o.to_s =~ /^[+-]?\d+\.\d+/
          o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.float)
        elsif o.to_s =~ /^[+-]?[0-9]+$/
          o = RDF::Literal.new(o.to_s, :datatype => RDF::XSD.int)
        else
          o = RDF::Literal.new(o.to_s, :language => :en)
        end
      end

      triple = RDF::Statement(s, p, o)
      repo.insert(triple)
      return true
    end


    rdf = RDF::Vocabulary.new("http://www.w3.org/1999/02/22-rdf-syntax-ns#")
    sio = RDF::Vocabulary.new("http://semanticscience.org/resource/")
    my  = RDF::Vocabulary.new("http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/")

    (1..1000).each do |count|   # this is the iterator
      observation = "obs_#{count}"    # obs_123

      g = RDF::Graph.new()

      triplify(my["#{observation}#observation"], rdf.type, sio.measuring, g)

      RDF::Turtle::Writer.open("./test.ttl") do |writer|
        writer << g  # write the little piece of turtle to ./test.ttl
      end

      system "curl -u gofair:gofair -L -H 'Accept: text/turtle' -H 'Content-type: text/turtle' -H 'Slug: #{observation}' --data-binary @test.ttl http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/"
      puts $?   # print curl's exit status

    end

So, when you run that, do the curl to retrieve the container turtle and you’ll see the prefix headers coming out-of-order.

Hope there’s a quick fix for this… it really is killing me! LOL!

@markwilkinson: I have a base ruby installation, which I have used previously for customer test cases, but how exactly should your ruby script be run?

If I save a copy of it and try to run as is, the following error occurs:

De-iMac-2396:~ hwilliams$ ruby mark.rb
/System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- rdf (LoadError)
	from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
	from mark.rb:1:in `<main>'
De-iMac-2396:~ hwilliams$ 

Do ruby modules/packages/gems for require 'rdf' and require 'rdf/turtle' need to be installed on my local ruby installation?

yes

gem install rdf rdf-turtle

I installed the gems and now the ruby program seems to run:

De-iMac-2396:~ hwilliams$ sudo gem install rdf rdf-turtle
Password:
Fetching: link_header-0.0.8.gem (100%)
Successfully installed link_header-0.0.8
Fetching: concurrent-ruby-1.1.5.gem (100%)
Successfully installed concurrent-ruby-1.1.5
Fetching: hamster-3.0.0.gem (100%)
Successfully installed hamster-3.0.0
Fetching: rdf-3.0.12.gem (100%)
Successfully installed rdf-3.0.12
Parsing documentation for link_header-0.0.8
Installing ri documentation for link_header-0.0.8
Parsing documentation for concurrent-ruby-1.1.5
Installing ri documentation for concurrent-ruby-1.1.5
Parsing documentation for hamster-3.0.0
Installing ri documentation for hamster-3.0.0
Parsing documentation for rdf-3.0.12
Installing ri documentation for rdf-3.0.12
Done installing documentation for link_header, concurrent-ruby, hamster, rdf after 10 seconds
Fetching: sxp-1.0.2.gem (100%)
Successfully installed sxp-1.0.2
Fetching: ebnf-1.1.3.gem (100%)
Successfully installed ebnf-1.1.3
Fetching: rdf-turtle-3.0.6.gem (100%)
Successfully installed rdf-turtle-3.0.6
Parsing documentation for sxp-1.0.2
Installing ri documentation for sxp-1.0.2
Parsing documentation for ebnf-1.1.3
Installing ri documentation for ebnf-1.1.3
Parsing documentation for rdf-turtle-3.0.6
Installing ri documentation for rdf-turtle-3.0.6
Done installing documentation for sxp, ebnf, rdf-turtle after 1 seconds
7 gems installed
De-iMac-2396:~ hwilliams$ ruby mark.rb
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_1 has been created.</BODY></HTML>pid 63255 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_2 has been created.</BODY></HTML>pid 63256 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_3 has been created.</BODY></HTML>pid 63257 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_4 has been created.</BODY></HTML>pid 63258 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_5 has been created.</BODY></HTML>pid 63259 exit 0
.
.
.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_998 has been created.</BODY></HTML>pid 64253 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_999 has been created.</BODY></HTML>pid 64254 exit 0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><HTML><HEAD><TITLE>201 Created</TITLE></HEAD><BODY><H1>Created</H1>Resource /DAV/home/LDP/fair/grazing2/obs_1000 has been created.</BODY></HTML>pid 64255 exit 0
De-iMac-2396:~ hwilliams$

Then running the curl -L -H "Accept: text/turtle" http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/ | less command I see:


@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ns1:    <http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
ns1:obs_1       rdf:type        rdfs:Resource .
@prefix ldp:    <http://www.w3.org/ns/ldp#> .
ns1:obs_1       rdf:type        ldp:Resource .
@prefix ns4:    <http://www.w3.org/ns/posix/stat#> .
ns1:obs_1       ns4:mtime       1559923421 ;
        ns4:size        135 .
ns1:obs_10      rdf:type        rdfs:Resource ,
                ldp:Resource ;
        ns4:mtime       1559923421 ;
        ns4:size        136 .
.
.
.

and I presume the

ns1:obs_1       rdf:type        rdfs:Resource .

and

ns1:obs_1       rdf:type        ldp:Resource .

triples in-between the prefix values are the problem?

Attached is the complete output of the curl command …

curl.zip (6.7 KB)

Correct, that is the problem

@markwilkinson: I amended your Ruby application, replacing your LDP container folder references, i.e. http://training.fairdata.solutions/DAV/home/LDP/fair/grazing2/, with my local LDP container folder, i.e. http://localhost:8890/DAV/ldp/ … but when I run the program it repeatedly gives the error:

pid 89544 exit 0
pid 89545 exit 0
pid 89546 exit 0

and I am not sure why. Are there any other changes that would be required to run against another instance?

Nothing should need to be changed. I made that code as generic as possible so that you could recreate it on any server.

You should check the system curl command at the command line… My best guess is that you’re using my server’s username/password, but it could be other things - you’ll see them if you use the exact same curl command from a prompt.

curl -u yourusername:yourpassword -L -H 'Accept: text/turtle' -H 'Content-type: text/turtle' -H 'Slug: put_a_name_here' --data-binary @test.ttl 'http://localhost:8890/DAV/ldp/'
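
One thing that may be relevant here: the "exit 0" printed by the script is curl's own exit status, and curl exits 0 even for HTTP error responses such as 401 unless --fail is given. A rough sketch of how the script could surface the HTTP status code instead is below. The helper names `status_check_args` and `post_status` are made up, and discarding the body to /dev/null assumes a Unix-like system.

```ruby
require 'open3'

# Build a curl argument list that prints only the HTTP status code.
# All option names are standard curl long options.
def status_check_args(url, user:, password:, slug:, file:)
  [
    'curl', '--silent', '--show-error', '--location',
    '--user', "#{user}:#{password}",
    '--header', 'Accept: text/turtle',
    '--header', 'Content-type: text/turtle',
    '--header', "Slug: #{slug}",
    '--data-binary', "@#{file}",
    '--output', '/dev/null',        # discard the response body (Unix-like systems)
    '--write-out', '%{http_code}',  # print the status code instead
    url
  ]
end

# Run curl with the given arguments and return the HTTP status as an Integer.
# curl prints 000 when no HTTP response was received, which becomes 0 here.
def post_status(args)
  stdout, _process_status = Open3.capture2(*args)
  stdout.to_i
end

args = status_check_args(
  'http://localhost:8890/DAV/ldp/',
  user: 'yourusername', password: 'yourpassword',
  slug: 'put_a_name_here', file: 'test.ttl'
)
# status = post_status(args)   # e.g. 201 on success, 401 for bad credentials
```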