What options are available for integrating these

Hello,

I have multiple sources of information I would like to ingest. The information is related and available in multiple formats such as from
http://cwe.mitre.org and http://capec.mitre.org or
https://cyboxproject.github.io

I think a simple example would be http://cve.mitre.org, which has a download section (I’ve tried importing the CSV, but it errors out because records are too long). For this example, what would be the best way to import the CVEs and keep them up to date incrementally?

Hi,

Can I ask a little more background info please?
I assume from previous threads you’re using Virtuoso Commercial Edition?
How were you trying to import the CSV - Conductor, bulk-importer, CSV Attach or some other way?
Would you like to continue using CSV or (since CVE seems to be in a state of flux) would you move to another method?

Thanks

Sure,

I was using the enterprise version but the evaluation license grace period passed so I’m using VOS.

I’ve been using the bulk importer but I’m in no way married to CSV.

Using [ conductor | database | import ] by URL and this:

https://cve.mitre.org/data/downloads/allitems.csv

My thinking was that this provides everything up to the current state, so I would only need to bring in new entries as a feed (or daily batch).
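
One way to picture the "baseline plus daily batch" idea is a small filter that keeps only the rows whose CVE ID has not been loaded yet. This is a minimal Python sketch, not a Virtuoso feature: `new_cve_rows` and the sample data are hypothetical, and it assumes the first CSV column holds the CVE ID (rows that do not start with `CVE-` are treated as header lines and skipped).

```python
import csv
import io


def new_cve_rows(csv_text, seen_ids):
    """Return the rows whose first column is a CVE ID not present in seen_ids."""
    fresh = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and header lines that do not start with a CVE ID.
        if not row or not row[0].startswith("CVE-"):
            continue
        if row[0] not in seen_ids:
            fresh.append(row)
    return fresh


# Hypothetical sample; the real file layout is an assumption.
sample = (
    "Name,Status,Description\n"
    "CVE-1999-0001,Candidate,first entry\n"
    "CVE-1999-0002,Candidate,second entry\n"
)
seen = {"CVE-1999-0001"}  # IDs already imported in an earlier run
rows = new_cve_rows(sample, seen)
print([r[0] for r in rows])
```

A filter like this only catches new IDs; entries whose description changed upstream would need a separate comparison.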

After moving to VOS and (as far as I can remember) doing the same thing, the CSV bulk upload seems to have broken the “text search” in my fct:

Could not process your request because of an unexpected error.

Diagnostics

SQLSTATE: 42001

SQLMSG : SR185: Undefined procedure DB.DBA.xenc_digest.

More info…

SPARQL: select xmlelement ('result', xmlattributes ('text-d' as "type"), "res") from (sparql select (sql:s_sum_page (sql:vector_agg (bif:vector (?c1, ?sm, ?g1)), bif:vector ('cve'))) as ?res where { { select (<SHORT_OR_LONG::>(?s1)) as ?c1, (sql:S_SUM ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) as ?sm, <SHORT_OR_LONG::>(?g) as ?g1 where { quad map virtrdf:DefaultQuadMap { graph ?g { ?s1 ?s1textp ?o1 . ?o1 bif:contains '"cve"' option (score ?sc) . } } }order by desc (sql:sum_rank ((sql:S_SUM ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) ) ) limit 50 offset 0 }}) xx option (quietcast)

Permalink

STATE: <?xml version="1.0" encoding="UTF-8" ?>cve

Any advice?

Thanks for this.
OK, so the mixed news:

  • I can reproduce bugs in VOS: Conductor gives errors in the early stages of import
  • it works in Virtuoso Commercial Edition if you bump TransactionAfterImageLimit = 500000000 in virtuoso.ini (i.e. 10x larger than the default of 50000000) and change the text-column data-types to LONG VARCHAR.
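
To decide which text columns actually need LONG VARCHAR, it can help to measure the longest value in each column before importing. The following is a rough Python sketch under stated assumptions: `max_field_lengths` and `columns_over` are hypothetical helper names, and the `limit` value is an arbitrary number chosen for the sample, not a documented Virtuoso threshold.

```python
import csv
import io


def max_field_lengths(csv_text):
    """Map each column index to the length of its longest value."""
    longest = {}
    for row in csv.reader(io.StringIO(csv_text)):
        for i, value in enumerate(row):
            longest[i] = max(longest.get(i, 0), len(value))
    return longest


def columns_over(csv_text, limit):
    """Return the sorted column indexes whose longest value exceeds limit."""
    lengths = max_field_lengths(csv_text)
    return sorted(i for i, n in lengths.items() if n > limit)


# Hypothetical sample with one long description field.
sample = "CVE-1999-0001,Candidate," + "x" * 50 + "\nCVE-1999-0002,Entry,short\n"
print(max_field_lengths(sample))
print(columns_over(sample, limit=20))
```

Columns reported by `columns_over` would be the candidates for LONG VARCHAR; the rest could keep a shorter type.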

I’ve raised a bug about VOS so Development will look into it soon.

Thanks! That worked great, but it did seem to cause an issue I run into pretty often. It’s with the faceted search (text). I was wondering if you might know how to fix this:

Could not process your request because of an unexpected error.

Diagnostics

SQLSTATE: 37000

SQLMSG : SQ074: Line 6: SP031: SPARQL compiler: Free-text options can be specified only for triple patterns with special predicates, not for plain patterns at ‘)’ before ‘xx’

More info:

SPARQL: select xmlelement ('result', xmlattributes ('text-d' as "type"), "res") from (sparql define input:ifp "IFP_OFF" select (sql:s_sum_page (sql:vector_agg (bif:vector (?c1, ?sm, ?g1)), bif:vector ('CVE'))) as ?res where { { select (<SHORT_OR_LONG::>(?s1)) as ?c1, (sql:S_SUM ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) as ?sm, <SHORT_OR_LONG::>(?g) as ?g1 where { quad map virtrdf:DefaultQuadMap { graph ?g { ?s1 ?s1textp ?o1 . ?o1 bif:contains '(CVE)' option (score ?sc) . } } }order by desc (sql:sum_rank ((sql:S_SUM ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) ) ) limit 50 offset 0 }}) xx option (quietcast)

Permalink

STATE: <?xml version="1.0" encoding="UTF-8" ?>cve

Clicking on your Permalink gives the message “The requested active content cannot be displayed due to execution restriction”. I can’t even go to http://lod.greyside.net/fct/; I get a browser 404 error, “The requested URL was not found URI = '/fct/'”, so I am not sure how you are exposing the endpoints externally?

Sorry, should be accessible now:

http://lod.greyside.net/fct/facet.vsp?cmd=text&sid=5

Could not process your request because of an unexpected error.

Diagnostics
SQLSTATE: 37000

SQLMSG  : SQ074: Line 6: SP031: SPARQL compiler: Free-text options can be specified only for triple patterns with special predicates, not for plain patterns at ')' before 'xx'

More info…
SPARQL:
select  xmlelement ('result', 
	  			     xmlattributes ('text-d' as "type"), 
				     "res") 
				     from (sparql      define input:ifp "IFP_OFF"  select 
		  	(<sql:s_sum_page> (<sql:vector_agg> (<bif:vector> (?c1, ?sm, ?g1)), <bif:vector> ('CVE'))) as ?res where { { 
      select (<SHORT_OR_LONG::>(?s1)) as ?c1,  (<sql:S_SUM> ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) as ?sm, <SHORT_OR_LONG::>(?g) as ?g1 where  { quad map virtrdf:DefaultQuadMap { graph ?g {  ?s1 ?s1textp ?o1 . ?o1 bif:contains  '(CVE)'  option (score ?sc)  . } }  }order by desc (<sql:sum_rank> ((<sql:S_SUM> ( <SHORT_OR_LONG::IRI_RANK> (?s1), <SHORT_OR_LONG::>(?s1textp), <SHORT_OR_LONG::>(?o1), ?sc ) ) ) ) limit 50  offset 0 }}) xx option (quietcast)

Permalink

STATE:
<?xml version="1.0" encoding="UTF-8" ?><query inference="" invfp="IFP_OFF" same-as="SAME_AS_OFF" view3="" s-term="e" c-term="type" agg="" limit="50"><text>cve</text><view type="text-d" limit="50" offset="" /></query>

Please share the Faceted Browser link via the permalink feature. The link you shared in your response will not resolve.

Hi,

We have recently added a new cartridge to our Sponger / Cartridges VAD package, called CVE v5, to handle their new upstream JSON format.

This would be one way to ingest CVE data. The Sponger can also be invoked from the Web Crawler Bot as another way to automate the process.

Examples:

Cartridges requires use of Virtuoso Commercial Edition, so you might want to talk with Sales about the state of licensing.

HTH


Nice! I went from thinking I’d need a cobbled-together import of the legacy data, or messy sorting from their Twitter feed, to this. This is huge! I owe you some drinks!

I hate to ask another question of you, but out of curiosity: I assume several reasons exist for requiring the commercial version. When you mention that it is needed to make use of the cartridges, what features are lost when the cartridges VAD is installed on VOS?

Looking at it, I figured the VAL OAuth would be lost, but this appears optional.


^^ after cartridge vad install on vos

This may not be what you meant when referencing the bot and may clear up something I’m super confused about:

This has been an ongoing rabbit hole for me because the setup guide begins by assuming you already have the sources (my rabbit hole has been locating them) and access control via ACL / WebID with delegation, and I haven’t found anything about VAL or requiring the commercial version.

https://osdb.openlinksw.com/osdb/doc/config_guide.html#setup

Is the OSDB included with the commercial version? (The guide is a little confusing to me.)

sudo su -
screen -S osdb_srv
cd osdb
./startup_linux_https_uriburner.sh
Ctrl-a d

That VAL OAuth Sponger configuration page should not be available in the open source product, as it does not support VAL, and it will be removed.

OSDB is not included in the commercial version, and the start script/commands in the guide are incorrect and need to be cleaned up.
