Hello,
I’m using Virtuoso for an experiment on the DBpedia dataset. For my experiment, I need to run many queries similar to:
SELECT DISTINCT ?s ?p ?o FROM <http://dbpedia.org> WHERE {?s ?p ?o} LIMIT 5000 OFFSET 45225000
where the offset increases with each query. The goal is to obtain all the triples of DBpedia in batches of 5000 triples each. I noticed that after 32 minutes of processing, Virtuoso was retrieving around 437500 triples per minute. However, after 6 hours the rate had dropped to 125000 triples per minute. I’m running Virtuoso on a node of an HPC cluster with 36 cores and 192 GB of RAM. Is there something I can do to avoid this performance decrease? Please find my virtuoso.ini file here.
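
To make the access pattern concrete, here is a minimal sketch of how such a sequence of queries can be generated. It only builds the query strings and does not contact any endpoint. The names build_query and batch_queries are hypothetical helpers introduced for illustration, and the step of 5000 per query is an assumption based on the batch size given above.

```python
from typing import Iterator

GRAPH_IRI = "http://dbpedia.org"  # graph named in the query above
BATCH_SIZE = 5000                 # triples per batch, as described above


def build_query(offset: int, limit: int = BATCH_SIZE, graph: str = GRAPH_IRI) -> str:
    """Return one paginated SPARQL query for the given offset (hypothetical helper)."""
    return (
        f"SELECT DISTINCT ?s ?p ?o FROM <{graph}> "
        f"WHERE {{?s ?p ?o}} LIMIT {limit} OFFSET {offset}"
    )


def batch_queries(start: int = 0, batches: int = 3, limit: int = BATCH_SIZE) -> Iterator[str]:
    """Yield consecutive queries; the offset is assumed to grow by `limit` each time."""
    for i in range(batches):
        yield build_query(start + i * limit, limit)


if __name__ == "__main__":
    # Print the query shown above and the one that would follow it.
    for query in batch_queries(start=45_225_000, batches=2):
        print(query)
```

At 5000 triples per query, the rates above correspond to roughly 87.5 queries per minute early on and 25 queries per minute after 6 hours.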