Trying to alter primary key index in RDF_QUAD when clustering with 4 nodes

eltajj18 · May 6, 2025, 6:35pm

Hi,
I managed to connect 4 machines to each other and set one as the master, and now I am trying to create the _ALL cluster. I haven’t loaded my data yet. I am trying to alter the primary key index to use the _ALL cluster with this command, as shown in the guide (https://docs.openlinksw.com/virtuoso/clusteroperationpart/):

alter index RDF_QUAD on DB.DBA.RDF_QUAD cluster replicated;

Before this I tried with existing databases and thought that might be the problem, but the same thing happened with empty databases.
Also, after altering the index, how can I load my data into the cluster nodes? Should I run ld_dir on each node and then run cl_exec on the master, or something else?

Thanks in advance,
Eltaj

hwilliams · May 7, 2025, 1:30pm

What Virtuoso version are you using? You seem to be attempting to configure a Virtuoso scale-out cluster, which is not supported in the latest Virtuoso 8.x releases.

What is your use case for which a scale-out cluster is felt to be required, and which cannot be achieved with a single-server RDF Graph Replication Cluster configuration?

eltajj18 · May 7, 2025, 10:54pm

Hi,

I am using Virtuoso 8.3.
I am trying to increase overall query performance, and I have already done a few optimizations on a single server. Is the method you mentioned, “RDF graph replication cluster configuration”, used for the purpose I mentioned, or just for backing up data for security?
Also, would changing the Virtuoso version solve my problem?

Regards,
Eltaj

eltajj18 · May 7, 2025, 10:56pm

Also, I am trying to see how query runtime behaves in a distributed setting for my project, which is why I was asking about clustering. Can you please help me make it work?

hwilliams · May 8, 2025, 9:11am

What is the purpose of your project? Is this meant for production use or just research/testing?

If you insist on using a scale-out configuration, this is available in the Virtuoso 7.2 (not 8.3) release, using the Virtuoso Elastic Cluster Installation & Configuration instructions, which should set up all the required indexes as part of the process, without the need to create any custom or additional indexes, which is not advised.

eltajj18 · May 8, 2025, 12:42pm

It is part of my thesis to optimize the query efficiency.

eltajj18 · May 9, 2025, 4:58pm

Is it possible to partition my data in two and run the parts in 2 separate Virtuoso (Enterprise edition) databases with virtuoso-start.sh, or would they stay separate?

hwilliams · May 9, 2025, 8:39pm

Your question is not clear …

In what way are you seeking to partition the data and run/load/query on 2 separate Virtuoso databases?

By separate Virtuoso databases, do you mean separate Virtuoso single-server databases or 2 Virtuoso databases in a scale-out configuration, as you were seeking to do previously?