Create a Virtuoso RDF Graph Replication Subscriber (Slave) Node from a Publisher Master Node

What

Create a Virtuoso RDF Graph Replication Subscriber (Slave) node from a Publisher (Master) node.

Why

There are situations where a Subscriber node fails and needs to be rebuilt or a new Subscriber node needs to be added an existing replication setup.

If the Publisher node is known to contain the current state of the database, Subscriber nodes can be created from a copy of the Publisher thereby resuming high-availability transactional replication from an up-to-date state from that point onwards.

How

  1. Stop/disable all updates to the publisher so its state remains constant when setting up the new subscriber(s)

  2. Run a checkpoint to commit any pending updates in the trx file to the database (db) file

  3. On each subscriber node, if database files already exist, remove the following to ensure no remnants of the older subscriber node database exists:

   rm virtuoso.db virtuoso.trx virtuoso.log virtuoso.pxa 
  1. Copy the publisher database file virtuoso.db to the subscriber node(s) using rsync or preferred copy method

  2. Start the database server on the subscriber node and perform the following steps to remove publisher node information from database:

   delete from sys_repl_accounts; 
   delete from sys_repl_subscribers; 
   registry_remove ('DB.DBA.RDF_REPL'); 
   shutdown(); 
  1. Restart the database with virtuoso-start.sh and run the following commands to set it up as a subscriber node:
   repl_server ('MASTER', '{MASTER_DSN}');  
   repl_subscribe ('MASTER', '__rdf_repl', 'dav', 'dav', 'dba', 'dba');  
   shutdown(); 
  1. Restart the database again with virtuoso-start.sh and run the following command to continue setting it up as a subscriber node
   repl_sync_all ();  
   DB.DBA.SUB_SCHEDULE ('MASTER', '__rdf_repl', 1); 
  1. Run the following status commands to check it is in sync with the Master:
   status(); 
   repl_stat(); 

Troubleshooting

There maybe occasions when Slave nodes are failing to sync with the Master with the following situations reported:

  • The Subscriber remains constantly in the replication state of SYNCING with the Master as reported by the output of running the rep_stat() command on it:
SQL> repl_stat(); 
server   account  level       stat 
VARCHAR  VARCHAR  INTEGER     INTEGER 
_______________________________________________________________________________ 

MASTER   __rdf_repl  939         SYNCING 
SLAVE-1  SLAVE-1      0           OFF  

SQL>
  • The Master report a __rdf_repl' is not valid ... error in the database log file:
14:19:20 Subscriber 'SLAVE-1' for '__rdf_repl' is not valid (level 2, requested level 662)

As can be seen from the errors above the Master is at replication level 2, but the Subscriber is requesting a sync from replication level 662, which is clearly invalid, as the Subscriber should always be one replication level less than the Master i.e. 1 in this case.

Thus the replication level on the Master needs to be verified and the Subscriber set to be one less than it using the database sequence_set() function as follows:

  1. On the publisher determine the replication server name with the following command:
SQL> select repl_this_server(); 
repl_this_server 
VARCHAR 
_______________________________________________________________________________ 

MASTER 

1 Rows. -- 1 msec. 
SQL>  
  1. On the publisher determine the internal replication server name to pass to the set_sequence() function with the command:
SQL> select concat ('repl_', repl_this_server(), '_', '__rdf_repl'); 
concat 
VARCHAR 
_______________________________________________________________________________ 

repl_MASTER___rdf_repl 

1 Rows. -- 1 msec. 
SQL> 
  1. Check the publisher replication level with the command:
SQL> select sequence_set ('repl_MASTER___rdf_repl', 0, 2); 
sequence_set 
VARCHAR 
_______________________________________________________________________________ 

1685 

1 Rows. -- 0 msec. 
SQL> 
  1. Switch to the subscriber and check the publisher replication level store:
SQL> select sequence_set ('repl_MASTER___rdf_repl', 0, 2); 
sequence_set 
VARCHAR 
_______________________________________________________________________________ 

939 

1 Rows. -- 0 msec. 
SQL> 
  1. Set the publisher replication level on the subscriber to be one less than its level on the publisher itself with command:
SQL> sequence_set ('repl_MASTER___rdf_repl', 1684, 0); 

Done. -- 0 msec. 

SQL> 
  1. Check the publisher replication level on the subscriber is one less than on the publisher:
SQL> select sequence_set ('repl_MASTER___rdf_repl', 0, 2); 
sequence_set 
VARCHAR 
_______________________________________________________________________________ 

1684 

1 Rows. -- 0 msec. 
SQL> 
  1. Back on the Master, check Subscriber replication status is valid by ensuring its RS_VALID column of the SYS_REPL_SUBSCRIBERS table is set 1 and the RS_LEVEL column is set to one less than the replication level of the subscriber:
SQL>   select * from SYS_REPL_SUBSCRIBERS; 
RS_SERVER          RS_ACCOUNT         RS_SUBSCRIBER     RS_LEVEL    RS_VALID 
VARCHAR NOT NULL   VARCHAR NOT NULL   VARCHAR NOT NULL  INTEGER     INTEGER 
_______________________________________________________________________________ 

MASTER             __rdf_repl         SLAVE-1           939         0 

1 Rows. -- 49 msec. 

SQL>
  1. If as above the RS_VALID is 0 then set it to 1 and set the RS_LEVEL column to one less than the replication level of the Master as follows to activate replication from the Slave node:
SQL> update SYS_REPL_SUBSCRIBERS set RS_LEVEL = 1684 , RS_VALID = 1 ;

0 Rows. -- 2 msec.
SQL>
SQL> select * from SYS_REPL_SUBSCRIBERS; 
RS_SERVER          RS_ACCOUNT         RS_SUBSCRIBER     RS_LEVEL    RS_VALID 
VARCHAR NOT NULL   VARCHAR NOT NULL   VARCHAR NOT NULL  INTEGER     INTEGER 
_______________________________________________________________________________ 

MASTER             __rdf_repl         SLAVE-1           1684        1 

1 Rows. -- 49 msec. 

SQL>
  1. Then restart Master and the subscriber for the settings to take effect, and run the following on each subscriber to force them to sync:
    repl_sync_all (); 
  1. Finally check the Subscriber node(s) is now IN SYNC and that there are no occurrences of the __rdf_repl' is not valid ... errors in the Master database log.

Related