Size of each RDF index scheme on disk

Hi,
If I am not mistaken, Virtuoso 7+ creates four RDF index schemes by default while loading an RDF dataset. I would like to know where the physical file for each index scheme (e.g., SP, PSOG, etc.) is stored on Linux by default, so that I can check the on-disk size of each index in my applications. Thank you.

Hi Pouya,

Virtuoso stores all tables, indexes, procedures, blobs, etc., in a single .db file on disk, organized in pages (blocks) of 8KB. Whenever Virtuoso runs low on pages, it extends the database file by a configurable number of pages. Pages that have been allocated and later released go onto the “free page” list for reuse; releasing them does not shrink the .db file.

The isql tool can be used to show this information:

$ isql 1111
OpenLink Virtuoso Interactive SQL (Virtuoso)
Version 08.03.3317 as of Jun 17 2020
Type HELP; for help and EXIT; to exit.
Connected to OpenLink Virtuoso
Driver: 08.03.3318 OpenLink Virtuoso ODBC Driver
SQL> status('');
REPORT
VARCHAR
_______________________________________________________________________________

OpenLink Virtuoso VDB Server
Version 08.03.3318-pthreads for Linux as of Aug  4 2020 
Registered to OpenLink Software (INTERNAL USE ONLY) (Personal Edition, 500 connections)
Started on: 2020-08-05 12:22 GMT+2
CPU: 0.05% RSS: 159MB PF: 3251
 
Database Status:
  File size 224395264, 27392 pages, 16091 free.
  20000 buffers, 2390 used, 3 dirty 0 wired down, repl age 0 0 w. io 0 w/crsr.
  Disk Usage: 2766 reads avg 0 msec, 0% r 0% w last  0 s, 1149 writes flush      1.709 MB/s,
    66 read ahead, batch = 24.  Autocompact 0 in 0 out, 0% saved.
Gate:  306 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap. 
Log = virtuoso.trx, 1827 bytes
VDB: 0 exec 0 fetch 0 transact 0 error
11258 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 2 connects, max 1 concurrent
RPC: 10 calls, 0 pending, 1 max until now, 0 queued, 0 burst reads (0%), 0 second 0M large, 10M max
Checkpoint Remap 0 pages, 0 mapped back. 2 s atomic time.
    DB master 27392 total 16091 free 0 remap 0 mapped back
   temp  256 total 251 free
 
Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
   Currently 1 threads running 0 threads waiting 0 threads in vdb.

25 Rows. -- 26 msec.

So, in this example, my virtuoso.db file is:

$ ls -l virtuoso.db
-rw-r--r-- 1 patrick patrick 224395264 Aug  6 10:23 virtuoso.db

That is the same size as reported in the following line of the status(); output:

  File size 224395264, 27392 pages, 16091 free.

where 224395264 (size in bytes) / 8192 (page size in bytes) = 27392 total pages

As there are 16091 pages free at this point, that means that 27392 - 16091 = 11301 pages are actually in use at this time.
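The arithmetic above can be checked with a small script. This is only a sketch: it assumes the "File size" line has exactly the layout shown in the status(); output above, and the helper name `parse_file_size_line` is made up for illustration.

```python
import re

PAGE_SIZE = 8192  # bytes per page (8KB), as described in this thread


def parse_file_size_line(line):
    """Return (file_bytes, total_pages, free_pages) from a 'File size' line.

    Hypothetical helper: assumes the line looks like
    'File size <bytes>, <n> pages, <m> free.'
    """
    match = re.search(r"File size (\d+), (\d+) pages, (\d+) free", line)
    if match is None:
        raise ValueError("not a 'File size' line: %r" % line)
    return tuple(int(group) for group in match.groups())


# Line copied from the status(); output above.
line = "  File size 224395264, 27392 pages, 16091 free."
file_bytes, total_pages, free_pages = parse_file_size_line(line)

# The file size should be a whole number of pages.
pages_from_size = file_bytes // PAGE_SIZE
used_pages = total_pages - free_pages

print(pages_from_size)  # 27392
print(used_pages)       # 11301
```

If the page count derived from the file size ever disagrees with the reported page count, the page-size assumption or the line format is probably different on that installation.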

Note that these values can also be retrieved using the sys_stat function, as in the following example:

SQL> select sys_stat ('st_db_pages') as "DB pages", sys_stat ('st_db_free_pages') as "Free Pages";
DB pages         Free Pages
LONG VARCHAR     LONG VARCHAR
____________     ____________

27392            16091

1 Rows. -- 1 msec.
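Once the two page counts are available, turning them into sizes is straightforward. The following is a rough sketch that assumes the 8KB page size mentioned above and takes the two numbers as plain integers; the function name `db_occupancy` is made up for illustration, and fetching the values from the server is left out.

```python
PAGE_SIZE = 8192  # bytes per page (8KB), as described in this thread


def db_occupancy(db_pages, free_pages, page_size=PAGE_SIZE):
    """Summarize page counts as used pages, used bytes, and free fraction.

    Hypothetical helper: the inputs are the two values returned by the
    sys_stat query above, converted to integers by the caller.
    """
    used_pages = db_pages - free_pages
    return {
        "used_pages": used_pages,
        "used_bytes": used_pages * page_size,
        "used_mib": used_pages * page_size / (1024 * 1024),
        "free_fraction": free_pages / db_pages,
    }


# Values from the sys_stat example above.
summary = db_occupancy(27392, 16091)
print(summary["used_pages"])                 # 11301
print(summary["used_bytes"])                 # 92577792
print(round(summary["used_mib"], 1), "MiB in use")
print(round(summary["free_fraction"] * 100, 1), "% of pages free")
```

Note that this only describes the database file as a whole; it says nothing about how the used pages are divided among individual tables or indexes.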

Let us know if this answers your question.


Hi Patrick,
Thank you for your reply. It seems that it is not easily possible to get a size-on-disk breakdown (for each index scheme, etc.), since all information is stored/compressed in a single file. This is interesting. Thank you once again.