If I am not mistaken, Virtuoso v. 7+ creates four RDF index schemes by default while loading an RDF dataset. I would like to know where the physical file for each index scheme (e.g., “SP”, “PSOG”, etc.) is stored on Linux by default, so that I can check the on-disk size of each index in my applications. Thank you.
Virtuoso stores all tables, indexes, procedures, blobs, etc. in a single .db file on disk, organized in pages (blocks) of 8 KB. Whenever Virtuoso runs low on pages, it extends the database file by a configurable number of pages. When pages that have been allocated are released, they go onto the “free page” list; the .db file does not shrink.
The isql tool can be used to show this information:
$ isql 1111
OpenLink Virtuoso Interactive SQL (Virtuoso)
Version 08.03.3317 as of Jun 17 2020
Type HELP; for help and EXIT; to exit.
Connected to OpenLink Virtuoso
Driver: 08.03.3318 OpenLink Virtuoso ODBC Driver
SQL> status('');
REPORT
VARCHAR
_______________________________________________________________________________

OpenLink Virtuoso VDB Server
Version 08.03.3318-pthreads for Linux as of Aug 4 2020
Registered to OpenLink Software (INTERNAL USE ONLY)
(Personal Edition, 500 connections)
Started on: 2020-08-05 12:22 GMT+2
CPU: 0.05% RSS: 159MB PF: 3251

Database Status:
File size 224395264, 27392 pages, 16091 free.
20000 buffers, 2390 used, 3 dirty 0 wired down, repl age 0 0 w. io 0 w/crsr.
Disk Usage: 2766 reads avg 0 msec, 0% r 0% w last 0 s, 1149 writes flush 1.709 MB/s,
66 read ahead, batch = 24. Autocompact 0 in 0 out, 0% saved.
Gate: 306 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = virtuoso.trx, 1827 bytes
VDB: 0 exec 0 fetch 0 transact 0 error
11258 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 2 connects, max 1 concurrent
RPC: 10 calls, 0 pending, 1 max until now, 0 queued, 0 burst reads (0%), 0 second 0M large, 10M max
Checkpoint Remap 0 pages, 0 mapped back. 2 s atomic time.
DB master 27392 total 16091 free 0 remap 0 mapped back
temp 256 total 251 free

Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
Currently 1 threads running 0 threads waiting 0 threads in vdb.

25 Rows. -- 26 msec.
So in this example, my virtuoso.db file is:

$ ls -l virtuoso.db
-rw-r--r-- 1 patrick patrick 224395264 Aug 6 10:23 virtuoso.db
This matches the following line in the status output:
File size 224395264, 27392 pages, 16091 free.

where 224395264 (size in bytes) / 8192 (page size in bytes) = 27392 total pages.

As there are 16091 pages free at this point, 27392 - 16091 = 11301 pages are actually in use at this time.
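The arithmetic above can be scripted if you want to track these numbers over time. The following is only a rough sketch in Python: the helper name `parse_db_status` and the dictionary keys are made up for illustration, and it assumes the "File size ..., ... pages, ... free." line keeps the format shown in the status output above and that the page size is 8192 bytes.

```python
import re

PAGE_SIZE = 8192  # bytes per page (8 KB), as described above


def parse_db_status(line):
    """Hypothetical helper: extract the numbers from a 'File size' status line."""
    m = re.search(r"File size (\d+), (\d+) pages, (\d+) free", line)
    if m is None:
        raise ValueError("not a 'File size' status line")
    file_size, total_pages, free_pages = (int(g) for g in m.groups())
    used_pages = total_pages - free_pages
    return {
        "file_size": file_size,
        "total_pages": total_pages,
        "free_pages": free_pages,
        "used_pages": used_pages,
        "used_bytes": used_pages * PAGE_SIZE,
    }


# The line copied from the status('') output in this example
stats = parse_db_status("File size 224395264, 27392 pages, 16091 free.")
print(stats["used_pages"], "pages in use,", stats["used_bytes"], "bytes")
```

Note that this describes the database file as a whole; it does not break the usage down per table or index.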
Note that these values can also be retrieved using the sys_stat function as in the following example:
SQL> select sys_stat ('st_db_pages') as "DB pages", sys_stat ('st_db_free_pages') as "Free Pages";
DB pages       Free Pages
LONG VARCHAR   LONG VARCHAR
_______________________________________________________________________________

27392          16091

1 Rows. -- 1 msec.
Let us know if this answers your question.
Thank you for your reply. It seems that it is not easily possible to get a size-on-disk breakdown (for each index scheme, etc.), since all information is stored/compressed into a single file. This is interesting. Thank you once again.