Understanding Page Sizing of Striped Virtuoso Databases

When Virtuoso starts, it checks the Striping parameter in the [Database] section of the virtuoso.ini configuration file to determine whether to use a “normal” single database file or a striped set of database files. When Striping = 1, Virtuoso looks to the [Striping] section, checks whether the segments defined there already exist, and creates them if they do not.

For example, consider the following entries in a virtuoso.ini configuration file:

[Database]
FileExtend = 20000
Striping   = 1

[Striping]
Segment1 = 100M, 
           /data1/virtuoso-1.db = ioq1 ,
           /data2/virtuoso-2.db = ioq2 ,
           /data3/virtuoso-3.db = ioq3 ,
           /data4/virtuoso-4.db = ioq4 ,
           /data5/virtuoso-5.db = ioq5 ,
           /data6/virtuoso-6.db = ioq6 ,
           /data7/virtuoso-7.db = ioq7

On first start, Virtuoso detects that the database stripes do not exist yet, so it looks at the initial size of the database, which is set here to 100M; that is, 100MB * 1024KB/MB * 1024B/KB = 104,857,600 bytes.

Virtuoso has a default PAGE_SZ of 8192 bytes, so this means 104,857,600B ÷ 8192B/PAGE = 12,800 pages.

When using multiple stripes, Virtuoso requires that the number of pages be evenly divisible by EXTENT_SZ * number of stripes. With EXTENT_SZ = 256 and the number of stripes = 7, the divisor is 256 * 7 = 1792. However, 12,800 ÷ 1792 ≈ 7.14, which is not a whole number (i.e., the division is not even), so Virtuoso rounds the multiplier up to the next whole number, 8, giving 1792 * 8 = 14,336 pages.

In the log, Virtuoso then reports it created a slightly bigger initial database, and proceeds to create the 7 stripes for the database, with 14,336 ÷ 7 = 2048 pages per stripe, and continues on to initialize the database, etc.

If and when all the free pages in the database get filled with data, Virtuoso will “grow” the database by the FileExtend number of pages (here, 20000), or actually by 21,504 pages, that being the next-higher multiple (12 * 1792) of the divisor calculated earlier (1792).
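The rounding described above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this article, not Virtuoso's actual code; the function name `round_up_pages` is made up for illustration, and the constants are the default values quoted in the text.

```python
# Sketch of the page arithmetic from this article (not Virtuoso source code).
PAGE_SZ = 8192     # bytes per page (default quoted above)
EXTENT_SZ = 256    # pages per extent (value quoted above)


def round_up_pages(pages: int, stripes: int) -> int:
    """Round a page count up to the next multiple of EXTENT_SZ * stripes.

    Hypothetical helper: the name and signature are illustrative only.
    """
    divisor = EXTENT_SZ * stripes
    # Integer ceiling division, then scale back up to a page count.
    return -(-pages // divisor) * divisor


stripes = 7
requested_pages = 100 * 1024 * 1024 // PAGE_SZ            # Segment1 = 100M
initial_pages = round_up_pages(requested_pages, stripes)  # initial database size
extend_pages = round_up_pages(20000, stripes)             # FileExtend = 20000

print(requested_pages)           # 12800
print(initial_pages)             # 14336
print(initial_pages // stripes)  # 2048 pages per stripe
print(extend_pages)              # 21504
```

Changing `stripes` or the requested size shows how much larger than the configured value the initial database and each growth step will actually be.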

Note that if the approximate final size of the database is known in advance, the first attribute of the Segment1 parameter in the [Striping] section of the virtuoso.ini (100M, above) should be set to this size. The Virtuoso striped database will then be initialized at that size, and Virtuoso will not have to periodically halt all threads and “grow” the database to obtain free pages as the data is loaded; growth is only needed again once that initial “fully-loaded” size has been used up.
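To get a feel for how many “grow” operations a well-chosen initial size can avoid, the following sketch counts the growth steps needed to reach a target size. It assumes that each growth adds exactly one aligned FileExtend step, as described above; the 10 GB target and the function names are hypothetical, chosen only for illustration.

```python
# Rough estimate of growth steps; a sketch under the assumptions stated above.
PAGE_SZ = 8192     # bytes per page (default quoted above)
EXTENT_SZ = 256    # pages per extent (value quoted above)
MB = 1024 * 1024
GB = 1024 * MB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def aligned(pages: int, stripes: int) -> int:
    """Round pages up to a multiple of EXTENT_SZ * stripes."""
    divisor = EXTENT_SZ * stripes
    return ceil_div(pages, divisor) * divisor


def growth_steps(initial_bytes: int, file_extend: int,
                 target_bytes: int, stripes: int) -> int:
    """Count aligned FileExtend steps needed to reach target_bytes.

    Hypothetical helper: assumes one aligned FileExtend step per growth.
    """
    initial_pages = aligned(initial_bytes // PAGE_SZ, stripes)
    step_pages = aligned(file_extend, stripes)
    target_pages = ceil_div(target_bytes, PAGE_SZ)
    missing = max(0, target_pages - initial_pages)
    return ceil_div(missing, step_pages)


# Hypothetical 10 GB target with the 7-stripe configuration shown earlier.
steps_small = growth_steps(100 * MB, 20000, 10 * GB, stripes=7)
steps_large = growth_steps(10 * GB, 20000, 10 * GB, stripes=7)

print(steps_small)  # 61 growth steps when starting from 100M
print(steps_large)  # 0 growth steps when the initial size matches the target
```

Under these assumptions, starting at 100M means dozens of pauses while the data loads, whereas sizing the segment for the expected data up front needs none.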
