Controlling the Database Working Set in OpenLink Virtuoso

Frozen Data Flows Slowly

Immediately upon launch, a database is considered to be in a “cold” state, as most if not all data is “frozen” in disk storage. The database “working set” or “cache” — that is, the portion of data in active memory — is near empty, and all queries require reading from disk, which is one of the slowest functions of any DBMS. Queries executed shortly following a database restart are thus among the slowest to return their results.

As the DBMS responds to more queries, more data is read into active memory, and the database gradually shifts into a “warm” state. The more data is in the working set, the “warmer” the database, and the more quickly queries return.

Thawing the Data Stream

Historically, the best way to bring a re-launched Virtuoso instance from “cold” to “warm” was to simulate user activity by executing a number of canned queries, and thereby reading data from disk into active memory.

Recent releases of Virtuoso 8.0 (and some partner-specific builds of Virtuoso 7.5) have new features that enable the quick reload of a previously-saved working set into memory. This can put the Virtuoso database into a “warm” state near-immediately following launch, enabling the typical query workload to execute against the database with optimum performance.

Controlling Virtuoso’s Database Working Set

Two new SQL functions are used to record and restore a database instance’s working set. These are especially useful for repeatability of benchmark tests, since a medium-to-large database can take 30 minutes or longer to reach steady-state with a typical look-up workload.

ws_save() writes the set of database pages currently in RAM into a file in the server’s working directory.

ws_restore() restores the previously-saved working set. The working set is loaded in bulk and in the background, so this is typically faster than running the application up to the point of steady state. Applications can run while the working set is being restored.

When ws_restore() returns, the database pages that were in memory at the time of ws_save() will again be in memory. Some database pages which were present at the time of ws_save() may not exist at the time of ws_restore() , especially if the database was stopped without a checkpoint and started without a roll-forward; these will simply be ignored.

Related