Content Index
- Introduction
- Prerequisites
- Deployment via Azure Marketplace Offer Page
- DBpedia Snapshot (Virtuoso PAGO) Database Interaction via Web Interface
- Performance Tuning
- Troubleshooting
- Additional Information Sources
Introduction
What is this offering about?
A cloud-hosted Pay-As-You-Go (PAGO) edition of a preconfigured Virtuoso instance that includes a pre-loaded and optimized DBpedia 2022-12 Snapshot Edition Knowledge Graph.
Who is this for?
Any architect, systems integrator, or developer who needs an application- or service-specific instance of DBpedia for high-performance, scalable interaction with its Knowledge Graph.
What need is addressed by this offering?
Access to the DBpedia Knowledge Graph without the constraints enforced by the “Fair Use” policy of the general public instance.
Prerequisites
- An Azure Cloud subscription account.
Deployment via Azure Marketplace Offer Page
- From the Azure Marketplace, search for the keyword DBpedia to locate the available DBpedia offering, then select the PAGO offer.
- Click the GET IT NOW button to start the subscription and deployment of the DBpedia PAGO offer.
- Select Continue to start the deployment process in the Azure Portal.
- Select the Create button to configure the VM manually, or select Start with a pre-set configuration to begin from a pre-set configuration.
- From the Basics tab, choose the Resource group, which can be an existing or new group; set the Virtual machine name for the deployment; set the Region to deploy in; and for SSH public key source, choose whether to generate a new key pair or use an existing public key.
- From the Disks tab, typically the defaults can be used, or additional disks can be added as required.
- From the Networking tab, typically the defaults can be used.
- For the remaining Management, Advanced, and Tags tabs, the defaults can also be used; then select the Review + create button to validate the deployment.
- Once the Validation passed message appears, click the Create button to start the deployment.
- The Your deployment is complete message is displayed once the deployment has completed successfully.
- Click the Go to resource button to load the Overview page of the deployed VM, from which the Public IP address can be copied for accessing the VM via ssh and http.
DBpedia Snapshot (Virtuoso PAGO) Database Interaction via Web Interface
Once online, your DBpedia Snapshot instance is ready for use at the following pages and endpoints:
- Basic Linked Data Exploration Page — an obvious starting point
http://{azure-cloud-vm-dns-name-or-ip-address}/resource/DBpedia
- Faceted Browsing Endpoint
http://{azure-cloud-vm-dns-name-or-ip-address}/fct
- Advanced Faceted Browsing Page
http://{azure-cloud-vm-dns-name-or-ip-address}/describe/?uri=http://dbpedia.org/resource/DBpedia
- SPARQL Query Service Endpoint
http://{azure-cloud-vm-dns-name-or-ip-address}/sparql
- Virtuoso Instance Administration Page (Virtuoso Conductor)
http://{azure-cloud-vm-dns-name-or-ip-address}/conductor
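As a minimal sketch of how these URLs fit together, the shell snippet below builds an endpoint URL from a host value and can send a test query to the SPARQL endpoint with curl. The helper names endpoint_url and run_sparql, the placeholder address 203.0.113.10, and the sample query are assumptions made for illustration. The query and format request parameters follow common Virtuoso SPARQL endpoint usage and should be verified against your own instance.

```shell
#!/bin/sh
# Build the URL of one of the web interfaces listed above.
# $1 = VM DNS name or public IP address, $2 = path (e.g. sparql, fct, conductor)
endpoint_url() {
    printf 'http://%s/%s\n' "$1" "$2"
}

# Send a SPARQL query over HTTP GET and print the JSON result.
# $1 = VM DNS name or public IP address, $2 = SPARQL query text
run_sparql() {
    curl --silent --get \
         --data-urlencode "query=$2" \
         --data-urlencode "format=application/sparql-results+json" \
         "$(endpoint_url "$1" sparql)"
}

# Example (hypothetical address; requires network access to the VM):
# run_sparql 203.0.113.10 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```

Passing the query through --data-urlencode avoids hand-encoding spaces and braces in the URL.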
Administering the Virtuoso Instance via SSH
- Make an ssh connection to the VM using the private key file (pem-file) that matches the SSH public key chosen when creating the deployment, the username (azureuser by default), and the Public IP address from the previous section, as follows:
ssh -i {pem-file} azureuser@{Public IP address}
- Once connected, it is strongly recommended to refresh the package lists and update the VM to get the latest operating system and Virtuoso updates with the command:
sudo apt-get update && sudo apt-get upgrade
- Check that the Virtuoso server was started automatically after deployment with the command:
sudo service virtuoso status
- The following commands can be used to administer the Virtuoso server:
- Start the Virtuoso Server:
sudo service virtuoso start
- Stop the Virtuoso Server:
sudo service virtuoso stop
- Restart the Virtuoso Server:
sudo service virtuoso restart
- Check status of Virtuoso Server:
sudo service virtuoso status
- Determine the random password set for the dba user with the command:
sudo cat /opt/virtuoso/database/.initial-password
- An SQL connection can then be made to Virtuoso on port 1111 with the isql command line tool (use the full path /opt/virtuoso/bin/isql if isql is not on your PATH):
isql 1111
- Typical output from running these steps is shown below (in this capture the login user is ubuntu):
$ ssh -i certificates/virtuoso.pem ubuntu@54.221.25.206
The authenticity of host '54.221.25.206 (54.221.25.206)' can't be established.
ECDSA key fingerprint is SHA256:QGsOFcQoa4x5DBavtdHWDQUUQtBdHJ/OkizKep8UOcM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.221.25.206' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1025-aws x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Fri Jan 29 12:41:03 UTC 2021
System load: 0.0 Processes: 104
Usage of /: 2.0% of 116.27GB Users logged in: 0
Memory usage: 4% IP address for eth0: 10.0.0.214
Swap usage: 0%
* Canonical Livepatch is available for installation.
- Reduce system reboots and improve kernel security. Activate at:
https://ubuntu.com/livepatch
9 packages can be updated.
0 updates are security updates.
Last login: Tue Sep 22 19:26:19 2020 from 108.26.205.225
ubuntu@ip-10-0-0-214:~$ cd /opt/virtuoso/database
ubuntu@ip-10-0-0-214:/opt/virtuoso/database$ sudo bash
root@ip-10-0-0-214:/opt/virtuoso/database# cat .initial-password
i-0343ad51fe5e4f196
root@ip-10-0-0-214:/opt/virtuoso/database# service virtuoso status
● virtuoso.service - OpenLink Virtuoso Database
Loaded: loaded (/lib/systemd/system/virtuoso.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2021-01-29 12:04:31 UTC; 38min ago
Process: 878 ExecStart=/opt/virtuoso/bin/virtuoso-start.sh $VIRTUOSO_DB_NAMES (code=exited, status=0/SUC
Main PID: 1170 (virtuoso)
Tasks: 15 (limit: 4915)
CGroup: /system.slice/virtuoso.service
└─1170 ./virtuoso
Jan 29 12:04:25 ip-10-0-0-214 systemd[1]: Starting OpenLink Virtuoso Database...
Jan 29 12:04:26 ip-10-0-0-214 virtuoso-start.sh[878]: Starting Virtuoso instance in [database]
Jan 29 12:04:26 ip-10-0-0-214 virtuoso-start.sh[878]: - Starting the database
Jan 29 12:04:31 ip-10-0-0-214 systemd[1]: Started OpenLink Virtuoso Database.
root@ip-10-0-0-214:/opt/virtuoso/database# /opt/virtuoso/bin/isql 1111
OpenLink Virtuoso Interactive SQL (Virtuoso)
Version 08.03.3319 as of Sep 1 2020
Type HELP; for help and EXIT; to exit.
Enter password for dba :
Connected to OpenLink Virtuoso
Driver: 08.03.3319 OpenLink Virtuoso ODBC Driver
SQL> status('');
REPORT
VARCHAR
_______________________________________________________________________________
OpenLink Virtuoso VDB Server
Version 08.03.3319-pthreads for Linux as of Sep 1 2020
Started on: 2021-01-29 12:45 GMT+0
CPU: 0.05% RSS: 148MB PF: 0
Database Status:
File size 67108864, 8192 pages, 5733 free.
20000 buffers, 1115 used, 85 dirty 0 wired down, repl age 0 0 w. io 0 w/crsr.
Disk Usage: 1074 reads avg 0 msec, 0% r 0% w last 23 s, 138 writes flush 0 MB/s,
34 read ahead, batch = 17. Autocompact 0 in 0 out, 0% saved.
Gate: 166 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = virtuoso.trx, 8325 bytes
VDB: 0 exec 0 fetch 0 transact 0 error
2309 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 1 connects, max 1 concurrent
RPC: 6 calls, 1 pending, 1 max until now, 0 queued, 0 burst reads (0%), 0 second 0M large, 10M max
Checkpoint Remap 38 pages, 0 mapped back. 0 s atomic time.
DB master 8192 total 5733 free 38 remap 1 mapped back
temp 256 total 251 free
Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
Currently 1 threads running 0 threads waiting 0 threads in vdb.
24 Rows. -- 2 msec.
SQL>
Performance Tuning
Azure offers a range of VM instance types with different combinations of system memory and CPU cores. Both factors affect the performance of your Virtuoso instance, so use Azure VM instance types with more memory and more CPU cores for best performance.
Note: This VM is configured to use minimal system memory. For the instance type chosen, the NumberOfBuffers and MaxDirtyBuffers parameters in the /opt/virtuoso/database/virtuoso.ini configuration file should be increased to match the available memory, as detailed in the Virtuoso Performance Tuning Guide, for example:
VM Instance Type | System RAM | Number Of Buffers | Max Dirty Buffers |
---|---|---|---|
B2MS | 8 GB | 680000 | 500000 |
B4MS | 16 GB | 1360000 | 1000000 |
M32S | 32 GB | 2720000 | 2000000 |
M64LS | 64 GB | 5450000 | 4000000 |
After changing these values, restart the Virtuoso server as detailed above.
Extrapolate the NumberOfBuffers and MaxDirtyBuffers parameters accordingly for VMs of other sizes.
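The 8 GB row of the table works out to about 85,000 buffers and 62,500 dirty buffers per GB of RAM (680000 / 8 and 500000 / 8). The following sketch applies that ratio to other memory sizes. The function name virtuoso_buffers is hypothetical, and the linear rule is an assumption read off the table rather than an official formula. For 64 GB it yields 5440000 where the table lists 5450000, so treat the results as starting points.

```shell
#!/bin/sh
# Print suggested virtuoso.ini values for a given amount of system RAM.
# $1 = system RAM in whole GB
# Assumption: linear scaling from the 8 GB row of the table above.
virtuoso_buffers() {
    ram_gb=$1
    echo "NumberOfBuffers = $((ram_gb * 85000))"
    echo "MaxDirtyBuffers = $((ram_gb * 62500))"
}

# Example: values for a VM with 8 GB of RAM
virtuoso_buffers 8
```

The printed lines can be compared against the existing entries in /opt/virtuoso/database/virtuoso.ini before editing the file and restarting the server.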
Troubleshooting
If the Virtuoso server fails to start:
- Run the command sudo service virtuoso status to see if the Virtuoso server is running.
- Check the /opt/virtuoso/database/virtuoso.log file to see why the server might have failed to start.
- Ensure the file /opt/virtuoso/database/virtuoso.lck does not exist before starting the server.
- Attempt to start the Virtuoso server with the command sudo service virtuoso start.
- Run the command sudo service virtuoso status again to see if the Virtuoso server is running.
- If it is now running, attempt to connect via the SQL or HTTP interfaces as detailed above.
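The checks above can be strung together in a small script. This is a sketch only: the function names lock_file_present and diagnose_virtuoso and the VIRTUOSO_DB_DIR variable are hypothetical, and the script assumes it runs on the VM by a user with sudo rights. It deliberately does not delete the lock file itself, leaving that decision to the operator.

```shell
#!/bin/sh
# Directory holding virtuoso.log and virtuoso.lck (path taken from this guide).
VIRTUOSO_DB_DIR=${VIRTUOSO_DB_DIR:-/opt/virtuoso/database}

# Succeeds if a lock file is present in the given database directory.
lock_file_present() {
    [ -e "$1/virtuoso.lck" ]
}

# Walk through the troubleshooting steps; intended to be run on the VM.
diagnose_virtuoso() {
    if sudo service virtuoso status >/dev/null 2>&1; then
        echo "Virtuoso is running"
        return 0
    fi
    echo "Virtuoso is not running; last log lines:"
    sudo tail -n 20 "$VIRTUOSO_DB_DIR/virtuoso.log"
    if lock_file_present "$VIRTUOSO_DB_DIR"; then
        echo "Lock file found: $VIRTUOSO_DB_DIR/virtuoso.lck"
        echo "Remove it only after confirming no virtuoso process is running."
        return 1
    fi
    # No lock file in the way: try to start the server and report its status.
    sudo service virtuoso start
    sudo service virtuoso status
}
```

Calling diagnose_virtuoso after sourcing the script runs the full sequence; lock_file_present can also be used on its own to check a directory.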
Additional Information Sources
- DBpedia Snapshot (Virtuoso PAGO) EBS-backed EC2 AMI
- Protecting your Virtuoso-hosted SPARQL Endpoint
- Virtuoso documentation
- Virtuoso Tips and Tricks