Virtuoso Data Virtualization Use-Case -- Integrating Data Across a Federation of External Data Sources

This document provides a collection of Virtuoso use-case scenarios that showcase Virtuoso’s Knowledge Graph Platform functionality using data from local, federated and virtualized sources. Naturally, data access is handled via open standards based protocols (e.g., ODBC, JDBC, and HTTP(s)), enabling instant compatibility with compliant tools and environments, such as:

  • Tableau
  • PowerBI
  • Neo4j
  • GraphQL
  • Various Programming Languages & Frameworks

Part 1: Creating a Knowledge Graph from Virtualized CSV Datasets

Target Audience: Knowledge Graph Developers/Data Engineers

Knowledge graph developers can utilize local, federated and virtualized data both programmatically, and using native and external interfaces.

Execution from Unix terminal (Example):
isql {port} {user} {pwd} EXEC='LOAD dir/kg-demo-all.sql' > kg-demo-all.log &

Programmatic Steps to Create Virtual Triples (RDF Views) over Physical SQL-accessible Tables, plus forward-chained creation of Physical Triples:

(1) Attach All CSVs in designated directory to a virtual SQL table

DROP TABLE Demo.kg.raw;
ATTACH_FROM_CSV_DIR('demo.kg.raw','csv_demo_2',',', '\n', null, 1,vector(2,3,4,5,6,7,8));
GRANT SELECT ON demo.kg.raw TO SPARQL_SELECT;

(2) Create local physical table, which Includes ID hash for “surrogate key”

DROP TABLE demo.kg.physical;
CREATE TABLE demo.kg.physical
AS 
SELECT 
        id,
        first_name,
        last_name,email,
        gender,
        ip_address,
        company,
        job_title,
        md5(CONCAT(first_name,last_name,email,ip_address,gender,company,job_title)) AS hid 
    FROM Demo.kg.raw
    WITH DATA;

ALTER TABLE demo.kg.physical
   MODIFY PRIMARY KEY ("hid");

GRANT SELECT ON demo.kg.physical TO SPARQL_SELECT;

(3) Add R2RML Mapping For Physical Table

SPARQL CLEAR GRAPH <urn:kg:demo:data>;
SPARQL DROP QUAD MAP <urn:kg:demo:mapping> ;
SPARQL CLEAR GRAPH <urn:kg:demo:mapping>;

SPARQL 
prefix rr: <http://www.w3.org/ns/r2rml#> 
prefix DB: <http://demo.openlinksw.com/schemas/DB/> 
prefix db-stat: <http://demo.openlinksw.com/DB/stat#> 
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
prefix void: <http://rdfs.org/ns/void#> 
prefix scovo: <http://purl.org/NET/scovo#> 
prefix aowl: <http://bblfish.net/work/atom-owl/2006-06-06/> 
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix schema: <http:/schema.org/>
prefix ecrm: <http://www.openlinksw.com/ontology/ecrm#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT INTO GRAPH <urn:kg:demo:mapping>
    {
        <#TriplesMapPerson> a rr:TriplesMap; rr:logicalTable [ rr:tableSchema "Demo" ; rr:tableOwner "kg" ; rr:tableName "physical" ]; 
        rr:subjectMap [ rr:termType rr:IRI  ; rr:template "http://demo.openlinksw.com/kg-demo#{hid}"; rr:class foaf:Person; rr:graph <http://demo.openlinksw.com/kg-demo#> ];
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:firstName ] ; rr:objectMap [ rr:column "first_name" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:lastName ] ; rr:objectMap [ rr:column "last_name" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant rdfs:label ] ; rr:objectMap [ rr:template "{first_name} {last_name}" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:mbox ] ; rr:objectMap [ rr:termType rr:IRI; rr:template "mailto:{email}" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant foaf:gender ] ; rr:objectMap [ rr:column "gender" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant DB:ip_address ] ; rr:objectMap [ rr:column "ip_address" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant ecrm:hasEmployer ] ; rr:objectMap [ rr:termType rr:IRI; rr:template "http://demo.openlinksw.com/kg-demo#{company}" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant ecrm:hasJobTitle ] ; rr:objectMap [ rr:column "job_title" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant schema:identifier ] ; rr:objectMap [ rr:column "hid" ]; ] .


        <#TriplesMapCompany> a rr:TriplesMap; rr:logicalTable [ rr:tableSchema "Demo" ; rr:tableOwner "kg" ; rr:tableName "physical" ]; 
        rr:subjectMap [ rr:termType rr:IRI  ; rr:template "http://demo.openlinksw.com/kg-demo#{company}"; rr:class ecrm:Company; rr:graph <http://demo.openlinksw.com/kg-demo#> ];
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant rdfs:label ] ; rr:objectMap [ rr:column "company" ]; ] ;
        rr:predicateObjectMap [ rr:predicateMap [ rr:constant ecrm:hasEmployee ] ; rr:objectMap [ rr:termtype rr:IRI; rr:template "http://demo.openlinksw.com/kg-demo#{hid}" ]; ] .
    };

(4) Create Virtual RDF View from an R2RML Mapping

EXEC ('SPARQL ' || DB.DBA.R2RML_MAKE_QM_FROM_G ('urn:kg:demo:mapping'));	

(5) Create Physical RDF View

RDF_VIEW_SYNC_TO_PHYSICAL ('http://demo.openlinksw.com/kg-demo#', 1, 'urn:kg:demo:data');

Tools for Knowledge Graph Developers and Data Engineers

Part 1b: Creating A Knowledge Graph from Federated SQL RDBMS Tables (Local Virtuoso, SQLServer, MySQL)

Virtual and Physical RDF Views can be generated using the “Linked Data Views” feature

Part 1c: Federated SPARQL Queries (SPARQL-FED)

List the Awards, past partners, and past/current spouses of Robert Downey Jr.

Part 2: Application Developers,Data Consumers and Data Scientists/Analysts

Virtuoso can be accessed by third party tools and consumers via the major open standard protocols for both sending and receiving data:

AWS Jupyter Notebook Example

In this example, we use an Open Source Jupyter Notebook from AWS to post-process data delivered by a combination of an HTTP connection to DBPedia for data consumption, (Virtuoso Instance) and Python for Data Visualization.

Neo4J as a consumer over JDBC

In this example, we use Neo4j’s Cypher to post-process data delivered by a JDBC connection to Virtuoso for Data Visualization.

Tableau as a data consumer and Tool (ODBC or JDBC)

In this example, we use Tableau to post-process data delivered by an ODBC or JDBC connection to Virtuoso for Data Visualization.

Example HTTP(s) consumer utilizing GraphQL

In this example, we demonstrate the use of GraphQL from an HTML-based Single Page Application (SPA) to operate on data retrieved from Virtuoso using SPARQL.

Additional Items

Virtuoso Sponger Middleware Example

(1) Rocky Road Recipe from BBC Good Food

(2) Sponged Page

(3) Sponged schema:Recipe data

(4) Rocky Road Recipe using /describe

Geospatial analytics

Additional Links