Harmonizing Disparate RDBMS Data using Virtuoso's built-in Reasoning and Inference functionality

Given distinct demonstration database schemas associated with several Relational Database Management Systems, including Oracle, SQL Server, MySQL, PostgreSQL, Informix, Virtuoso, and some CSV documents, here is how I use Reasoning and Inference, that’s informed by an Ontology, to harmonize real-world entities (e.g., People and Organizations), across the Relations associated with the aforementioned Data Sources.

  1. Attach Virtuoso to each of the Data Sources via their respective existing ODBC Data Source Names.

  2. Generate RDF Views using Virtuoso’s RDBMS to RDF Views Wizard.

  3. Describe how Entity Relationship Types (Relations) from the various data sources are to be aligned with various Entity Types (Classes) using a specific Database Document (e.g., a Named Graph identified by the IRI <urn:rdbms:inference:reasoning:demo:1>).

    SPARQL
    INSERT
       {
          GRAPH <urn:rdbms:inference:reasoning:demo:1>
             {
                ### Organization Class Instances (i.e. Organizations) ###
    
                ## Class derived from "mssql"."northwind"."customers" Table
                <http://demo.openlinksw.com/schemas/SQLServer/Customers>
                   rdfs:subClassOf   foaf:Organization   .
    
                ## Class derived from "csv"."northwind"."customers" Table
                <http://demo.openlinksw.com/schemas/csv/customers> 
                   rdfs:subClassOf   foaf:Organization   .
    
                ## Class derived from "postgres12"."northwind"."suppliers" Table
                <http://demo.openlinksw.com/schemas/postgres12/suppliers>  
                   rdfs:subClassOf   foaf:Organization   .
    
                ## Class derived from "mysql5"."northwind"."suppliers" Table
                <http://demo.openlinksw.com/schemas/mysql5/suppliers>  
                   rdfs:subClassOf   foaf:Organization   .
    
                ### Person Class Instances (i.e. People) ###
    
                ## Class derived from "hr"."ora18"."EMPLOYEES" Table
                <http://demo.openlinksw.com/schemas/OracleHR/EMPLOYEES>
                   rdfs:subClassOf   foaf:Person   .
    
               ## Class derived from "informix"."stores"."customer" Table
               <http://demo.openlinksw.com/schemas/informix/customer>
                   rdfs:subClassOf   foaf:Person   .
    
               ## Class derived from Virtuoso "School"."DBA"."Person" Table
               <http://demo.openlinksw.com/schemas/School/Person>
                  rdfs:subClassOf   foaf:Person   .
             }
       } ;
    
  4. Add terms from the RDFS Schema Vocabulary to the same Named Graph which ultimately enables Virtuoso to understand the nature of Relationship Types such as rdfs:subClassOf.

    SPARQL 
    ADD <http://www.w3.org/2000/01/rdf-schema#> 
    TO <urn:rdbms:inference:reasoning:demo:1> ;
    
  5. Also include Relationships that are missing from RDFS Schema, e.g., indicating that rdfs:subClassOf is Transitive in nature, i.e., the relationship between classes and sub-classes in a class hierarchy are transferable from higher to lower levels of the hierarchy (like inheritance).

    SPARQL
    
    INSERT { GRAPH <urn:rdbms:inference:reasoning:demo:1>
               {  rdfs:subClassOf   a   owl:TransitiveProperty   }
    } ;
    
  6. Use the Entity Relationship Descriptions in the Named Graph <urn:rdbms:inference:reasoning:demo:1> as the basis for a built-in Virtuoso Inference Rule to be invoked when Reasoning and Inference is required.

    RDFS_RULE_SET
      ( 'sql-vdb-data-lake' ,
        'urn:rdbms:inference:reasoning:demo:1'
      ) ;
    
  7. Verify that the Built-In Inference Rule has been successfully generated.

    SELECT * 
    FROM sys_rdf_schema 
    WHERE RS_NAME = 'sql-vdb-data-lake' ;
    

Data Virtualization using Reasoning and Inference Verification

Entity Types

The following links identify faceted browsing pages comprising the description of Entity Types derived from Relations represented as N-Tuples (Tables).

Organizations
People

SPARQL Queries

Each of the following test queries conditionally applies reasoning and inference based on the built-in inference rules created earlier in this post. In all cases, there is a query that operates with or without the the effects of a built-in inference rule that’s informed by an associated Ontology.

Test 1 with Inference Enabled

DEFINE input:inference 'sql-vdb-data-lake' 

SELECT  (COUNT(*) AS ?count)  
FROM <urn:demo.openlinksw.com:OracleHR> 
FROM <urn:demo.openlinksw.com:informix> 
FROM <urn:demo.openlinksw.com:School> 
FROM <urn:demo.openlinksw.com:SQLServer> 
FROM <urn:demo.openlinksw.com:csv> 
FROM <urn:demo.openlinksw.com:postgres12>
FROM <urn:demo.openlinksw.com:mysql5> 
WHERE {
        ?s a foaf:Person .
      }  

Test 1 with Inference Disabled

# DEFINE input:inference 'sql-vdb-data-lake' 

SELECT  (COUNT(*) AS ?count)  
FROM <urn:demo.openlinksw.com:OracleHR> 
FROM <urn:demo.openlinksw.com:informix> 
FROM <urn:demo.openlinksw.com:School> 
FROM <urn:demo.openlinksw.com:SQLServer> 
FROM <urn:demo.openlinksw.com:csv> 
WHERE {
        ?s a foaf:Person .
      }  

Test 2 with Inference Enabled

DEFINE input:inference 'sql-vdb-data-lake' 

SELECT  (COUNT(*) AS ?count)  
FROM <urn:demo.openlinksw.com:OracleHR> 
FROM <urn:demo.openlinksw.com:informix> 
FROM <urn:demo.openlinksw.com:SQLServer> 
FROM <urn:demo.openlinksw.com:School> 
FROM <urn:demo.openlinksw.com:csv> 
FROM <urn:demo.openlinksw.com:postgres12>
FROM <urn:demo.openlinksw.com:mysql5> 
WHERE {
        ?s a foaf:Organization .
      }  

Test 2 with Inference Disabled

# DEFINE input:inference 'sql-vdb-data-lake' 

SELECT  (COUNT(*) AS ?count)  
FROM <urn:demo.openlinksw.com:OracleHR> 
FROM <urn:demo.openlinksw.com:informix> 
FROM <urn:demo.openlinksw.com:SQLServer> 
FROM  <urn:demo.openlinksw.com:School>  
FROM <urn:demo.openlinksw.com:csv> 
FROM <urn:demo.openlinksw.com:postgres12>
FROM <urn:demo.openlinksw.com:mysql5> 
WHERE {
        ?s a foaf:Organization .
      }  

Test 3 with Inference Enabled

DEFINE input:inference 'sql-vdb-data-lake' 

SELECT DISTINCT ?s as ?organization
FROM NAMED <urn:demo.openlinksw.com:OracleHR> 
FROM NAMED <urn:demo.openlinksw.com:informix> 
FROM NAMED <urn:demo.openlinksw.com:SQLServer> 
FROM NAMED <urn:demo.openlinksw.com:School> 
FROM NAMED <urn:demo.openlinksw.com:postgres12> 
FROM NAMED <urn:demo.openlinksw.com:mysql5> 
WHERE {
        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:OracleHR>  {?s a foaf:Organization . } } LIMIT 1
         } 

		 UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:informix> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:SQLServer> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:School> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:mysql5> {?s a foaf:Organization . } } LIMIT 1
        } 

        UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:postgres12> {?s a foaf:Organization . } } LIMIT 1
        } 
      }  

Test 3 with Inference Disabled

# DEFINE input:inference 'sql-vdb-data-lake' 

SELECT DISTINCT ?s as ?organization
FROM NAMED <urn:demo.openlinksw.com:OracleHR> 
FROM NAMED <urn:demo.openlinksw.com:informix> 
FROM NAMED <urn:demo.openlinksw.com:SQLServer> 
FROM NAMED <urn:demo.openlinksw.com:School> 
FROM NAMED <urn:demo.openlinksw.com:postgres12> 
FROM NAMED <urn:demo.openlinksw.com:mysql5> 
WHERE {
        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:OracleHR>  {?s a foaf:Organization . } } LIMIT 1
         } 

		 UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:informix> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:SQLServer> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:School> {?s a foaf:Organization . } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:mysql5> {?s a foaf:Organization . } } LIMIT 1
        } 

        UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:postgres12> {?s a foaf:Organization . } } LIMIT 1
        } 
      } 

Test 4 with Inference Enabled

DEFINE input:inference 'sql-vdb-data-lake' 

SELECT DISTINCT ?s 
FROM NAMED <urn:demo.openlinksw.com:OracleHR> 
FROM NAMED <urn:demo.openlinksw.com:informix> 
FROM NAMED <urn:demo.openlinksw.com:SQLServer> 
FROM NAMED <urn:demo.openlinksw.com:School> 
FROM NAMED <urn:demo.openlinksw.com:csv> 
WHERE {
        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:OracleHR> {?s a foaf:Person .  } } LIMIT 1
         } 

		 UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:informix> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:SQLServer> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:School> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:csv> {?s a foaf:Person .  } } LIMIT 1
        } 
      }  

Test 4 Inference Disabled

# DEFINE input:inference 'sql-vdb-data-lake' 

SELECT DISTINCT ?s 
FROM NAMED <urn:demo.openlinksw.com:OracleHR> 
FROM NAMED <urn:demo.openlinksw.com:informix> 
FROM NAMED <urn:demo.openlinksw.com:SQLServer> 
FROM NAMED <urn:demo.openlinksw.com:School> 
FROM NAMED <urn:demo.openlinksw.com:csv> 
WHERE {
        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:OracleHR> {?s a foaf:Person .  } } LIMIT 1
         } 

		 UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:informix> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:SQLServer> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:School> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:prestoDB> {?s a foaf:Person .  } } LIMIT 1
        } 

		UNION

        { 
         SELECT DISTINCT ?s { GRAPH <urn:demo.openlinksw.com:csv> {?s a foaf:Person .  } } LIMIT 1
        } 
      }  

Related

1 Like