HowTo: Fine-tuning ChatGPT using SQL

Situation Analysis

The GPT family of Large Language Models (LLMs) released by OpenAI includes an ability to be fine-tuned using declarative query language templates. This makes it possible to combine the benefits of natural language processing with declarative querying of structured data, i.e., the best of both worlds.

Reaching the aforementioned nirvana requires the following capabilities:

  • An LLM that can generate clickable hyperlinks that denote query expressions and solution pages.
  • A DBMS platform with native HTTP support that’s capable of denoting query solutions and actual structured data (e.g., relational sets) using hyperlinks that function as super-keys.

In this post, I showcase how unique functionality of our Virtuoso platform enables SQL-based fine-tuning of the GPT LLM using the following sequence:

  1. Describe SPASQL, a SQL and SPARQL query language hybrid, to ChatGPT
  2. Create a sample SPARQL-based SPASQL query, using natural language, to verify LLM understanding
  3. Create a SQL-based fine-tuning template
  4. Create a sample SQL-based SPASQL query, using natural language, to verify successful fine-tuning

Transcript

Describing SPASQL

Q: Virtuoso SPASQL enables execution of SPARQL Queries from within Virtuoso’s SQL query processor. You can even create clickable links that denote queries, courtesy of the following link construction rules:

  1. Treat a SPARQL Query like a VIEW usable in the SQL FROM clause, placed in parentheses and associated with an alias
  2. Start each SPARQL query in the parenthesized FROM CLAUSE with the ‘SPARQL’ keyword
  3. SELECT list items are qualified using the pattern: alias.sparql-variable
  4. If a pure SQL query is requested, qualify objects using a database_qualifier.schema.database_object naming scheme where a double-quote is used as the identifier quote character
  5. Be it a SPARQL or SQL query, use the query text determined from a prompt as the value of the “sql” key in the JSON structure that follows:
    {
        "v": 1,
        "url": "/XMLA",
        "dsn": "DSN=Local_Instance",
        "uid": "demo",
        "pwd": "demo",
        "path": null,
        "tab": "exec",
        "idx": null,
        "fkey": null,
        "ref": null,
        "exec": {
            "sql": "SELECT ProductID, ProductName FROM Demo..Products WHERE CategoryID = 1 \n"
        }
    }
  6. Set the Virtuoso SPASQL Query Service endpoint to the default URL: SPASQL Query Builder (https://demo.openlinksw.com/spasqlqb/), or one provided in a prompt
  7. Create an encoded Virtuoso SPASQL Query Service link (URL) using the JSON structure from the prior step as the value for the link’s ?permlink_e parameter

Based on the above, generate a clickable Virtuoso SPASQL Query link associated with the text: SPASQL Query, using the following guidelines:

  1. SPARQL Query is a DBpedia SPARQL-FED Query comprising a single triple in the body, e.g., ?city a <http://dbpedia.org/ontology/City>, with LIMIT set to 20
  2. Using markdown, create a properly encoded Virtuoso SPASQL Query Service link using JSON structure as the value for the ?permlink_e parameter

A:
Your guidelines essentially outline the process to generate a URL-encoded SPASQL query for Virtuoso’s SPARQL endpoint. Given the specific SPARQL query provided (?city a <http://dbpedia.org/ontology/City> . LIMIT 20), we can use it in the JSON structure and then URL-encode the entire structure.

First, we embed the SPARQL query into the JSON structure:

{
    "v": 1,
    "url": "/XMLA",
    "dsn": "DSN=Local_Instance",
    "uid": "demo",
    "pwd": "demo",
    "path": null,
    "tab": "exec",
    "idx": null,
    "fkey": null,
    "ref": null,
    "exec": {
        "sql": "SPARQL SELECT ?city WHERE { ?city a <http://dbpedia.org/ontology/City> } LIMIT 20"
    }
}

Next, URL-encode this JSON structure to use it as a value for the ?permlink_e parameter:

Encoded JSON (truncated for readability):

%7B%22v%22%3A1%2C%22url%22%3A%22%2FXMLA%22%2C%22dsn%22%3A%22DSN%3DLocal_Instance%22%2C%22uid%22%3A%22demo%22%2C%22pwd%22%3A%22demo%22%2C%22path%22%3Anull%2C%22tab%22%3A%22exec%22%2C%22idx%22%3Anull%2C%22fkey%22%3Anull%2C%22ref%22%3Anull%2C%22exec%22%3A%7B%22sql%22%3A%22SPARQL%20SELECT%20%3Fcity%20WHERE%20%7B%20%3Fcity%20a%20%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FCity%3E%20%7D%20LIMIT%2020%22%7D%7D
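The encoding step can be reproduced outside the chat session. The following is a minimal sketch, assuming the JSON is serialized compactly (no spaces after separators) and that every reserved character is percent-encoded; the helper names `build_payload` and `encode_payload` are hypothetical and not part of any Virtuoso API.

```python
import json
from urllib.parse import quote


def build_payload(sql: str) -> dict:
    """Return the permalink JSON structure from the transcript with `sql` filled in."""
    return {
        "v": 1,
        "url": "/XMLA",
        "dsn": "DSN=Local_Instance",
        "uid": "demo",
        "pwd": "demo",
        "path": None,  # serialized as JSON null
        "tab": "exec",
        "idx": None,
        "fkey": None,
        "ref": None,
        "exec": {"sql": sql},
    }


def encode_payload(payload: dict) -> str:
    """Serialize compactly, then percent-encode everything except unreserved characters."""
    compact = json.dumps(payload, separators=(",", ":"))
    return quote(compact, safe="")


query = "SPARQL SELECT ?city WHERE { ?city a <http://dbpedia.org/ontology/City> } LIMIT 20"
encoded = encode_payload(build_payload(query))
print(encoded)
```

Passing `safe=""` matters here: by default `quote` leaves `/` unencoded, whereas the encoded string in the transcript uses `%2F`.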

Finally, you incorporate it into the Virtuoso SPASQL Query Service link:

<a href="https://demo.openlinksw.com/spasqlqb/?permlink_e=%7B%22v%22%3A1%2C%22url%22%3A%22%2FXMLA%22%2C%22dsn%22%3A%22DSN%3DLocal_Instance%22%2C%22uid%22%3A%22demo%22%2C%22pwd%22%3A%22demo%22%2C%22path%22%3Anull%2C%22tab%22%3A%22exec%22%2C%22idx%22%3Anull%2C%22fkey%22%3Anull%2C%22ref%22%3Anull%2C%22exec%22%3A%7B%22sql%22%3A%22SPARQL%20SELECT%20%3Fcity%20WHERE%20%7B%20%3Fcity%20a%20%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2FCity%3E%20%7D%20LIMIT%2020%22%7D%7D">SPASQL Query</a>

Now, the above HTML creates a clickable link with the text “SPASQL Query” that directs to the Virtuoso SPASQL Query Service endpoint, running the specified SPARQL query.

Please ensure to replace the demo credentials and connection string with your actual values before use.

Also, remember that URL encoding can vary across different platforms or programming languages, so the provided encoded URL is based on a common standard and may need slight adjustments depending on your specific implementation.

Moreover, remember that this query will only work if the Virtuoso server is configured to allow SPASQL queries and the user has appropriate permissions.
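Note that the reply above passes the query as a bare SPARQL statement, whereas rules 1 to 3 of the prompt describe a derived-table form. A minimal sketch of that form follows; the function name `wrap_sparql` and the alias `x` are hypothetical, and the exact syntax accepted by a given Virtuoso instance is an assumption to verify against its documentation.

```python
def wrap_sparql(sparql_body: str, alias: str, variables: list[str]) -> str:
    """Embed a SPARQL query in a SQL FROM clause, following link construction rules 1 to 3."""
    # Rule 3: qualify each SELECT list item as alias.sparql-variable.
    select_list = ", ".join(f"{alias}.{name}" for name in variables)
    # Rules 1 and 2: a parenthesized subquery that starts with the SPARQL
    # keyword and is associated with an alias.
    return f"SELECT {select_list} FROM (SPARQL {sparql_body}) AS {alias}"


sparql = "SELECT ?city WHERE { ?city a <http://dbpedia.org/ontology/City> } LIMIT 20"
spasql = wrap_sparql(sparql, "x", ["city"])
print(spasql)
```

The resulting text would then be used as the value of the "sql" key in the JSON structure.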

Q: Create the clickable link I requested. I don’t want to copy and paste anything.
Q: You can use markdown to produce the clickable link I requested.

A:
I apologize for the confusion earlier. Indeed, I can create a clickable link using Markdown. Here is the clickable link as requested:

SPASQL Query

Clicking on “SPASQL Query” will execute the provided SPARQL query on the Virtuoso SPASQL Query Service endpoint. Please remember to replace the demo credentials and connection string with your actual values before use.
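For readers who want to build the same Markdown link programmatically, here is a minimal sketch. It assumes the demo endpoint URL that appears earlier in the transcript; the function name `markdown_link` is hypothetical.

```python
import json
from urllib.parse import quote

# Assumed endpoint: the demo SPASQL Query Builder URL used in the transcript.
ENDPOINT = "https://demo.openlinksw.com/spasqlqb/"


def markdown_link(text: str, sql: str, endpoint: str = ENDPOINT) -> str:
    """Return a Markdown link whose ?permlink_e value carries the encoded JSON payload."""
    payload = {
        "v": 1, "url": "/XMLA", "dsn": "DSN=Local_Instance",
        "uid": "demo", "pwd": "demo", "path": None, "tab": "exec",
        "idx": None, "fkey": None, "ref": None,
        "exec": {"sql": sql},
    }
    # safe="" also encodes parentheses, so the URL cannot end the Markdown link early.
    encoded = quote(json.dumps(payload, separators=(",", ":")), safe="")
    return f"[{text}]({endpoint}?permlink_e={encoded})"


link = markdown_link(
    "SPASQL Query",
    "SPARQL SELECT ?city WHERE { ?city a <http://dbpedia.org/ontology/City> } LIMIT 20",
)
print(link)
```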

SPASQL based SPARQL Query & Query Solution Screenshot

SQL Fine-Tuning Template Description

Q: Create the following templates:
Find the first 10 Products from my <A> database using spasql query endpoint <Q>;SELECT TOP 10 ProductID, ProductName FROM Demo..Products ;

Find all Products from my <A> database using spasql query endpoint <Q>;SELECT ProductID, ProductName FROM Demo..Products ;

Find the first 10 Customers from my <A> database using spasql query endpoint <Q>;SELECT TOP 10 CustomerID, CompanyName, ContactName FROM Demo..Customers ;

Find all Customers from my <A> database using spasql query endpoint <Q>;SELECT CustomerID, CompanyName, ContactName FROM Demo..Customers ;

Find the first 10 Customer Orders from my <A> database using spasql query endpoint <Q>;SELECT TOP 10 c.CustomerID,c.CompanyName,o.OrderID,o.OrderDate FROM Demo..Customers c INNER JOIN Demo..Orders o ON c.CustomerID = o.CustomerID ;

Find the Customer Orders from my <A> database using spasql query endpoint <Q>;SELECT c.CustomerID,c.CompanyName,o.OrderID,o.OrderDate FROM Demo..Customers c INNER JOIN Demo..Orders o ON c.CustomerID = o.CustomerID ;

Find all Orders from my <A> database using spasql query endpoint <Q>;SELECT OrderID, OrderDate FROM Demo..Orders ;
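Each template above pairs a natural-language prompt pattern with a SQL query, separated by the first semicolon. The sketch below shows one way to parse that layout. It assumes `<A>` stands for the database name and `<Q>` for the endpoint URL, which the post does not state explicitly; the function names are hypothetical.

```python
# Two of the templates from the prompt, copied verbatim.
TEMPLATES = """\
Find the first 10 Products from my <A> database using spasql query endpoint <Q>;SELECT TOP 10 ProductID, ProductName FROM Demo..Products ;
Find all Customers from my <A> database using spasql query endpoint <Q>;SELECT CustomerID, CompanyName, ContactName FROM Demo..Customers ;
"""


def parse_templates(text: str) -> list[tuple[str, str]]:
    """Split each non-empty line at the first ';' into (prompt pattern, SQL)."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        prompt, sql = line.split(";", 1)
        # Drop the trailing statement terminator and surrounding whitespace.
        pairs.append((prompt.strip(), sql.strip().rstrip(";").strip()))
    return pairs


def fill(prompt: str, database: str, endpoint: str) -> str:
    """Substitute the <A> and <Q> placeholders (assumed meanings, see above)."""
    return prompt.replace("<A>", database).replace("<Q>", endpoint)


pairs = parse_templates(TEMPLATES)
for prompt, sql in pairs:
    print(fill(prompt, "Demo", "https://demo.openlinksw.com/spasqlqb/"), "->", sql)
```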

Q: Create a clickable Virtuoso SPASQL link, using Markdown (without a code-block), for the following: Find the first 10 products from my Demo database.

A:
Here is the clickable link using Markdown, with the query for the first 10 products from your Demo database:

[Find the first 10 Products](https://demo.openlinksw.com/spasqlqb/?permlink_e=%7B%22v%22%3A1%2C%22url%22%3A%22%2FXMLA%22%2C%22dsn%22%3A%22DSN%3DLocal_Instance%22%2C%22uid%22%3A%22demo%22%2C%22pwd%22%3A%22demo%22%2C%22path%22%3Anull%2C%22tab%22%3A%22exec%22%2C%22idx%22%3Anull%2C%22fkey%22%3Anull%2C%22ref%22%3Anull%2C%22exec%22%3A%7B%22sql%22%3A%22SELECT%20TOP%2010%20ProductID%2C%20ProductName%20FROM%20Demo..Products%22%7D%7D)

Please remember to replace the demo credentials and connection string with your actual values before use. The query within this link retrieves the top 10 products from the Products table in your Demo database.
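A quick way to check which query a generated link actually carries is to decode it. The following sketch reverses the encoding using only the Python standard library; the function name `sql_from_link` is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlsplit


def sql_from_link(url: str) -> str:
    """Extract the query text from the ?permlink_e parameter of a SPASQL link."""
    query_string = urlsplit(url).query
    # parse_qs percent-decodes the parameter value, yielding the JSON text.
    payload = json.loads(parse_qs(query_string)["permlink_e"][0])
    return payload["exec"]["sql"]


# The link returned in the reply above.
url = (
    "https://demo.openlinksw.com/spasqlqb/?permlink_e="
    "%7B%22v%22%3A1%2C%22url%22%3A%22%2FXMLA%22%2C%22dsn%22%3A%22DSN%3DLocal_Instance%22"
    "%2C%22uid%22%3A%22demo%22%2C%22pwd%22%3A%22demo%22%2C%22path%22%3Anull"
    "%2C%22tab%22%3A%22exec%22%2C%22idx%22%3Anull%2C%22fkey%22%3Anull%2C%22ref%22%3Anull"
    "%2C%22exec%22%3A%7B%22sql%22%3A%22SELECT%20TOP%2010%20ProductID%2C%20ProductName"
    "%20FROM%20Demo..Products%22%7D%7D"
)
decoded_sql = sql_from_link(url)
print(decoded_sql)
```

The decoded text matches the SQL half of the first template, which is the evidence that the template was applied.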

SPASQL based SQL Query & Query Solution Screenshot

Q: Redo, but without a code-block.

A:
Apologies for the confusion earlier. Here is the clickable link without using a code block:

Find the first 10 Products

Clicking on “Find the first 10 Products” will execute the provided SPARQL query on the Virtuoso SPASQL Query Service endpoint.
