Simple Virtuoso SQL and Full Text Example

Virtuoso has supported Full Text Search via SQL or SPARQL for many years. Naturally, its use in SPARQL is much more common than its use in SQL.

Here is a simple sequence of commands that you can apply to any Virtuoso instance once you’ve copied the CSV files used in the example to your ~/virtuoso/vad folder.

-- Cleanup

DROP TABLE csv.demo.companies ;
DROP TABLE csv.demo.investments ;
DROP TABLE csv.demo.investors ;
DROP TABLE csv.demo.articles ;

DELETE FROM DB.DBA.csv_load_list ;

-- Register CSVs located in a directory relative to 
-- ~virtuoso/database or $VIRTUOSO/database 

CSV_REGISTER ('../vad', 'companies.csv') ;
CSV_REGISTER ('../vad', 'investments.csv') ;
CSV_REGISTER ('../vad', 'investors.csv') ;
CSV_REGISTER ('../vad', 'articles.csv') ;
-- Verify Registration 

SELECT * FROM  DB.DBA.csv_load_list ;
-- Load Data 

CSV_LOADER_RUN () ;
-- Create "vdb" USER

USER_CREATE('vdb','vdb') ;
-- Grant access to user "vdb" for live demo purposes

GRANT SELECT ON "csv"."demo"."companies" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."investments" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."investors" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."articles" TO "vdb" ;
-- Confirm Data Load

SELECT TOP 10 * FROM "csv"."demo"."companies" ;
SELECT TOP 10 * FROM "csv"."demo"."investments" ;
SELECT TOP 10 * FROM "csv"."demo"."investors" ;
SELECT TOP 10 * FROM "csv"."demo"."articles" ;
-- Full Text Indexing on "csv"."demo"."investors"

CREATE TEXT INDEX ON "csv"."demo"."investors" (overview) ;

SELECT COUNT (*) 
FROM "csv"."demo"."investors" 
WHERE CONTAINS (overview, '"venture capital"')  ;

SELECT blog_url 
FROM "csv"."demo"."investors" 
WHERE CONTAINS (overview, '"venture capital"') AND 
	  blog_url IS NOT NULL  ;

SELECT blog_url, blog_feed_url
FROM "csv"."demo"."investors"
WHERE CONTAINS (overview, '"venture capital"') AND
      blog_url AND blog_feed_url IS NOT NULL ;
-- Full Text Indexing on "csv"."demo"."articles

CREATE TEXT INDEX ON "csv"."demo"."articles" (snippet) ;

Test Queries

SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple"')  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple"')  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple" OR "Samsung"')  ;
SELECT snippet 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple"') AND 
  snippet IS NOT NULL  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung"') AND 
snippet IS NOT NULL  ;

Note: the queries that follow are also associated with live links. To view the documents identified by these links simply authenticate as vdb (for both username and password) when challenged.

-- List Post URL, Title, and Text Snippet for any post
-- where text contains both Apple and Samsung

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

-- List Post URL, Title, and Text Snippet for any post 
-- where text contains all of Apple, Samsung, and Cisco

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung" and "Cisco"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

-- List Post URL, Title, and Text Snippet for any post 
-- where text contains either Apple OR Samsung

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple" OR "Samsung"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

CSV Documents

Related