Simple Virtuoso SQL and Full Text Example

howto
virtuosordbms
sql
fulltext-search
#1

Virtuoso has supported Full Text Search via SQL or SPARQL for many years. Naturally, its use in SPARQL is much more common than its use in SQL.

Here is a simple sequence of commands that you can apply to any Virtuoso instance once you’ve copied the CSV files used in the example to your ~/virtuoso/vad folder.

-- Cleanup

DROP TABLE csv.demo.companies ;
DROP TABLE csv.demo.investments ;
DROP TABLE csv.demo.investors ;
DROP TABLE csv.demo.articles ;

DELETE FROM DB.DBA.csv_load_list ;

-- Register CSVs located in a directory relative to 
-- ~virtuoso/database or $VIRTUOSO/database 

CSV_REGISTER ('../vad', 'companies.csv') ;
CSV_REGISTER ('../vad', 'investments.csv') ;
CSV_REGISTER ('../vad', 'investors.csv') ;
CSV_REGISTER ('../vad', 'articles.csv') ;
-- Verify Registration 

SELECT * FROM  DB.DBA.csv_load_list ;
-- Load Data 

CSV_LOADER_RUN () ;
-- Create "vdb" USER

USER_CREATE('vdb','vdb') ;
-- Grant access to user "vdb" for live demo purposes

GRANT SELECT ON "csv"."demo"."companies" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."investments" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."investors" TO "vdb" ;
GRANT SELECT ON "csv"."demo"."articles" TO "vdb" ;
-- Confirm Data Load

SELECT TOP 10 * FROM "csv"."demo"."companies" ;
SELECT TOP 10 * FROM "csv"."demo"."investments" ;
SELECT TOP 10 * FROM "csv"."demo"."investors" ;
SELECT TOP 10 * FROM "csv"."demo"."articles" ;
-- Full Text Indexing on "csv"."demo"."investors"

CREATE TEXT INDEX ON "csv"."demo"."investors" (overview) ;

SELECT COUNT (*) 
FROM "csv"."demo"."investors" 
WHERE CONTAINS (overview, '"venture capital"')  ;

SELECT blog_url 
FROM "csv"."demo"."investors" 
WHERE CONTAINS (overview, '"venture capital"') AND 
	  blog_url IS NOT NULL  ;

SELECT blog_url, blog_feed_url
FROM "csv"."demo"."investors"
WHERE CONTAINS (overview, '"venture capital"') AND
      blog_url AND blog_feed_url IS NOT NULL ;
-- Full Text Indexing on "csv"."demo"."articles

CREATE TEXT INDEX ON "csv"."demo"."articles" (snippet) ;

Test Queries

SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple"')  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple"')  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple" OR "Samsung"')  ;
SELECT snippet 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple"') AND 
  snippet IS NOT NULL  ;
SELECT COUNT (*) 
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung"') AND 
snippet IS NOT NULL  ;

Note: the queries that follow are also associated with live links. To view the documents identified by these links simply authenticate as vdb (for both username and password) when challenged.

-- List Post URL, Title, and Text Snippet for any post
-- where text contains both Apple and Samsung

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

-- List Post URL, Title, and Text Snippet for any post 
-- where text contains all of Apple, Samsung, and Cisco

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (snippet, '"Apple" and "Samsung" and "Cisco"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

-- List Post URL, Title, and Text Snippet for any post 
-- where text contains either Apple OR Samsung

SELECT url, title, snippet
FROM "csv"."demo"."articles" 
WHERE CONTAINS (title, '"Apple" OR "Samsung"') AND 
snippet IS NOT NULL  ;

Live Query Results Link.

Query Results Screenshot

CSV Documents

Related

0 Likes

Simple Virtuoso SPARQL and RDF Linked Data Views Full Text Example