Named Entity Extraction and Virtuoso Meta Cartridges

Virtuoso’s Sponger Middleware layer comprises Extractor and Meta Cartridges for driving its Extract, Transform, and Load (ETL) functionality.

What is an Extractor Cartridge?

An Extractor Cartridge extracts content from a given data source, via an associated data access API, and then transforms the extracted content into an entity relationship graph represented as a collection of RDF sentences/statements. For example, this is how the content of an HTML document is transformed into a collection of RDF sentences/statements that describe that document.
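To make the extract-then-transform idea concrete, here is a minimal Python sketch. It is not how Virtuoso implements its cartridges; it only illustrates the general shape of the step. The class and function names (TitleAndDescriptionParser, extract_triples) are hypothetical, and the choice of dcterms:title and dcterms:description as output predicates is an assumption made for illustration.

```python
from html.parser import HTMLParser

DCTERMS = "http://purl.org/dc/terms/"


class TitleAndDescriptionParser(HTMLParser):
    """Collects the <title> text and the <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Attribute values may be None, so guard before lowercasing.
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_triples(doc_url, html):
    """Return (subject, predicate, object) statements describing the document."""
    parser = TitleAndDescriptionParser()
    parser.feed(html)
    triples = []
    title = parser.title.strip()
    if title:
        triples.append((doc_url, DCTERMS + "title", title))
    if parser.description:
        triples.append((doc_url, DCTERMS + "description", parser.description))
    return triples


if __name__ == "__main__":
    sample = (
        "<html><head><title>Sample page</title>"
        '<meta name="description" content="A short sample description.">'
        "</head><body></body></html>"
    )
    for triple in extract_triples("https://example.org/sample", sample):
        print(triple)
```

Each tuple stands for one RDF sentence/statement, with the document URL as the subject.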

What is a Meta Cartridge?

A Meta Cartridge is a secondary Cartridge, invoked once all Extractor Cartridge activities have completed, that performs additional processing (e.g., entity extraction and LOD Cloud lookups). For instance, following transformation of HTML document content into a basic RDF Graph, it passes the objects of annotation relations (e.g., rdfs:comment, dcterms:description, schema:text, schema:description, etc.) from the emerging graph on to additional named entity extraction services such as DBpedia Spotlight, BabelNet, Dandelion, and Google Knowledge Graph.
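The post-processing step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Virtuoso's implementation: the function names are hypothetical, the external service is replaced by a local stand-in (toy_extractor) so that no real service API is implied, and schema:mentions is an assumed predicate for linking the document to extracted entities.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCTERMS = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"

# The annotation relations named in the text, as full IRIs.
ANNOTATION_PREDICATES = {
    RDFS + "comment",
    DCTERMS + "description",
    SCHEMA + "text",
    SCHEMA + "description",
}


def annotation_texts(triples):
    """Select (subject, text) pairs whose predicate is an annotation relation."""
    return [(s, o) for (s, p, o) in triples if p in ANNOTATION_PREDICATES]


def run_meta_cartridge(triples, extract_entities):
    """Pass annotation text to an entity extractor and add the results to the graph.

    extract_entities is any callable that maps text to a list of entity IRIs;
    in a real deployment it would wrap a remote entity extraction service.
    """
    enriched = list(triples)
    for subject, text in annotation_texts(triples):
        for entity_iri in extract_entities(text):
            # Assumed predicate: schema:mentions.
            enriched.append((subject, SCHEMA + "mentions", entity_iri))
    return enriched


def toy_extractor(text):
    """Hypothetical stand-in for a remote service: a one-entry dictionary lookup."""
    known = {"coronavirus": "http://dbpedia.org/resource/Coronavirus"}
    lowered = text.lower()
    return [iri for term, iri in known.items() if term in lowered]


if __name__ == "__main__":
    doc = "https://example.org/sample"
    graph = [(doc, DCTERMS + "description", "Notes on a coronavirus genome.")]
    for triple in run_meta_cartridge(graph, toy_extractor):
        print(triple)
```

Passing the extractor in as a callable keeps the selection of annotation text separate from whichever service performs the recognition, which mirrors the way several services can sit behind the same Meta Cartridge step.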

Meta Cartridge Example

Given an HTML document titled “Coronavirus Genome Annotation Reveals Amino Acid Differences With Other SARS Viruses” that’s identified by the URL https://www.genomeweb.com/sequencing/coronavirus-genome-annotation-reveals-amino-acid-differences-other-sars-viruses, here’s how the entity extraction services provided by the BabelNet Entity Extraction Meta Cartridge generate a Named Entity Recognition (NER) Graph, using terms from the Natural Language Interchange Format (NLIF) Ontology.

HTML Document before Named Entity Recognition and LOD Cloud Lookup operations by the Meta Cartridge.

HTML Document after Named Entity Recognition and LOD Cloud Lookup operations by the Meta Cartridge, courtesy of the OpenLink Structured Data Sniffer (OSDS) Browser Extension’s injection of Super Keys (generated by the Meta Cartridge).

Named Entity Reconciliation-oriented Entity Relationship Graph generated by the BabelNet Meta Cartridge

Sponger /about Service Pages

Here is a sequence of links reflecting directed-graph navigation offered by Entity Relationship Types (Relations) that connect various entities in the graph segment associated with the BabelNet Meta Cartridge.

  1. Graph that associates Named Entities with the source HTML doc

  2. Annotation

  3. Entity Extraction

  4. About Extracted Entity

  5. String Selection

  6. Context

Sponger /describe Service Pages

Here is another sequence of links reflecting directed-graph navigation offered by Entity Relationship Types (Relations) that connect various entities in the graph segment associated with the BabelNet Meta Cartridge.

  1. Graph that associates Named Entities with the source HTML doc

  2. Annotation

  3. Entity Extraction

  4. About Extracted Entity

  5. String Selection

  6. Context
