SPARQL Query Examples -- DBpedia Exploration

A Practical Guide to Understanding Ontology Structure and Data Patterns in the Wild

DBpedia remains one of the richest publicly available Knowledge Graphs derived from Wikipedia content. Its structure gives a unique window into the shape of real-world data on the Web: entity types, properties, hierarchies, and semantic relationships.

This article explores DBpedia using a sequence of SPARQL queries, each designed to highlight a specific pattern or semantic capability. Every query includes:

  • A clickable link to run it directly against DBpedia.
  • An explanation of what the query uncovers and why it matters.
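
For readers who prefer to build such a link themselves, the following Python sketch shows one way to do it. It assumes the public endpoint at https://dbpedia.org/sparql; `query` is the standard SPARQL Protocol parameter, while the `format` parameter and the helper name `run_it_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

def run_it_url(query: str, result_format: str = "text/html") -> str:
    """Return a GET URL that submits `query` to the endpoint."""
    # `query` is the standard SPARQL Protocol parameter; `format` is an
    # assumption about what this particular endpoint accepts.
    params = {"query": query, "format": result_format}
    return f"{ENDPOINT}?{urlencode(params)}"

example_query = "SELECT ?entityType WHERE { ?entity a ?entityType } LIMIT 10"
url = run_it_url(example_query)
print(url)

# Decoding the URL again should give back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(decoded == example_query)
```

Percent-encoding the query this way keeps characters such as `?`, `{`, and `#` from being misread as part of the URL itself.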

1. Entity Types and Representative Instances

This query lists classes (types) in DBpedia, shows a sample instance for each, and counts how many entities belong to that type.

Query

SELECT ?entityType (SAMPLE(?entity) AS ?sampleEntity) (COUNT(*) AS ?count)
WHERE {
  ?entity a ?entityType .
}
GROUP BY ?entityType
ORDER BY DESC(?count)

Run it

Type-Counts-and-Samples

Why It’s Useful

This is a quick way to see which types DBpedia actually contains and how many instances each one has.

It highlights:

  • Dominant classes (e.g., “Person”, “Place”, “Work”)
  • Niche or sparsely populated types
  • Unexpected type proliferation due to Wikipedia infobox diversity

It also provides a compact sanity check before doing deeper ontology or property exploration.
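
To make the grouping step concrete, here is a small Python sketch of what GROUP BY, SAMPLE, and COUNT do together. The entities and types are made up, and taking the first member of each group stands in for SAMPLE, which is free to return any member.

```python
from collections import defaultdict

# Hypothetical (entity, type) pairs standing in for "?entity a ?entityType".
typed_entities = [
    ("ex:Ada_Lovelace", "ex:Person"),
    ("ex:Alan_Turing", "ex:Person"),
    ("ex:Paris", "ex:Place"),
    ("ex:Alan_Turing", "ex:Scientist"),
    ("ex:Berlin", "ex:Place"),
    ("ex:Grace_Hopper", "ex:Person"),
]

def type_counts(pairs):
    """Group by type; return (type, sample entity, count), largest first."""
    groups = defaultdict(list)
    for entity, entity_type in pairs:
        groups[entity_type].append(entity)
    # The first member seen stands in for SAMPLE (any member would do).
    rows = [(t, members[0], len(members)) for t, members in groups.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

for row in type_counts(typed_entities):
    print(row)
```

Note that an entity with several types (ex:Alan_Turing here) is counted once per type, just as it is in the query.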


2. SubProperty/SuperProperty Exploration (Random Representative Start Points)

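This query ranks super-properties by the number of typed sub-properties attached to them directly, keeps the top five, and picks one representative sub-property for each with SAMPLE. The outer pattern then tests each pair against the rdfs:subPropertyOf* path. Because ?subProperty and ?superProperty are both already bound by the subquery, that path acts as a check rather than an expansion; binding a fresh variable on the left-hand side would list the full hierarchy instead.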

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT *
WHERE {
  {
    SELECT ?superProperty ?subProperty
    WHERE {
      {
        SELECT (?c AS ?superProperty) (SAMPLE(?a) AS ?subProperty) (COUNT(*) AS ?usageCount)
        WHERE {
          ?a rdfs:subPropertyOf ?c .
          ?a a ?type .
          FILTER (?type IN (owl:ObjectProperty, rdf:Property))
        }
        GROUP BY ?c
        ORDER BY DESC(?usageCount)
        LIMIT 5
      }
    }
  }
  ?subProperty rdfs:subPropertyOf* ?superProperty .
}
LIMIT 100

Run it

Random-Subproperty-Transitive-Closure

Why It’s Useful

This demonstrates:

  • How DBpedia’s property hierarchy is structured
  • Which super-properties have the most direct sub-properties
  • How the * path operator (zero or more rdfs:subPropertyOf steps) reveals inherited meaning

This is particularly useful when mapping DBpedia’s ontology to external ontologies or evaluating property alignment for integration tasks.
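
The behaviour of the * operator can be sketched in Python on a toy hierarchy. The property names below are invented, and the two helper functions are illustrative only: one expands from a start node, the other shows what happens when both ends of the path are already bound, as they are after the subquery in the query above.

```python
# Hypothetical rdfs:subPropertyOf edges, written as child -> parents.
SUB_PROPERTY_OF = {
    "ex:mother": ["ex:parent"],
    "ex:father": ["ex:parent"],
    "ex:parent": ["ex:relative"],
}

def zero_or_more(start):
    """Everything reachable from `start` via subPropertyOf*, start included."""
    seen = {start}          # zero steps: the start node matches itself
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for parent in SUB_PROPERTY_OF.get(node, []):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen

def path_holds(sub, sup):
    """Both ends bound: the path pattern only answers yes or no."""
    return sup in zero_or_more(sub)

print(sorted(zero_or_more("ex:mother")))
print(path_holds("ex:mother", "ex:relative"))
print(path_holds("ex:relative", "ex:mother"))
```

With one end left unbound the path enumerates the hierarchy; with both ends bound it only confirms that a connection exists.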


3. SubProperties Using the + Property Path Operator (Strictly Descendant Only)

This query selects the super-property with the most direct sub-properties and retrieves all of its sub-properties at any depth, but only those reachable via at least one rdfs:subPropertyOf relationship.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  ?a rdfs:subPropertyOf+ ?c .
}
LIMIT 500

Run it

Subproperty-Strict-Descendants

Why It’s Useful

The + operator requires at least one rdfs:subPropertyOf step, so only proper descendants are returned and not the property itself (unless the hierarchy contains a cycle).

This is ideal for:

  • Auditing ontology depth
  • Creating visual property hierarchies
  • Identifying redundant or overly specific properties
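
As a rough illustration of "one or more steps", the Python sketch below walks a made-up hierarchy downward from a root. The names are hypothetical; the second call shows the one case in which a root can appear among its own descendants, namely a cycle.

```python
# Hypothetical rdfs:subPropertyOf edges, written as parent -> direct children.
CHILDREN = {
    "ex:relative": ["ex:parent", "ex:sibling"],
    "ex:parent": ["ex:mother", "ex:father"],
}

def one_or_more(root, children=CHILDREN):
    """Proper descendants of `root`: at least one subPropertyOf step away."""
    found = set()
    frontier = list(children.get(root, []))   # start one step below root
    while frontier:
        node = frontier.pop()
        if node not in found:
            found.add(node)
            frontier.extend(children.get(node, []))
    return found

descendants = one_or_more("ex:relative")
print(sorted(descendants))

# With a cycle, the root becomes reachable from itself and is returned too.
cyclic = {"ex:a": ["ex:b"], "ex:b": ["ex:a"]}
print(sorted(one_or_more("ex:a", cyclic)))
```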

4. SubProperties Using the {2} Property Path Operator

This query focuses on properties that sit exactly two rdfs:subPropertyOf steps below the super-property with the most direct sub-properties.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 1
  }
  ?a rdfs:subPropertyOf{2} ?c .
}
LIMIT 500

Run it

Two-Hop-Subproperties

Why It’s Useful

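The {2} quantifier gives you a controlled look at mid-depth ontology structure. It is not part of the SPARQL 1.1 Recommendation, so engines other than the one behind DBpedia may reject it; the portable spelling is rdfs:subPropertyOf/rdfs:subPropertyOf.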

This is helpful for:

  • Ontology debugging
  • Identifying second-order refinements of major properties
  • Extracting property layers for tools that require bounded depth
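
A bounded path of exactly n steps can be pictured as peeling the hierarchy one layer at a time. The Python sketch below does this on an invented hierarchy; the function name and the property names are assumptions for illustration.

```python
# Hypothetical rdfs:subPropertyOf edges, written as parent -> direct children.
CHILDREN = {
    "ex:relative": ["ex:parent", "ex:sibling"],
    "ex:parent": ["ex:mother", "ex:father"],
    "ex:mother": ["ex:birthMother"],
}

def exactly_n_hops(root, n):
    """Properties reachable from `root` in exactly `n` subPropertyOf steps."""
    level = {root}
    for _ in range(n):
        # Replace the current layer with the layer directly beneath it.
        level = {child for node in level for child in CHILDREN.get(node, [])}
    return level

print(sorted(exactly_n_hops("ex:relative", 2)))
```

Direct children (one step) and deeper descendants (three or more steps) are both excluded, which is what distinguishes a fixed-length path from + or *.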

5. Two-Hop SubProperty Exploration for the Top 10 Super-Properties

This extends the previous pattern from a single super-property to the ten with the most direct sub-properties, so several property families can be explored in one query.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT (?c AS ?superProperty) (?a AS ?subProperty)
WHERE {
  {
    SELECT ?c
    WHERE {
      ?a rdfs:subPropertyOf ?c .
      ?a a ?type .
      FILTER (?type IN (owl:ObjectProperty, rdf:Property))
    }
    GROUP BY ?c
    ORDER BY DESC(COUNT(?a))
    LIMIT 10
  }
  ?a rdfs:subPropertyOf{2} ?c .
}
LIMIT 100

Run it

Two-Hop-Subproperties-Top10

Why It’s Useful

Running the two-hop pattern across several super-properties at once makes it easy to compare property families side by side.

You can quickly spot:

  • Consistent modeling patterns
  • Inconsistencies across similar property families
  • Opportunities for ontology normalization
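
The two stages of this query (rank the super-properties, then look two steps below each) can be sketched in Python as follows. The edges are made up, and a limit of 2 is used instead of 10 to keep the toy data small.

```python
from collections import Counter

# Hypothetical (subProperty, superProperty) pairs.
EDGES = [
    ("ex:parent", "ex:relative"), ("ex:sibling", "ex:relative"),
    ("ex:spouse", "ex:relative"),
    ("ex:mother", "ex:parent"), ("ex:father", "ex:parent"),
    ("ex:capital", "ex:location"),
]

def top_super_properties(edges, limit):
    """Rank super-properties by their number of direct sub-properties."""
    counts = Counter(sup for _sub, sup in edges)
    return [sup for sup, _count in counts.most_common(limit)]

def two_hop(edges, root):
    """Sub-properties exactly two steps below `root`."""
    direct = {sub for sub, sup in edges if sup == root}
    return {sub for sub, sup in edges if sup in direct}

for sup in top_super_properties(EDGES, 2):
    print(sup, sorted(two_hop(EDGES, sup)))
```

A highly ranked super-property may still have nothing two steps below it, as ex:parent shows here; empty results of that kind are themselves a hint about how evenly a hierarchy is modeled.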

6. Property Usage and Dominance

This query counts the usage of every property (predicate) in the entire knowledge graph. It is the most direct way to discover which relationships form the backbone of DBpedia.

Query

SELECT ?p (COUNT(*) AS ?usageCount)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?usageCount)

Run it

Property-Usage-Counts

Why It’s Useful

This query provides a high-level statistical overview of the graph’s structure. It answers the question: “What are the most common facts stored in DBpedia?”

It helps you immediately identify:

  • Core RDF/RDFS properties: rdf:type, rdfs:label, rdfs:comment.
  • Dominant linking properties: dbo:wikiPageWikiLink, dct:subject.
  • Metadata vs. factual properties: Distinguishing between properties about an entity (like prov:wasDerivedFrom) and properties stating a fact about the entity (like dbo:birthPlace).
  • The most promising properties to explore in more detail with subsequent queries.
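
One simple way to separate the core RDF/RDFS vocabulary from DBpedia's own properties is to roll the counts up by namespace. The Python sketch below assumes result rows of the form (property IRI, usage count); the counts are invented and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical (property IRI, usage count) rows; the counts are made up.
rows = [
    ("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 900),
    ("http://dbpedia.org/ontology/wikiPageWikiLink", 700),
    ("http://www.w3.org/2000/01/rdf-schema#label", 300),
    ("http://dbpedia.org/ontology/birthPlace", 40),
]

def namespace_of(iri):
    """Split at the last '#' or '/' to get the namespace part of an IRI."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: cut + 1]

def usage_by_namespace(rows):
    """Sum usage counts per namespace, largest first."""
    totals = Counter()
    for iri, count in rows:
        totals[namespace_of(iri)] += count
    return totals.most_common()

for namespace, total in usage_by_namespace(rows):
    print(namespace, total)
```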

7. Top-5 Property Hierarchies by Usage and Transitive Closure

This advanced query combines statistical analysis with ontology traversal. It first identifies the five most-used properties in the entire graph, calculates the usage count and percentage of all triples for each, and then finds their sub-properties at any depth. Because the * operator also matches zero steps, each starting property appears in its own result set.

Query

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?startProperty ?usageCount ?usagePercent ?subProperty ?superProperty
WHERE {

  #####################################################################
  # 1. Determine the 5 most-used properties (global property ranking)
  #####################################################################

  {
    SELECT ?startProperty ?usageCount ?usagePercent
    WHERE {
      # Compute usage count per property
      {
        SELECT ?p (COUNT(*) AS ?usageCount)
        WHERE { ?s ?p ?o }
        GROUP BY ?p
      }

      # Compute the total number of triples (the denominator for the percentage)
      {
        SELECT (COUNT(*) AS ?totalCount)
        WHERE { ?s ?p ?o }
      }

      BIND(?p AS ?startProperty)
      # 100.0 keeps the arithmetic in decimals even if the engine truncates integer division
      BIND((100.0 * ?usageCount / ?totalCount) AS ?usagePercent)
    }
    ORDER BY DESC(?usageCount)
    LIMIT 5
  }

  #####################################################################
  # 2. Use the ranked properties as starting points of closure
  #####################################################################

  ?subProperty rdfs:subPropertyOf* ?startProperty .
  BIND(?startProperty AS ?superProperty)
}
LIMIT 200

Run it

Top-5-Property-Hierarchies-by-Usage

Why It’s Useful

This query connects the statistical backbone of the knowledge graph (the most used properties) with its semantic structure (the property hierarchies).

This allows you to:

  • Prioritize analysis: Immediately focus on the ontologies of the properties that matter most in practice.
  • Understand semantic depth: See if a heavily used property like dct:subject is a standalone predicate or the root of a deeper hierarchy.
  • Discover the “semantic backbone”: The results show the main pillars of the graph (rdf:type, dbo:wikiPageWikiLink, etc.) and the full scaffolding that supports them.
  • Guide data integration: When mapping an external schema to DBpedia, this query tells you exactly which property families are the most important to align with.
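
The ranking-and-percentage stage of the query can be sketched in Python like this. The usage counts are invented (they sum to 1000 to keep the arithmetic easy to follow), and the function name is hypothetical. The last lines show why the percentage is worth computing in decimals: integer division silently drops the fraction.

```python
from decimal import Decimal

# Hypothetical usage counts per property (made-up numbers, total 1000).
usage = {"ex:type": 505, "ex:link": 300, "ex:label": 145, "ex:birthPlace": 50}

def ranked_with_percent(usage, limit):
    """Return (property, count, percent of all triples) for the top `limit`."""
    total = sum(usage.values())
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    return [(prop, count, Decimal(100) * count / total)
            for prop, count in ranked[:limit]]

top = ranked_with_percent(usage, 3)
for prop, count, percent in top:
    print(prop, count, percent)

# Integer division would drop the fraction: 50 instead of 50.5.
truncated = 100 * usage["ex:type"] // sum(usage.values())
print(truncated)
```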