Wikidata Ogham Sites — Overview
About this notebook
This notebook is a hands-on introduction to working with knowledge graph data in Python, using Wikidata, the largest openly licensed general-purpose knowledge graph, as a live data source. It retrieves records about Ogham stones (early medieval inscribed monuments found mainly in Ireland and western Britain) via a SPARQL query and visualises their distribution as bar charts.
It is part of an Open Educational Resource (OER) series on knowledge graphs and linked open data, and is designed to stand on its own: you do not need to have read any other notebook in the series to follow along. A local-Python variant of this notebook is available as wikidata-ogham-sites.ipynb (same content, for use with a regular Jupyter/VS Code setup).
Why this dataset?
Ogham stones are a useful teaching dataset because they are:
- Well-curated on Wikidata, with typed instances, find-spots, and administrative districts linked by dedicated properties.
- Small enough to fit comfortably in memory and render in a browser, but large enough to yield meaningful aggregations.
- Rich in structure: each record participates in several relationships (instance-of, find-spot, county), which makes them a good example for both entity-centric and aggregation queries.
The same pipeline — SPARQL query → DataFrame → visualisation — can be reused for any other knowledge-graph dataset, regardless of domain. Companion notebooks in this series apply it to other endpoints and visualisation types.
Tooling notes
Throughout this notebook we use:
- pyodide.http.pyfetch to query Wikidata. Libraries like SPARQLWrapper or requests cannot run in the browser because they depend on blocking HTTP; in Pyodide we use pyfetch, which is async/await-based. The local .ipynb variant of this notebook uses SPARQLWrapper instead, which is more convenient when you are not constrained to the browser.
- pandas to hold the results in a tabular form. Once data is in a DataFrame, standard data-science tooling (grouping, filtering, plotting) applies, regardless of the original source being a graph.
- matplotlib for static bar charts. For the map variant of this dataset, see the companion notebook wikidata-ogham-sites-map-live.qmd, which uses Leaflet for interactive geographic visualisation.
On first load, your browser downloads the Python runtime (Pyodide, ~10 MB). Please allow a moment for it to initialise.
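As a rough sketch of the browser-side fetch described above, the snippet below builds the request URL and awaits pyfetch. The helper names build_query_url and run_query, and the use of the format=json URL parameter to request JSON results, are assumptions made for illustration and are not taken from the notebook's own cells. run_query only works inside Pyodide, which is why the import sits inside the function.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results (hypothetical helper)."""
    return f"{endpoint}?{urlencode({'query': query, 'format': 'json'})}"


async def run_query(query):
    """Fetch SPARQL bindings from inside Pyodide (hypothetical helper)."""
    # pyfetch exists only in the Pyodide runtime, so import it lazily.
    from pyodide.http import pyfetch

    response = await pyfetch(build_query_url(query))
    data = await response.json()
    # Standard SPARQL JSON results keep the solutions under results.bindings.
    return data["results"]["bindings"]
```

In a notebook cell running under Pyodide, this could be called as `bindings = await run_query(query)`, since top-level await is available there.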
Step 1 — Defining the SPARQL query
The query below asks Wikidata for every item that is an instance of Ogham stone (wd:Q2016147), together with its find-spot (linked via wdt:P189), the county in which the find-spot lies, and — optionally — its coordinate location (wdt:P625).
Two notes on query design that generalise to other knowledge-graph queries:
- SERVICE wikibase:label is a Wikidata-specific service that returns human-readable labels for every item variable that also has a ?…Label companion in the SELECT clause. It is significantly cheaper than joining rdfs:label manually.
- OPTIONAL is used for coordinates here because not every stone in Wikidata has them; making them mandatory would drop perfectly valid records. In the map variant of this notebook, we invert this choice and make coordinates mandatory; that is the right call there, but not here.
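A query along the lines described above might look like the following, stored as a Python string so it can be passed to the fetch step. This is a sketch, not the notebook's exact query: the variable names are illustrative, and linking the find-spot to its county via wdt:P131 (located in the administrative territorial entity) is an assumption.

```python
# Sketch of the Ogham-stone query; variable names are illustrative.
OGHAM_QUERY = """
SELECT ?item ?itemLabel ?findspot ?findspotLabel ?county ?countyLabel ?coord
WHERE {
  ?item wdt:P31 wd:Q2016147 ;          # instance of: Ogham stone
        wdt:P189 ?findspot .           # find-spot (location of discovery)
  ?findspot wdt:P131 ?county .         # assumption: county linked via P131
  OPTIONAL { ?item wdt:P625 ?coord . } # coordinates only where present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

# Each ?xLabel variable in SELECT has a matching ?x variable, which is
# what the label service needs in order to fill the labels in.
print(OGHAM_QUERY.strip().splitlines()[0])
```

Moving the coordinate pattern out of the OPTIONAL block would give the stricter behaviour the map variant is described as using.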
Step 2 — Loading the data
SPARQL results come back as JSON in a format called bindings: a list of dictionaries, one per solution, where each key maps to a {"type": ..., "value": ...} object. We flatten these into plain records and build a DataFrame. This shape — flat records with consistent keys — is almost always what you want when you plan to plot or aggregate.
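The flattening step can be sketched as below. The sample bindings are invented placeholders (not real Wikidata records), and flatten_bindings is a hypothetical helper name. Note that a variable left unbound by OPTIONAL is simply missing from that solution's dictionary, so pandas fills the gap with NaN.

```python
import pandas as pd

# Invented sample in the SPARQL JSON "bindings" shape; not real Wikidata data.
sample_bindings = [
    {
        "item": {"type": "uri", "value": "http://example.org/stone-1"},
        "itemLabel": {"type": "literal", "value": "Stone 1"},
        "coord": {"type": "literal", "value": "Point(-9.0 52.0)"},
    },
    {
        # No "coord" key: the OPTIONAL pattern did not match for this stone.
        "item": {"type": "uri", "value": "http://example.org/stone-2"},
        "itemLabel": {"type": "literal", "value": "Stone 2"},
    },
]


def flatten_bindings(bindings):
    """Keep only the 'value' of each bound variable (hypothetical helper)."""
    return [{key: cell["value"] for key, cell in row.items()} for row in bindings]


df = pd.DataFrame(flatten_bindings(sample_bindings))
print(df)
```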
A single Ogham stone can appear in multiple rows because a stone may be linked to more than one “find-spot” in Wikidata (e.g. both the original location and a current museum). When computing aggregates, remember to use nunique() or drop_duplicates() where appropriate, rather than len(df).
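A small illustration of the difference, using invented rows in which one stone is linked to two find-spots. The column names item and findspotLabel are assumptions about how the DataFrame is laid out.

```python
import pandas as pd

# Invented rows: stone-1 appears twice because it has two find-spots.
df = pd.DataFrame(
    {
        "item": ["stone-1", "stone-1", "stone-2"],
        "findspotLabel": ["Original site", "Museum", "Original site"],
    }
)

row_count = len(df)                  # counts rows, so stone-1 is counted twice
stone_count = df["item"].nunique()   # counts distinct stones
one_row_per_stone = df.drop_duplicates(subset="item")  # keeps the first row per stone

print(row_count, stone_count, len(one_row_per_stone))
```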
Step 3a — Visualisation: top two find-spots per county
For each Irish county, we identify the two find-spots with the highest number of associated Ogham stones. This highlights concentration patterns — which is often the first thing a domain expert wants to see when exploring a new dataset.
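One way to compute such a "top two per group" table in pandas is sketched here, on invented data with assumed column names (item, findspotLabel, countyLabel). It counts distinct stones per find-spot, sorts within each county, and keeps the first two rows of every group; the notebook's own cell may do this differently.

```python
import pandas as pd

# Invented data: (stone, find-spot, county); not real Wikidata records.
rows = [
    ("s1", "Site 1", "County A"), ("s2", "Site 1", "County A"),
    ("s3", "Site 1", "County A"), ("s4", "Site 2", "County A"),
    ("s5", "Site 2", "County A"), ("s6", "Site 3", "County A"),
    ("s7", "Site 4", "County B"),
]
df = pd.DataFrame(rows, columns=["item", "findspotLabel", "countyLabel"])

# Distinct stones per (county, find-spot) pair.
counts = (
    df.groupby(["countyLabel", "findspotLabel"])["item"]
    .nunique()
    .reset_index(name="stones")
)

# Sort so the busiest find-spots come first within each county, then keep two.
top_two = (
    counts.sort_values(["countyLabel", "stones"], ascending=[True, False])
    .groupby("countyLabel")
    .head(2)
)
print(top_two)
```

The resulting table can then be handed to a plotting call, for example one horizontal bar per find-spot grouped by county.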
Step 3b — Visualisation: distribution by county
A simpler aggregation: how many Ogham-stone records does each county have? This kind of plot is the sanity-check step of almost any knowledge-graph query — if one county dominates the counts implausibly, that often signals a data-modelling quirk rather than a real-world pattern.
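A minimal sketch of this aggregation and its bar chart follows, again on invented data with assumed column names. It counts distinct stones rather than rows, so a stone with two find-spots in the same county is not counted twice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented data; s1 has two rows because it is linked to two find-spots.
df = pd.DataFrame(
    {
        "item": ["s1", "s1", "s2", "s3", "s4"],
        "countyLabel": ["County A", "County A", "County A", "County A", "County B"],
    }
)

# Distinct stones per county, largest first.
county_counts = (
    df.groupby("countyLabel")["item"].nunique().sort_values(ascending=False)
)

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(county_counts.index, county_counts.values)
ax.set_xlabel("County")
ax.set_ylabel("Distinct Ogham stones")
ax.set_title("Ogham stones per county (sample data)")
fig.tight_layout()
```

Comparing county_counts with a plain df["countyLabel"].value_counts() is a quick way to see how much row duplication inflates the totals.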
Step 4 — Exploring the data
The cell below is a free playground. Edit the county_filter value, or write your own aggregations — the DataFrame df is available for the rest of the session.
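As one possible starting point, the sketch below filters to a single county and counts distinct stones per find-spot. The data is invented, the column names are assumptions, and how the notebook's own cell uses county_filter is not shown here, so treat this as an illustration only.

```python
import pandas as pd

# Invented stand-in for the session's df; not real Wikidata records.
df = pd.DataFrame(
    {
        "item": ["s1", "s2", "s3", "s4"],
        "findspotLabel": ["Site 1", "Site 1", "Site 2", "Site 3"],
        "countyLabel": ["County A", "County A", "County A", "County B"],
    }
)

county_filter = "County A"  # placeholder value; replace with a county of interest

subset = df[df["countyLabel"] == county_filter]
per_findspot = (
    subset.groupby("findspotLabel")["item"].nunique().sort_values(ascending=False)
)
print(per_findspot)
```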
Part of an Open Educational Resource series on knowledge graphs and linked open data, produced in the context of NFDI4Objects.