NFDI4Objects KG — Ogham Sites by County

About this notebook

This notebook introduces the NFDI4Objects Knowledge Graph (N4O KG) — a research-oriented triple store that integrates data about material cultural heritage from many partner repositories across Germany. We query it with SPARQL and count how many Ogham stones are known from each Irish county, then visualise the distribution as a pie chart and scatter plot.

It is part of an Open Educational Resource (OER) series on knowledge graphs and linked open data. It is designed to stand on its own: you do not need to have read any other notebook in the series to follow along. A local-Python variant of this notebook is available as n4okg-ogham-sites-county.ipynb (same content, for use with a regular Jupyter/VS Code setup).

Why this dataset?

The Ogham corpus is one of several datasets hosted as a named collection inside the N4O KG (here: collection/9). It is useful for teaching because:

The ontology is compact and domain-specific (ontology.ogham.link), so queries read almost like plain English once you know the few relevant classes (such as OghamSite, OghamStone_CIIC and County).
It exposes aggregation patterns well: counting stones per county requires a GROUP BY with a COUNT(DISTINCT …) — the most common workhorse pattern in SPARQL analytics.
It is small (around two dozen counties, a few hundred stones), so the whole dataset fits in memory and renders in a browser without fuss.

Data-context notes

N4O KG data is contributed by different research projects under shared infrastructure; every project brings its own ontology. The Ogham ontology (oghamonto:) is separate from the NFDI4Objects core vocabulary (n4o:). This is normal for domain-rich knowledge graphs and one of the reasons SPARQL is expressive: you can mix vocabularies freely in a single query.

Tooling notes

pyodide.http.pyfetch queries the N4O KG over HTTP. Libraries like SPARQLWrapper depend on blocking HTTP and so do not work in the browser; pyfetch is the browser-compatible equivalent. The local .ipynb variant uses SPARQLWrapper, which is more convenient when you are not browser-constrained.
pandas holds the results, so all subsequent aggregation and plotting use the standard data-science toolchain.
matplotlib produces the static visualisations. For maps of the same dataset, see the companion notebook n4okg-ogham-sites-map-live.qmd.
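As a rough sketch of the pyfetch approach described above, the snippet below builds a standard SPARQL Protocol GET request and sends it from the browser. The endpoint URL is a placeholder, not the real N4O KG address, and the helper names build_request and run_query are invented for this illustration. It also assumes the endpoint accepts GET requests with a query parameter and can return SPARQL JSON results.

```python
from urllib.parse import urlencode

# Placeholder only (assumption): substitute the real N4O KG SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"


def build_request(endpoint: str, query: str) -> tuple[str, dict[str, str]]:
    """Return the GET URL and headers for a SPARQL Protocol query."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the SPARQL JSON results format rather than XML or HTML.
    headers = {"Accept": "application/sparql-results+json"}
    return url, headers


async def run_query(endpoint: str, query: str) -> dict:
    """Send the query with pyfetch and return the parsed JSON response."""
    # Imported lazily because pyodide.http exists only inside the browser runtime.
    from pyodide.http import pyfetch

    url, headers = build_request(endpoint, query)
    response = await pyfetch(url, method="GET", headers=headers)
    if not response.ok:
        raise RuntimeError(f"SPARQL request failed with HTTP {response.status}")
    return await response.json()
```

Inside a Pyodide cell, top-level await is available, so a call could look like data = await run_query(ENDPOINT, some_query). A blocked cross-origin request typically surfaces as an exception from pyfetch itself rather than as an HTTP status.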

Note

On first load, your browser downloads the Python runtime (Pyodide, ~10 MB). Please allow a moment for it to initialise.

Warning

The N4O KG is a research prototype. If this notebook fails to load data with a network error, the endpoint may be temporarily unreachable or may not allow cross-origin browser requests from this page’s domain. The local .ipynb companion does not run in a browser and so is not subject to the cross-origin restriction, which makes it a useful fallback; it still needs the endpoint to be reachable.

Step 1 — Defining the SPARQL query

The query asks for every instance of oghamonto:OghamSite, looks up the County it lies in (via oghamonto:within), and counts the catalogued Ogham stones (oghamonto:OghamStone_CIIC) that have been disclosed at that site. Because we group by county, the result is one total per county rather than one per site.

Two observations on query design that generalise beyond this dataset:

COUNT(DISTINCT ?stone) — not just COUNT(?stone). A stone could in principle be linked through more than one path; DISTINCT makes sure we count each stone once.
Full predicate URIs for rdf:type: we spell out <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> rather than using SPARQL’s a shorthand here. Both work; the full URI is sometimes preferred in educational contexts because it exposes the underlying mechanics.
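A query along the lines described above might look like the following sketch, stored as a Python string. Several details are assumptions rather than facts taken from this notebook: the namespace URI bound to oghamonto:, the predicate oghamonto:disclosedAt linking a stone to its site, and the use of rdfs:label for county names. Check them against the ontology documentation before relying on the result.

```python
# Sketch of the county aggregation query. The oghamonto: namespace URI,
# oghamonto:disclosedAt and rdfs:label are assumptions made for illustration.
QUERY = """
PREFIX oghamonto: <http://ontology.ogham.link/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?countyLabel (COUNT(DISTINCT ?stone) AS ?stones)
WHERE {
  ?site <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> oghamonto:OghamSite ;
        oghamonto:within ?county .
  ?county <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> oghamonto:County ;
          rdfs:label ?countyLabel .
  ?stone <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> oghamonto:OghamStone_CIIC ;
         oghamonto:disclosedAt ?site .
}
GROUP BY ?countyLabel
ORDER BY DESC(?stones)
"""
```

Every variable in the SELECT clause that is not wrapped in an aggregate has to appear in GROUP BY, which is why ?countyLabel is listed there. The ORDER BY line is a convenience for reading the output and does not change the counts.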

Step 2 — Loading the data

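SPARQL results come back as JSON. The solutions sit in a list called bindings (under the results key): one dictionary per solution, where each variable name maps to a {"type": ..., "value": ...} object. We flatten these into plain records and build a DataFrame. Every value arrives as a string, so the counts need converting to integers before any arithmetic.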
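The flattening step can be sketched in plain Python as follows. The payload is a hand-written stand-in that follows the SPARQL JSON results layout; its county names and counts are made up and are not real query output. The variable names countyLabel and stones, and the helper name flatten_bindings, are likewise assumptions.

```python
def flatten_bindings(payload: dict) -> list[dict]:
    """Turn SPARQL JSON results into one flat dict per solution."""
    variables = payload["head"]["vars"]
    records = []
    for binding in payload["results"]["bindings"]:
        # A variable can be unbound in a given solution, so fall back to None.
        records.append({var: binding.get(var, {}).get("value") for var in variables})
    return records


# Made-up sample in the standard results layout (not real N4O KG data).
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
sample = {
    "head": {"vars": ["countyLabel", "stones"]},
    "results": {
        "bindings": [
            {
                "countyLabel": {"type": "literal", "value": "County A"},
                "stones": {"type": "literal", "datatype": XSD_INTEGER, "value": "12"},
            },
            {
                "countyLabel": {"type": "literal", "value": "County B"},
                "stones": {"type": "literal", "datatype": XSD_INTEGER, "value": "3"},
            },
        ]
    },
}

records = flatten_bindings(sample)
# Values are always strings in the JSON format, so convert counts explicitly.
for record in records:
    record["stones"] = int(record["stones"])
```

Passing records to pandas.DataFrame then gives a table with one row per county and one column per query variable.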

Step 3 — Visualising the distribution

Pie chart with an “Other” bucket

A pie chart is quick to read for a dozen or so categories but breaks down when many slices are tiny. We group every county that contributes less than 3 % of the total into a combined “Other” wedge, and label each slice with both its percentage and the raw stone count.
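The bucketing and labelling logic described above can be sketched independently of the plotting call. The function names and the input dictionary shape (county name mapped to stone count) are assumptions for illustration; the 3 % threshold is the one stated in the text.

```python
def bucket_small_slices(counts: dict[str, int], threshold: float = 0.03,
                        other_label: str = "Other") -> dict[str, int]:
    """Merge every entry below `threshold` of the total into one bucket."""
    total = sum(counts.values())
    if total == 0:
        return {}
    kept: dict[str, int] = {}
    other = 0
    for name, n in counts.items():
        if n / total < threshold:
            other += n
        else:
            kept[name] = n
    # Only add the combined wedge when something was actually merged.
    if other:
        kept[other_label] = other
    return kept


def slice_labels(bucketed: dict[str, int]) -> list[str]:
    """Build labels showing name, percentage share and raw count."""
    total = sum(bucketed.values())
    return [f"{name}\n{n / total:.1%} ({n})" for name, n in bucketed.items()]
```

The bucketed values and the labels can then be handed to matplotlib, for example ax.pie(list(bucketed.values()), labels=slice_labels(bucketed)). Computing the labels by hand keeps the raw count visible next to the percentage.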

Scatter plot

A scatter plot with one point per county makes the long tail of low-count counties easier to see than the pie chart does. Each point shows the raw count and is annotated with the county label.
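One possible way to draw such a plot is sketched below. Sorting the counties by descending count is a design choice made here to expose the long tail; it is not necessarily how the notebook orders them. The function names, figure size and axis label are illustrative assumptions.

```python
def scatter_points(counts: dict[str, int]) -> list[tuple[int, int, str]]:
    """Sort counties by descending count and return (x, y, label) triples."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(x, n, name) for x, (name, n) in enumerate(ordered)]


def plot_scatter(counts: dict[str, int]):
    """Draw one annotated point per county and return the figure."""
    # Imported lazily so the point preparation works without matplotlib.
    import matplotlib.pyplot as plt

    points = scatter_points(counts)
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.scatter([x for x, _, _ in points], [y for _, y, _ in points])
    for x, y, label in points:
        # Place each county name slightly above its point.
        ax.annotate(label, (x, y), textcoords="offset points",
                    xytext=(0, 6), ha="center", fontsize=8)
    ax.set_xticks([])  # x positions are only ranks, so ticks carry no meaning
    ax.set_ylabel("Number of Ogham stones")
    return fig
```

With many counties the annotations can overlap; rotating the labels or switching to a horizontal layout are common remedies.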

Step 4 — Exploring the data

The cells below are a free playground — try answering questions of your own by modifying the DataFrame or the SPARQL query.

This notebook is part of the Open Educational Resources of NFDI4Objects.