Holy Wells in Ireland — a Wikidata exploration

This notebook queries Wikidata for Irish holy wells and plots them on two maps: a marker map with images and cross-references, and a hex-binned density grid showing where wells cluster. The data model is a community effort of the WikiProject HolyWells, which links each well to its photograph on Wikimedia Commons and to its corresponding feature in OpenStreetMap.

A local Jupyter companion (holy-wells.ipynb) runs the same pipeline with the full scientific Python stack.

Note

On first load, your browser downloads the Python runtime (Pyodide, ~10 MB). Please allow a moment for it to initialise.

About this notebook

Why this dataset?

Irish holy wells are a near-ideal teaching dataset for linked open data:

Well-curated and citizen-scientist-driven — the WikiProject has a consistent modelling schema (P31 = holy well + holy well semantic concept), so a short query returns a clean, meaningful slice of Wikidata.
Cross-referenced across three open data hubs — each well is typically linked to Wikimedia Commons (image), OpenStreetMap (node or way), and sometimes to the Irish Sites and Monuments Record. This shows Linked Open Data doing its actual job: joining independently maintained datasets through shared identifiers.
Geographically clustered: the concentration around Kilkenny reflects a focused local mapping effort rather than a query artefact, which makes spatial aggregation didactically meaningful.

Data-context notes

A few specifics about this particular query result worth flagging:

The query returns one row per image, not one row per well. A well with three photographs produces three rows — so the first step after loading is to deduplicate on the Wikidata item URI.
All three enrichment properties (P18 image, P625 coordinates, P11693 OSM ID) are wrapped in OPTIONAL. Completeness varies: almost every well has coordinates, many have an OSM link, a substantial minority have a Commons image. The completeness analysis in Step 4 is part of the point of this notebook.
Coordinates are returned as GeoSPARQL WKT literals in the form Point(lon lat). Note the longitude-first order: Leaflet and folium expect (lat, lon), so the two values have to be swapped after parsing.

Tooling notes

In the browser, HTTP libraries like requests and SPARQLWrapper aren’t available because Pyodide runs in a Web Worker with no network stack of its own. Instead we use pyodide.http.pyfetch, which exposes the browser’s own fetch API to Python. Mapping uses Leaflet directly via an HTML cell output, because folium writes files to disk and is awkward to embed inline. The local companion notebook uses SPARQLWrapper and folium and is otherwise identical in structure.
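A minimal sketch of the browser-side request, assuming the public Wikidata Query Service endpoint and a hypothetical helper named run_query (neither is confirmed by the text above). The pyodide.http import sits inside the coroutine so that the URL-building part can also be exercised outside Pyodide.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_sparql_url(query, endpoint=ENDPOINT):
    """Encode a SPARQL query as a GET URL that requests JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


async def run_query(query):
    """Hypothetical helper: fetch result bindings via the browser's fetch API."""
    from pyodide.http import pyfetch  # only importable inside Pyodide

    response = await pyfetch(
        build_sparql_url(query),
        headers={"Accept": "application/sparql-results+json"},
    )
    if not response.ok:
        raise RuntimeError(f"SPARQL endpoint returned HTTP {response.status}")
    data = await response.json()
    # Standard SPARQL JSON results layout: one dict per result row.
    return data["results"]["bindings"]
```

In a cell that supports top-level await, this would be called as `rows = await run_query(query_text)`, where query_text is whatever query string the notebook defines.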

Step 1 — Define the SPARQL query

The P31 (instance of) filter uses two values joined by a comma. This is SPARQL’s object-list shorthand for two triple patterns that share a subject and predicate, so both must match: a well must be typed as the Wikidata class Q1371047 (holy well) and as the project’s controlled concept Q126443332 (holy well semantic concept). The two-class pattern is how the WikiProject distinguishes its curated items from uncurated ones elsewhere in Wikidata.
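For orientation, a query of the shape described here might look like the following. This is a reconstruction from the prose, not the notebook's actual query: the variable names, the label-service languages, and the absence of any further filter are assumptions. The wd:, wdt:, wikibase: and bd: prefixes are predeclared by the Wikidata Query Service.

```python
# Reconstructed sketch; variable names and label languages are assumptions.
QUERY = """
SELECT ?item ?itemLabel ?coord ?image ?osm WHERE {
  # Object list: the comma repeats subject and predicate, so both types must match.
  ?item wdt:P31 wd:Q1371047, wd:Q126443332 .
  OPTIONAL { ?item wdt:P625 ?coord . }
  OPTIONAL { ?item wdt:P18 ?image . }
  OPTIONAL { ?item wdt:P11693 ?osm . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,ga" . }
}
"""

# The comma form is equivalent to writing both triple patterns out in full:
EXPANDED = "?item wdt:P31 wd:Q1371047 . ?item wdt:P31 wd:Q126443332 ."
```

Because each enrichment property sits in its own OPTIONAL block, a well with several images yields several rows, which is why deduplication is needed downstream.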

Step 2 — Load the data

Two things happen here. First, we parse the WKT coordinate literal with a case-insensitive regex — Wikidata writes Point(lon lat) with a capital P, other endpoints use POINT(...), and a defensive parser handles both without per-endpoint code. Second, we group by the Wikidata item URI so each well appears once, keeping the first image and OSM ID found and the count of images as metadata.
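The two steps can be sketched with the standard library alone. The notebook itself works on a DataFrame, so treat this as an illustration of the logic rather than the actual cell; the row keys (item, label, coord, image, osm) and the sample rows are invented for the example.

```python
import re

# Case-insensitive so that both "Point(lon lat)" and "POINT(lon lat)" match.
WKT_POINT = re.compile(
    r"point\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)", re.IGNORECASE
)


def parse_wkt_point(literal):
    """Return (lat, lon) from a WKT point literal, or None if it does not parse."""
    match = WKT_POINT.search(literal or "")
    if match is None:
        return None
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon  # swapped: WKT is lon-first, mapping code wants lat-first


def dedupe_wells(rows):
    """Collapse one-row-per-image results into one record per item URI."""
    wells = {}
    for row in rows:
        uri = row["item"]
        well = wells.setdefault(uri, {
            "item": uri, "label": row.get("label"),
            "latlon": None, "image": None, "osm": None, "images": set(),
        })
        if well["latlon"] is None:
            well["latlon"] = parse_wkt_point(row.get("coord"))
        if row.get("image"):
            well["images"].add(row["image"])
            if well["image"] is None:
                well["image"] = row["image"]  # keep the first image seen
        if well["osm"] is None and row.get("osm"):
            well["osm"] = row["osm"]  # keep the first OSM ID seen
    for well in wells.values():
        well["n_images"] = len(well.pop("images"))  # distinct images per well
    return list(wells.values())


# Invented sample rows: one well with two photographs, one with none.
sample_rows = [
    {"item": "Q1", "label": "Well A", "coord": "Point(-7.25 52.65)", "image": "a1.jpg", "osm": "111"},
    {"item": "Q1", "label": "Well A", "coord": "Point(-7.25 52.65)", "image": "a2.jpg", "osm": "111"},
    {"item": "Q2", "label": "Well B", "coord": "POINT(-8.5 53.1)", "image": None, "osm": None},
]
sample_wells = dedupe_wells(sample_rows)
```

Counting distinct images in a set, rather than counting rows, keeps the image count stable even if another multi-valued property multiplies the rows.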

Step 3 — Visualise

Step 3a — Marker map with popups

Each well becomes a clickable circle marker. The popup shows the well’s name, a thumbnail of its Wikimedia Commons image (if one exists), and hyperlinks to the Wikidata item and the OpenStreetMap feature. This is the shape that Linked Open Data takes when it reaches the reader: a single data point with navigable attachments to multiple independently maintained open hubs.

We build the map by emitting an HTML string and letting quarto-live insert it into the page. The Python side never touches the DOM — it can’t, as Pyodide runs in a Web Worker.
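A rough sketch of what emitting such an HTML string can look like. The function name, the record keys, the Leaflet CDN version, the initial map view, and the assumption that the OSM ID refers to a node are all illustrative choices, not details taken from the notebook. Whether the embedded script tags execute also depends on how the page inserts the output.

```python
import html
import json

# Placeholders are substituted with str.replace so the JavaScript braces stay literal.
TEMPLATE = """<div id="__ID__" style="height:480px"></div>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<script>
const map = L.map("__ID__").setView([53.4, -7.9], 7);
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png",
            {attribution: "&copy; OpenStreetMap contributors"}).addTo(map);
for (const p of __DATA__) {
  L.circleMarker([p.lat, p.lon], {radius: 5}).bindPopup(p.popup).addTo(map);
}
</script>"""


def marker_map_html(wells, div_id="wells-map"):
    """Build a self-contained Leaflet snippet with one circle marker per well."""
    points = []
    for well in wells:
        if well.get("latlon") is None:
            continue  # wells without coordinates cannot be placed on the map
        parts = [f"<b>{html.escape(well.get('label') or well['item'])}</b>"]
        if well.get("image"):
            # Assumption: a Commons Special:FilePath URL, which accepts ?width=
            parts.append(f'<img src="{html.escape(well["image"])}?width=200" width="200">')
        parts.append(f'<a href="{html.escape(well["item"])}">Wikidata</a>')
        if well.get("osm"):
            osm_url = "https://www.openstreetmap.org/node/" + html.escape(str(well["osm"]))
            parts.append(f'<a href="{osm_url}">OpenStreetMap</a>')
        lat, lon = well["latlon"]
        points.append({"lat": lat, "lon": lon, "popup": "<br>".join(parts)})
    # Escape "</" so popup markup cannot terminate the surrounding script element.
    data = json.dumps(points).replace("</", "<\\/")
    return TEMPLATE.replace("__ID__", div_id).replace("__DATA__", data)
```

Passing the data as a JSON literal keeps the Python side free of any DOM access: the browser does all the rendering once the string is on the page.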

Step 3b — Hex-binned density grid

The marker map answers where is each well, but not where do wells cluster. For that, we bin the points into a hexagonal grid and colour each cell by count. Hex bins are built in pure Python — no Leaflet plugin needed, no canvas overlays to refresh on fullscreen. Each hex is a plain L.polygon object, which survives container resizing without any extra plumbing.

The projection step (x = lon, y = lat / cos(φ₀), where φ₀ is the data’s reference latitude) makes hexagons look regular at that latitude without requiring pyproj (unavailable in Pyodide’s default package set).
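One way to implement this binning in pure Python is sketched below. It uses pointy-top hexagons in axial coordinates with cube rounding, which is a common technique but not necessarily the one the notebook uses. The cell size in degrees of longitude and the choice of the mean latitude as φ₀ are assumptions.

```python
import math
from collections import Counter


def hex_round(q, r):
    """Round fractional axial coordinates to the nearest hex (cube rounding)."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error from the other two.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hexbin(latlons, size_deg=0.1, lat0=None):
    """Bin (lat, lon) pairs into hexagons; return counts and [lat, lon] rings."""
    if not latlons:
        return []
    if lat0 is None:
        lat0 = sum(lat for lat, _ in latlons) / len(latlons)  # assumed reference latitude
    k = math.cos(math.radians(lat0))
    counts = Counter()
    for lat, lon in latlons:
        x, y = lon, lat / k  # stretch latitude so cells look regular at lat0
        q = (math.sqrt(3) / 3 * x - y / 3) / size_deg
        r = (2 / 3 * y) / size_deg
        counts[hex_round(q, r)] += 1
    cells = []
    for (q, r), n in counts.items():
        cx = size_deg * math.sqrt(3) * (q + r / 2)  # hex centre, projected space
        cy = size_deg * 1.5 * r
        ring = []
        for i in range(6):
            angle = math.radians(60 * i - 30)
            # Undo the latitude stretch so the ring is in real [lat, lon] degrees.
            ring.append([(cy + size_deg * math.sin(angle)) * k,
                         cx + size_deg * math.cos(angle)])
        cells.append({"count": n, "ring": ring})
    return cells
```

Each ring is a list of six [lat, lon] pairs, which is the shape L.polygon accepts, so the cells can be serialised to JSON and drawn with a fill colour chosen from the count.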

Step 4 — Explore

The DataFrame df stays in scope — feel free to modify the cells below, add new ones, or start queries of your own. Two starting points:

Completeness of cross-references

How many wells are linked to each of the three open hubs? This is the kind of question a data steward would ask before writing a paper about the dataset: where are the gaps, what does “complete” mean here, and is the completeness evenly distributed?
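As a starting point, completeness can be expressed as the share of wells with a non-empty value per field. The sketch below uses plain dictionaries so it runs anywhere; the field names and the sample records are invented, and on a DataFrame the same idea is usually a one-liner along the lines of `df[columns].notna().mean()`.

```python
def completeness(wells, fields=("latlon", "osm", "image")):
    """Share of wells (0.0 to 1.0) that have a non-empty value for each field."""
    total = len(wells)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for well in wells if well.get(field)) / total
        for field in fields
    }


# Invented sample: four wells with decreasing levels of cross-referencing.
sample = [
    {"latlon": (52.65, -7.25), "osm": "111", "image": "a.jpg"},
    {"latlon": (52.70, -7.30), "osm": "222", "image": None},
    {"latlon": (53.10, -8.50), "osm": None, "image": None},
    {"latlon": (53.20, -8.60), "osm": None, "image": None},
]
shares = completeness(sample)
```

To check whether completeness is evenly distributed, the same function can be applied to subsets of the wells, for example one subset per hex cell or per county.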

Most-photographed wells

Which wells are the most thoroughly photographed? Multiple photos on Commons usually mean the well is locally significant, easily accessible, or has been the target of a focused documentation effort. Counting images per well is a cheap proxy for how much attention a citizen-science community has paid to this object.
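A small sketch of the ranking, working from the raw one-row-per-image result. The row keys and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def most_photographed(rows, n=5):
    """Return the n wells with the most distinct images as (item, count) pairs."""
    images = defaultdict(set)
    for row in rows:
        well_images = images[row["item"]]  # registers wells that have no image too
        if row.get("image"):
            well_images.add(row["image"])
    # Sort by descending image count, then by item URI for a stable order.
    ranked = sorted(images.items(), key=lambda pair: (-len(pair[1]), pair[0]))
    return [(item, len(imgs)) for item, imgs in ranked[:n]]
```

Ties are broken by item URI only to make the output reproducible; any other secondary key, such as the label, would serve equally well.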

Part of an Open Educational Resource series on knowledge graphs and linked open data, produced in the context of NFDI4Objects. Data: Wikidata WikiProject HolyWells, CC0. Tiles: OpenStreetMap contributors, Esri. Images: Wikimedia Commons contributors.