Samian ware: potter activity at production centres

This browser-executable notebook joins the results of two SPARQL queries against the NFDI4Objects Knowledge Graph (one on potter–centre assignments with fuzzy confidence scores, the other on production-centre geographies) into a single proportional-symbol map that makes the productive mass of Samian-ware workshops visible at a glance.

Note

On first load, your browser downloads the Python runtime (Pyodide, ~10 MB). Please allow a moment for it to initialise.

Warning

The NFDI4Objects Knowledge Graph is a research prototype. If this notebook fails to load data with a network error, the endpoint may be temporarily unreachable or may not allow cross-origin browser requests from this page’s domain. The local .ipynb companion is not subject to the browser’s cross-origin restrictions, which makes it a reliable fallback in that case.

About this notebook

Samian-ware potters are known from stamps, decorative signatures, and stylistic attribution. Assigning a potter to a production centre is often a judgement call — a stamp may point to one workshop, stylistic analysis to another, and sometimes a potter is recorded as active at several sites. The NFDI4Objects Knowledge Graph captures this uncertainty explicitly through the Academic Meta Tool (AMT) vocabulary: every potter–centre assignment carries an amt:weight, a value in [0, 1] expressing confidence — 1.0 for secure attributions, fractional values (often 0.5) for vague assignments derived from rules such as “worked at kilnsite A or kilnsite B”.

Summing those weights per centre yields a productive-mass score for each workshop: not a potter count, but a confidence-weighted estimate of how much potter activity was concentrated there. This notebook computes that score, joins it to the centre geographies, and plots the result. A companion local notebook, n4okg-samian-potter-activity.ipynb, runs the same pipeline against the full scientific Python stack.
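The summation described above can be sketched in a few lines of plain Python. The potter and centre names and all weights below are toy values chosen for illustration, not records from the knowledge graph, and `productive_mass` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy (potter, centre, weight) assignments; every value here is illustrative.
assignments = [
    ("potter_1", "centre_A", 1.0),  # secure attribution
    ("potter_2", "centre_A", 0.5),  # "worked at A or B"
    ("potter_2", "centre_B", 0.5),
    ("potter_3", "centre_B", 1.0),  # secure attribution
    ("potter_4", "centre_B", 0.5),  # "worked at B or C"
    ("potter_4", "centre_C", 0.5),
]


def productive_mass(rows):
    """Sum the confidence weights per centre."""
    totals = defaultdict(float)
    for _potter, centre, weight in rows:
        totals[centre] += weight
    return dict(totals)


scores = productive_mass(assignments)
print(scores)  # {'centre_A': 1.5, 'centre_B': 2.0, 'centre_C': 0.5}
```

In this toy data centre_B has three potters attached to it but a score of 2.0, which is the difference between a potter count and a confidence-weighted estimate.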

Why this dataset?

Fuzzy-score reasoning is a thread that runs through NFDI4Objects work on archaeological inference. The Samian-ware dataset is an unusually clean example: the same amt:weight mechanism feeds the AMT Python port’s reasoning engine, and the scores here are the concrete output of that engine applied to real potter records. Visualising the weights spatially closes the loop between inference and interpretation.

What you’ll learn

how to join two SPARQL result sets in pandas via a shared IRI column
how SUM(xsd:decimal(?w)) in SPARQL gives a per-group confidence aggregate
how to build a proportional-symbol map with a range-labelled legend, avoiding the common pitfall of scaling radius directly (which exaggerates large values)

Data-context notes

query 1 (ProductionCentre + KilnRegion + geometry) has one row per centre, around 100 in total
query 3 (Potter × Centre, weighted) has many thousands of rows, one per (potter, centre) pair, aggregated via GROUP BY
scores typically cluster around a few discrete values: 1.0 for secure, 0.5 for two-way vague, 0.33… for three-way, etc.
centres may carry more than one kiln-region label; we keep the first encountered, matching query-1 behaviour
a centre with no potter records has an implied score of 0; it drops out of the inner join and is therefore not plotted
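The last two notes can be illustrated with a small pandas sketch. The column names (`centre`, `region`, `score`), the `ex:` identifiers and all values are assumptions made for this example; the notebook's own dataframes may be shaped differently.

```python
import pandas as pd

# Toy stand-ins for the two result sets; IRIs, labels and values are illustrative.
df_centres_raw = pd.DataFrame({
    "centre": ["ex:A", "ex:A", "ex:B", "ex:C"],
    "region": ["Region 1", "Region 2", "Region 1", "Region 3"],
    "lat": [44.1, 44.1, 45.8, 49.1],
    "lon": [3.1, 3.1, 3.4, 8.3],
})
df_scores = pd.DataFrame({
    "centre": ["ex:A", "ex:B"],  # ex:C has no potter records
    "score": [1.5, 2.0],
})

# Keep only the first kiln-region label encountered for each centre.
df_centres = df_centres_raw.drop_duplicates(subset="centre", keep="first")

# Inner join on the shared IRI column: centres without a score drop out.
df_activity = df_centres.merge(df_scores, on="centre", how="inner")
print(df_activity)
```

If unscored centres should stay visible, one alternative is `how="left"` followed by `fillna({"score": 0})`, at the cost of zero-size symbols on the map.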

Tooling notes

In the browser, both SPARQL calls go through pyodide.http.pyfetch; the join is pure pandas. Mapping uses a hand-rolled Leaflet block returned via _repr_html_. Radius is scaled as √score (area ∝ score) so that a centre with twice the score looks twice as busy, not four times as busy — the standard correction for proportional-symbol maps.
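As a rough illustration of the square-root correction, the sketch below maps a score to a pixel radius so that circle area grows linearly with score. The function name `symbol_radius` and the 30-pixel maximum are assumptions for this example, not values taken from the notebook.

```python
import math


def symbol_radius(score, max_score, max_radius_px=30.0):
    """Radius in pixels such that circle area is proportional to score."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    # Negative scores are clamped to 0 before taking the square root.
    return max_radius_px * math.sqrt(max(score, 0.0) / max_score)


r_full = symbol_radius(40.0, 40.0)  # the largest centre gets the full radius
r_half = symbol_radius(20.0, 40.0)  # half the score
area_ratio = (r_full / r_half) ** 2

print(round(r_full, 2), round(r_half, 2), round(area_ratio, 2))
```

With linear radius scaling the same pair would have an area ratio of 4 rather than 2, which is the exaggeration the correction avoids.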

Step 1 — Define the SPARQL queries

Two queries this time. The potter-activity query groups by centre and sums the confidence weights; the centre-geography query is the same one used in the production-centres notebook, repeated here so that each notebook can stand alone.
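The two queries might look roughly like the sketch below. Everything specific in it is a placeholder: the endpoint URL, the `ex:` namespace and every `ex:` property are hypothetical stand-ins for the real NFDI4Objects and AMT terms, which are not reproduced here. Only the overall shape (a `SUM(xsd:decimal(?w))` grouped by centre, plus a geography query) follows the description above.

```python
from urllib.parse import urlencode

# ASSUMPTION: placeholder endpoint; substitute the real SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"

# ASSUMPTION: all ex: terms are hypothetical; ex:weight stands in for amt:weight.
ACTIVITY_QUERY = """
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex:  <http://example.org/placeholder#>

SELECT ?centre (SUM(xsd:decimal(?w)) AS ?score)
WHERE {
  ?assignment ex:potter ?potter ;
              ex:centre ?centre ;
              ex:weight ?w .
}
GROUP BY ?centre
"""

CENTRE_QUERY = """
PREFIX ex: <http://example.org/placeholder#>

SELECT ?centre ?label ?region ?wkt
WHERE {
  ?centre a ex:ProductionCentre ;
          ex:label ?label ;
          ex:kilnRegion ?region ;
          ex:geometryWKT ?wkt .
}
"""

# Header asking the endpoint for the standard SPARQL JSON results format.
HEADERS = {"Accept": "application/sparql-results+json"}


def request_url(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET URL carrying the query string."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = request_url(ENDPOINT, ACTIVITY_QUERY)
print(url[:60])
```

Keeping the queries as module-level constants makes it easy to paste them into a SPARQL editor for testing before running the notebook.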

Step 2 — Load the data

Three dataframes are assembled here: df_assignments with one row per potter–centre pair, df_centres with one row per centre (geo + region), and df_activity — the per-centre aggregate computed from df_assignments and joined to df_centres. Keeping the intermediate dataframes around makes the Step-4 exploration cell more flexible.
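Before any dataframe can be built, the endpoint's response has to be flattened into rows. The sketch below assumes the endpoint returns the standard SPARQL JSON results format; `flatten_bindings` is a hypothetical helper name, and the variable names and sample values are illustrative.

```python
def flatten_bindings(result):
    """Turn a SPARQL JSON result into a list of {variable: value} dicts."""
    variables = result["head"]["vars"]
    rows = []
    for binding in result["results"]["bindings"]:
        # Unbound variables are absent from a binding; represent them as None.
        rows.append({v: binding[v]["value"] if v in binding else None
                     for v in variables})
    return rows


# Hand-written sample response with one complete and one partial binding.
sample = {
    "head": {"vars": ["potter", "centre", "w"]},
    "results": {"bindings": [
        {"potter": {"type": "uri", "value": "http://example.org/potter/1"},
         "centre": {"type": "uri", "value": "http://example.org/centre/A"},
         "w": {"type": "literal", "value": "0.5",
               "datatype": "http://www.w3.org/2001/XMLSchema#decimal"}},
        {"potter": {"type": "uri", "value": "http://example.org/potter/2"},
         "centre": {"type": "uri", "value": "http://example.org/centre/A"}},
    ]},
}

rows = flatten_bindings(sample)
# Values arrive as strings, so numeric columns need an explicit conversion.
for row in rows:
    row["w"] = float(row["w"]) if row["w"] is not None else None
print(rows)
```

A list of dicts like `rows` can be passed straight to `pandas.DataFrame(rows)` to obtain a dataframe with one column per query variable.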

Step 3a — Top 15 centres by potter activity (sanity check)

A horizontal bar chart of the 15 most productive centres gives a numeric check on the ranking that the map shows visually. Bars are coloured by kiln region to match the map palette below, so the two views can be read as one.
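The data preparation behind such a chart can be sketched with pandas alone. The column names, the region labels and the two hex colours in `REGION_COLOURS` are assumptions for this example; the notebook's palette may differ.

```python
import pandas as pd

# Toy per-centre scores; labels, regions and values are illustrative.
df_activity = pd.DataFrame({
    "label": [f"Centre {i}" for i in range(1, 21)],
    "region": ["Region 1", "Region 2"] * 10,
    "score": [float(i) for i in range(1, 21)],
})

# ASSUMPTION: hypothetical palette, one colour per kiln region.
REGION_COLOURS = {"Region 1": "#1b9e77", "Region 2": "#d95f02"}

# Select the 15 highest-scoring centres and attach a bar colour to each.
top15 = df_activity.nlargest(15, "score").copy()
top15["colour"] = top15["region"].map(REGION_COLOURS)

# Reverse the order so the largest value is the last row of the frame.
top15 = top15.iloc[::-1]
print(top15[["label", "score", "colour"]])
```

Horizontal bar plots usually draw the first row at the bottom, so with the reversed order the highest-scoring centre ends up at the top of the chart.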

Step 3b — Proportional-symbol map

Each active centre is drawn as a circle whose area (not radius) scales with the total potter-activity score. Colour encodes kiln region, so the map answers two questions at once: where is the activity concentrated, and which regional tradition does each centre belong to? The legend reports three size ranges; the overlay control embeds colour swatches for each region.
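One way to hand-roll such a map is an object whose `_repr_html_` method returns a small Leaflet page. The sketch below is an assumption-laden illustration, not the notebook's implementation: the class name `ProportionalSymbolMap`, the iframe wrapping, the CDN version, the tile server and the toy centres are all choices made for this example, and the legend and overlay control are left out.

```python
import html
import json
import math

# Minimal Leaflet page; __MARKERS__ is replaced by a JSON array of marker specs.
PAGE = """<!DOCTYPE html>
<html><head>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<style>html, body, #map { height: 100%; margin: 0; }</style>
</head><body><div id="map"></div>
<script>
const map = L.map("map");
L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png",
            {attribution: "&copy; OpenStreetMap contributors"}).addTo(map);
const markers = __MARKERS__.map(m =>
  L.circleMarker([m.lat, m.lon],
                 {radius: m.radius, color: m.colour, fillOpacity: 0.6})
    .bindTooltip(m.label).addTo(map));
map.fitBounds(L.featureGroup(markers).getBounds());
</script></body></html>"""


class ProportionalSymbolMap:
    """Hypothetical helper: renders centres as area-scaled Leaflet circles."""

    def __init__(self, centres, max_radius_px=30.0, height_px=400):
        self.centres = centres
        self.max_radius_px = max_radius_px
        self.height_px = height_px

    def markers(self):
        """Marker specs with radius proportional to the square root of score."""
        top = max(c["score"] for c in self.centres)
        return [{"lat": c["lat"], "lon": c["lon"], "label": c["label"],
                 "colour": c["colour"],
                 "radius": self.max_radius_px * math.sqrt(c["score"] / top)}
                for c in self.centres]

    def _repr_html_(self):
        # Wrap the page in an iframe so its scripts load in a clean document.
        page = PAGE.replace("__MARKERS__", json.dumps(self.markers()))
        return (f'<iframe srcdoc="{html.escape(page, quote=True)}" '
                f'style="width:100%; height:{self.height_px}px; border:0">'
                f'</iframe>')


# Toy centres with made-up coordinates, scores and colours.
demo = ProportionalSymbolMap([
    {"label": "Centre A", "lat": 44.1, "lon": 3.1, "score": 40.0, "colour": "#1b9e77"},
    {"label": "Centre B", "lat": 49.1, "lon": 8.3, "score": 10.0, "colour": "#d95f02"},
])
markup = demo._repr_html_()
```

Leaving `demo` as the last expression of a notebook cell lets the frontend call `_repr_html_` and display the map; whether external scripts and tiles load depends on the frontend and on network access.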

Step 4 — Explore

Three dataframes are in scope — df_assignments (raw potter × centre pairs), df_centres (centre geographies), and df_activity (the join used for the map). The example below lists the potters with the highest-score assignments to a single centre; change the filter to ask your own question.
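A filter of this kind could look like the sketch below. The column names (`potter`, `centre`, `weight`), the helper `top_potters` and the toy rows are assumptions; adapt them to the actual columns of df_assignments.

```python
import pandas as pd

# Toy stand-in for df_assignments; names and weights are illustrative.
df_assignments = pd.DataFrame({
    "potter": ["Potter 1", "Potter 2", "Potter 3", "Potter 2", "Potter 4"],
    "centre": ["Centre A", "Centre A", "Centre A", "Centre B", "Centre B"],
    "weight": [1.0, 0.5, 1.0, 0.5, 1.0],
})


def top_potters(df, centre, n=10):
    """Potters assigned to `centre`, strongest attribution first."""
    subset = df[df["centre"] == centre]
    # Sort by weight (descending), breaking ties alphabetically by potter.
    return subset.sort_values(["weight", "potter"],
                              ascending=[False, True]).head(n)


result = top_potters(df_assignments, "Centre A")
print(result)
```

Swapping the equality filter for a condition such as `df["weight"] < 1.0` would instead list only the vague assignments.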

Part of an Open Educational Resource series on knowledge graphs and linked open data, produced in the context of NFDI4Objects.