Speleothem isotope records from a local Turtle file: Antro del Corchia

About this notebook

This notebook reads a local RDF/Turtle export of the δ18O and δ13C observations from Antro del Corchia (SISAL site 145, Apuan Alps, Italy), parses it with rdflib, and plots the isotope records against calendar age, with Marine Isotope Stage (MIS) bands drawn as the climatological backdrop.

Everything runs in your browser via Pyodide (rdflib, pandas, matplotlib, scipy), with no Python installation and no SPARQL endpoint required. The TTL file sits next to this page and is loaded into Pyodide’s virtual filesystem at startup.

Tip: Source and attribution

Data: SISALv3 (Kaushal et al. 2024, Earth Syst. Sci. Data 16, 1933–1963; database DOI 10.5287/ora-2nanwp4rk). RDF conversion, Savitzky–Golay smoothing, and the MIS-band plotting conventions come from the GeoScience-FAIRification-LOD repository by the Research Squirrel Engineers. The TTL snapshot used here was produced by plot_sisal_from_csv.py in that repository, which also ships static .svg/.jpg plots of the records shown below.

Why this dataset?

Corchia is one of the longer and better-dated speleothem records in SISALv3: four separate stalagmites (SISAL entity IDs 665, 667, 668, 670) covering parts of the last glacial cycle and the Holocene (~2.5–140 ka BP), with paired δ18O and δ13C measurements (~1,200 of each). That makes it a rewarding teaching dataset: long enough to cross several glacial–interglacial boundaries, small enough to plot responsively in the browser, and with two proxies side by side so that the contrast between δ18O (hydroclimate) and δ13C (vegetation / soil CO₂) is visible.

What you’ll learn

  • How to load a static TTL file into Pyodide via Quarto’s resources: frontmatter and parse it with rdflib
  • How to write SPARQL against an in-memory graph to pull out paired proxy observations and their pre-computed smoothed values
  • How to overlay Marine Isotope Stage bands on a time axis using matplotlib's Axes.axhspan, following the palaeoclimate convention of age increasing downwards on the y-axis

Data-context notes

  • Age convention. geolod:ageKaBP stores calendar age in thousands of years before present (ka BP). The axis is conventionally drawn with the present at the top and deeper time below, which is done by calling ax.invert_yaxis() after setting the limits.
  • Pre-computed smoothing. The RDF already carries two smoothed variants per observation — an 11-point rolling median (geolod:smoothedValue_rollingMedian) and a Savitzky–Golay filter (geolod:smoothedValue_savgol, w=11, polyorder=2). We plot the raw values in faded grey and the Savitzky–Golay smoother on top in black, mirroring the EPICA-style layout used by the upstream repository.
  • Four entities, one cave. A single cave can host multiple stalagmites, and Corchia has four: SISAL entity_id 665, 667, 668, 670. We colour them separately so that the overlap and complementarity of their age ranges become visible. Treating all four as a single undifferentiated series would wash that out.
  • Proxy interpretation (short version). δ18O of speleothem calcite tracks the isotopic composition of drip water, which depends on rainfall source, amount, and temperature. δ13C reflects soil CO₂ (vegetation density, biological activity) and prior calcite precipitation. They are not redundant: δ18O is broadly climate-driven, δ13C is more sensitive to the karst–vegetation system above the cave.
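To make the two pre-computed smoothers concrete, here is a minimal sketch of an 11-point rolling median and a Savitzky–Golay filter (window 11, polyorder 2) applied to a synthetic, evenly sampled series. The data are invented for illustration, and how the upstream script handles unevenly spaced ages or the boundaries between entities is not shown here, so treat this as an approximation of the idea rather than the repository's procedure.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

# Synthetic, evenly sampled stand-in for one proxy series (NOT Corchia data).
rng = np.random.default_rng(0)
age_ka = np.linspace(0.0, 140.0, 200)
truth = -4.0 + 0.8 * np.sin(2 * np.pi * age_ka / 100.0)
raw = truth + rng.normal(scale=0.3, size=age_ka.size)

# 11-point centred rolling median; min_periods=1 keeps the series ends defined.
rolling_median = (
    pd.Series(raw)
    .rolling(window=11, center=True, min_periods=1)
    .median()
    .to_numpy()
)

# Savitzky-Golay filter with the parameters quoted in the note above.
savgol = savgol_filter(raw, window_length=11, polyorder=2)


def rmse(estimate):
    """Root-mean-square error against the known synthetic signal."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


print(f"raw {rmse(raw):.3f} | median {rmse(rolling_median):.3f} | savgol {rmse(savgol):.3f}")
```

Both smoothers should sit closer to the underlying signal than the raw values do. On real speleothem data there is no known "truth", so the comparison is visual rather than numerical.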

Tooling notes

rdflib is not pre-bundled in Pyodide, so we install it via micropip. Plotting uses the standard matplotlib stack, which is pre-bundled. A full local-runtime companion, plot_sisal_from_csv.py, lives in the GeoScience-FAIRification-LOD repository; it is the same script that produced the TTL snapshot used here.

Note

On first load, your browser downloads the Python runtime (Pyodide, ~10 MB) and fetches the ~1.6 MB TTL file into its virtual filesystem. Please allow a moment for both to initialise.

1 Setup, data loading, and SPARQL query

The resources: entry in the frontmatter tells quarto-live to copy sisal_145_corchia_data.ttl into Pyodide’s VFS, where it is readable as an ordinary file in the notebook’s working directory. One cell then handles everything needed to reach a clean DataFrame: it installs rdflib, parses the graph, runs a single SPARQL query, and projects the bindings into pandas. The query pulls both the δ18O and δ13C observations together with their pre-computed Savitzky–Golay smoothed values and the entity_id of the parent speleothem. We map the observation-type URI to a short label (d18O / d13C) so downstream plotting can iterate over both proxies cleanly.

2a Sanity check — a simple scatter per proxy

Before styling anything, plot both proxies against age with age increasing downwards. If the distributions look right and the age windows line up with the speleothem entities you expect, the ingest is fine.
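A sanity-check plot along these lines might look as follows. The DataFrame is a synthetic stand-in with assumed column names (proxy, entity_id, age_ka, value) and made-up age windows for two of the entities, so only the plotting pattern carries over to the real data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the query result (NOT Corchia data).
# ASSUMPTION: column names and the per-entity age windows are illustrative.
rng = np.random.default_rng(1)
windows = {665: (2.5, 60.0), 667: (50.0, 140.0)}
frames = []
for proxy, centre in (("d18O", -4.0), ("d13C", -1.0)):
    for entity_id, (start, stop) in windows.items():
        age = np.sort(rng.uniform(start, stop, 80))
        frames.append(pd.DataFrame({
            "proxy": proxy,
            "entity_id": entity_id,
            "age_ka": age,
            "value": centre + rng.normal(scale=0.4, size=age.size),
        }))
df = pd.concat(frames, ignore_index=True)

# One panel per proxy, age on the shared y-axis.
fig, axes = plt.subplots(1, 2, figsize=(8, 6), sharey=True)
for ax, (proxy, sub) in zip(axes, df.groupby("proxy", sort=True)):
    ax.scatter(sub["value"], sub["age_ka"], s=4, color="0.4")
    ax.set_xlabel(f"{proxy} (per mil)")
    ax.set_title(f"{proxy}: n = {len(sub)}")

axes[0].set_ylabel("Age (ka BP)")
axes[0].set_ylim(0, 140)
axes[0].invert_yaxis()  # shared y-axis, so the limits of both panels flip
fig.savefig("sanity_check.png", dpi=100)
```

Putting the point count in each panel title gives a quick check against the expected number of observations per proxy.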

2b Isotope records with MIS bands

The classic palaeoclimate layout: age on the y-axis (present at the top), δ18O and δ13C as two panels side by side, speleothem entities coloured separately, and the Marine Isotope Stage chronology drawn as coloured bands behind the data. The Savitzky–Golay smoother cleans up high-frequency noise without smearing glacial–interglacial transitions.
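The band overlay itself can be sketched for a single panel with synthetic data. The stage boundaries below are approximate values as commonly quoted from the LR04 benthic stack and are an assumption of this sketch; the notebook's own MIS table and colour scheme may differ.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import savgol_filter

# (label, younger boundary, older boundary) in ka BP.
# ASSUMPTION: approximate boundaries as commonly quoted from the LR04 stack.
MIS_STAGES = [
    ("MIS 1", 0, 14), ("MIS 2", 14, 29), ("MIS 3", 29, 57),
    ("MIS 4", 57, 71), ("MIS 5", 71, 130), ("MIS 6", 130, 191),
]
AGE_MAX = 140.0  # oldest age shown on the axis

# Synthetic proxy series (NOT Corchia data) plus its Savitzky-Golay smoother.
rng = np.random.default_rng(2)
age_ka = np.linspace(2.5, AGE_MAX, 300)
raw = -4.0 + 0.8 * np.sin(2 * np.pi * age_ka / 100.0) + rng.normal(scale=0.3, size=age_ka.size)
smooth = savgol_filter(raw, window_length=11, polyorder=2)

fig, ax = plt.subplots(figsize=(4, 7))
bands = []
for i, (label, young, old) in enumerate(MIS_STAGES):
    old_visible = min(old, AGE_MAX)  # clip the band to the plotted age range
    # Odd-numbered stages are the relatively warm ones, even-numbered the cold ones.
    colour = "tab:red" if i % 2 == 0 else "tab:blue"
    bands.append(ax.axhspan(young, old_visible, color=colour, alpha=0.15, zorder=0))
    # x in axes coordinates, y in data coordinates.
    ax.text(0.98, (young + old_visible) / 2, label,
            transform=ax.get_yaxis_transform(), ha="right", va="center", fontsize=8)

ax.plot(raw, age_ka, color="0.7", linewidth=0.6, label="raw")
ax.plot(smooth, age_ka, color="black", linewidth=1.0, label="Savitzky-Golay")
ax.set_xlabel("d18O (per mil)")
ax.set_ylabel("Age (ka BP)")
ax.set_ylim(0, AGE_MAX)
ax.invert_yaxis()  # present at the top, deeper time below
ax.legend(loc="lower left", fontsize=8)
fig.savefig("mis_bands.png", dpi=100)
```

Drawing the bands with zorder=0 keeps them behind the data, and get_yaxis_transform() places each label at a fixed horizontal position while its vertical position follows the age axis.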

3 Explore

The full DataFrame is in scope. A few starting points for your own experiments:
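Two possible starting points, sketched on a synthetic stand-in: a per-entity summary of age coverage, and the δ18O–δ13C correlation within each entity. The column names and the assumption that paired measurements share exactly the same age value are illustrative and should be checked against the real DataFrame.

```python
import numpy as np
import pandas as pd

# Synthetic long-format stand-in (NOT Corchia data).
# ASSUMPTION: columns proxy / entity_id / age_ka / value, and paired
# d18O and d13C rows that share the same age.
rng = np.random.default_rng(3)
age = np.linspace(2.5, 140.0, 120)
entity = np.repeat([665, 667, 668, 670], 30)
d18o = -4.0 + rng.normal(scale=0.4, size=age.size)
d13c = -1.0 + 0.5 * (d18o + 4.0) + rng.normal(scale=0.3, size=age.size)
df = pd.concat([
    pd.DataFrame({"proxy": "d18O", "entity_id": entity, "age_ka": age, "value": d18o}),
    pd.DataFrame({"proxy": "d13C", "entity_id": entity, "age_ka": age, "value": d13c}),
], ignore_index=True)

# 1) Age coverage and sample count per entity and proxy.
coverage = df.groupby(["entity_id", "proxy"])["age_ka"].agg(["min", "max", "count"])
print(coverage)

# 2) Pivot to one row per (entity, age) and correlate the two proxies per entity.
wide = df.pivot_table(index=["entity_id", "age_ka"], columns="proxy", values="value")
corr = wide.groupby(level="entity_id").apply(lambda g: g["d18O"].corr(g["d13C"]))
print(corr.round(2))
```

A strong correlation in one entity and a weak one in another would be a prompt to look at how the two proxies respond to different controls, as outlined in the proxy interpretation note.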


Part of an Open Educational Resource series on knowledge graphs and linked open data, produced in the context of NFDI4Objects. Data source: Kaushal et al. 2024, SISALv3; RDF conversion and plotting conventions adapted from Research-Squirrel-Engineers/GeoScience-FAIRification-LOD.