SpiritSafe Registry Architecture

SpiritSafe is the artifact registry for GKC Entity Profiles and value-list caches.

This section covers the SpiritSafe side of the architecture, from registry purpose and artifact layout through profile loading, runtime integration models, and testing strategy.

For the authored semantic side of the architecture, see Meta-Wikibase Architecture.

In the infrastructure triad:

  • Data Distillery Wikibase defines the semantic foundation.
  • SpiritSafe materializes and versions those semantics as deterministic artifacts.
  • The GKC Python package (gkc) executes curation workflows from SpiritSafe artifacts.

Purpose

SpiritSafe provides deterministic, version-controlled artifacts consumed by gkc without requiring live Wikibase reads at packet assembly time.

SpiritSafe stores the materialized/actionable forms of the core architectural components:

  • Entity Profiles: still/profiles/<QID>.json
  • Entity Statements: embedded in profile artifacts and represented in cache/index metadata
  • Value Lists: query definitions plus hydrated cache payloads
  • Curation Packet foundations: profile-graph and value-list-graph metadata used by packet builders

Repository Shape

  • still/profiles/<QID>.json - JSON Entity Profiles.
  • still/value_lists/queries/<QID>.sparql - value-list query definitions.
  • still/entities/*.json - Wikibase cache substrate.
  • still/value_lists/cache/<QID>.json - hydrated value-list artifacts.
  • config/semantic_anchors.json - semantic name-to-entity lookup artifact.
  • partners/wikimedia_sites.json - partner-facing sitelink source artifact.
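
The layout above can be expressed as a handful of path helpers. This is a minimal sketch, assuming a local checkout of the repository; the function names are hypothetical and are not part of the gkc API.

```python
from pathlib import Path


def profile_path(root: Path, qid: str) -> Path:
    """Path of the JSON Entity Profile for one QID."""
    return root / "still" / "profiles" / f"{qid}.json"


def value_list_query_path(root: Path, qid: str) -> Path:
    """Path of the SPARQL query definition for one value list."""
    return root / "still" / "value_lists" / "queries" / f"{qid}.sparql"


def value_list_cache_path(root: Path, qid: str) -> Path:
    """Path of the hydrated value-list artifact for one value list."""
    return root / "still" / "value_lists" / "cache" / f"{qid}.json"


def semantic_anchors_path(root: Path) -> Path:
    """Path of the semantic name-to-entity lookup artifact."""
    return root / "config" / "semantic_anchors.json"


if __name__ == "__main__":
    # "SpiritSafe" is a hypothetical checkout location, "Q123" a placeholder QID.
    root = Path("SpiritSafe")
    print(profile_path(root, "Q123").as_posix())
```

Keeping the layout in one place like this means a change to the repository shape touches a single module rather than every consumer.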

Build Pipeline

  1. Refresh still/entities from Wikibase.
  2. Export JSON profiles to still/profiles/<QID>.json.
  3. Export and hydrate value-list queries under still/value_lists/.
  4. Build supporting generated artifacts such as config/semantic_anchors.json and partners/wikimedia_sites.json when needed.
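
One way the export steps could keep artifacts deterministic is to serialize JSON with sorted keys and fixed formatting, and to rewrite a file only when its content changes. The sketch below illustrates that idea under the assumption that "deterministic" means byte-stable output for equal content; render_artifact and write_artifact are hypothetical names, not the actual build code.

```python
import json
from pathlib import Path
from typing import Any


def render_artifact(payload: Any) -> str:
    """Serialize a payload so that equal content always yields identical text."""
    return json.dumps(payload, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def write_artifact(path: Path, payload: Any) -> bool:
    """Write the artifact and return True only if the file content changed."""
    text = render_artifact(payload)
    if path.exists() and path.read_text(encoding="utf-8") == text:
        return False  # unchanged: leave the file alone so no spurious diff appears
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return True
```

With this approach a rebuild that produces no semantic change also produces no diff, which keeps version history meaningful.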

Runtime Consumption Model

gkc consumes:

  • profile definitions from still/profiles/<QID>.json
  • value-list items from still/value_lists/cache/<QID>.json
  • profile metadata embedded in still/profiles/*.json for registry/discovery tooling
  • semantic-anchor data from config/semantic_anchors.json
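
As an illustration of this consumption model, the following sketch reads the three kinds of artifact from a local checkout without contacting Wikibase. The loader names are hypothetical and the internal structure of each JSON document is not assumed; gkc's real loading code may differ.

```python
import json
from pathlib import Path
from typing import Any


def _read_json(path: Path) -> Any:
    """Read one JSON artifact from disk."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def load_profile(root: Path, qid: str) -> Any:
    """Load the Entity Profile document for one QID."""
    return _read_json(root / "still" / "profiles" / f"{qid}.json")


def load_value_list(root: Path, qid: str) -> Any:
    """Load the hydrated value-list cache for one QID."""
    return _read_json(root / "still" / "value_lists" / "cache" / f"{qid}.json")


def load_semantic_anchors(root: Path) -> Any:
    """Load the semantic name-to-entity lookup artifact."""
    return _read_json(root / "config" / "semantic_anchors.json")
```

A missing artifact raises FileNotFoundError here, which surfaces a stale or incomplete checkout early instead of silently falling back to a live read.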

SpiritSafe also stores the semantic-anchor artifact used by anchor-backed runtime workflows. The semantic-anchor concept itself belongs to the Meta-Wikibase side of the architecture, but the built artifact lives under config/semantic_anchors.json as part of the SpiritSafe materialization layer. See Semantic Anchors.

Packet assembly routes in still_charger.create_curation_packet consume profile JSON directly.

Curation packets themselves are generated by gkc at runtime, not stored as SpiritSafe artifacts. SpiritSafe provides the rule and lookup substrate from which packets are assembled and charged.

Registry Discovery

CLI registry operations and downstream discovery now enumerate still/profiles/*.json directly and read the embedded metadata blocks for labels, descriptions, profile-graph edges, and value-list linkage summaries.
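
A discovery pass of this kind might look like the sketch below. It assumes each profile document is a JSON object with a top-level "metadata" key; that key name and the discover_profiles function are illustrative assumptions, and the real artifact schema may differ.

```python
import json
from pathlib import Path


def discover_profiles(root: Path) -> dict[str, dict]:
    """Map each profile's QID (taken from the file name) to its metadata block."""
    found: dict[str, dict] = {}
    # Sort the paths so the result order does not depend on the file system.
    for path in sorted((root / "still" / "profiles").glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            document = json.load(handle)
        # "metadata" is an assumed key name; profiles without it map to {}.
        found[path.stem] = document.get("metadata", {})
    return found
```

Because the metadata travels inside each profile document, this scan needs no separate index file to stay in sync with the profiles.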

Governance Notes

  • SpiritSafe artifacts are generated outputs and should not be hand-edited.
  • Workflow automation commits artifact updates with a [skip ci] suffix in the commit message so that those commits do not retrigger the workflow.

Theoretical Design Notes

  • Additional registry-optimization artifacts may be introduced later if packet-pipeline performance requires them, but the current runtime contract is profile-document-first.