Wikibase Datatypes
Wikibase datatypes are the first governance layer for statement values in gkc.
They are not the full validation story, but they are the base contract on which later rules depend.
The project treats Wikidata as the primary Commons Partner where curated data and metadata are expected to land first, with distribution outward to other partner systems after that. Because of that, gkc needs a precise and stable understanding of the datatype vocabulary used by Wikibase statements.
Why This Matters
The fundamental information unit in gkc is the statement.
A statement includes everything the Wikibase statement architecture supports:
- mainsnak value
- qualifiers
- references
- rank and related statement-level semantics
Every value inside those structures begins with a Wikibase datatype contract.
That base datatype does not answer every validation question, but it does define the lowest common layer of meaning for a value. Higher-order constraints in gkc build on top of that foundation.
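As a concrete illustration of the structures listed above, the following sketch builds one statement in the raw Wikibase JSON shape and walks every snak in it to collect its declared datatype. The property and item identifiers are illustrative Wikidata values, and `iter_snaks` is a hypothetical helper written for this example, not a gkc function.

```python
from typing import Any, Iterator

# A hand-written statement in the raw Wikibase JSON shape (illustrative values).
statement: dict[str, Any] = {
    "type": "statement",
    "rank": "normal",
    "mainsnak": {
        "snaktype": "value",
        "property": "P31",
        "datatype": "wikibase-item",
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": 5, "id": "Q5"},
        },
    },
    "qualifiers": {
        "P580": [
            {
                "snaktype": "value",
                "property": "P580",
                "datatype": "time",
                "datavalue": {
                    "type": "time",
                    "value": {
                        "time": "+2001-01-15T00:00:00Z",
                        "timezone": 0,
                        "before": 0,
                        "after": 0,
                        "precision": 11,
                        "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
                    },
                },
            }
        ]
    },
    "references": [
        {
            "snaks": {
                "P854": [
                    {
                        "snaktype": "value",
                        "property": "P854",
                        "datatype": "url",
                        "datavalue": {"type": "string", "value": "https://example.org/source"},
                    }
                ]
            }
        }
    ],
}


def iter_snaks(stmt: dict[str, Any]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (role, snak) for the mainsnak, each qualifier, and each reference snak."""
    yield "mainsnak", stmt["mainsnak"]
    for snaks in stmt.get("qualifiers", {}).values():
        for snak in snaks:
            yield "qualifier", snak
    for reference in stmt.get("references", []):
        for snaks in reference.get("snaks", {}).values():
            for snak in snaks:
                yield "reference", snak


# Every value in the statement carries its own datatype declaration.
datatypes = [(role, snak["property"], snak["datatype"]) for role, snak in iter_snaks(statement)]
```

Each of the three snaks declares a different datatype, which is why the datatype contract has to be resolved per value rather than per statement.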
Layered Governance
Datatype governance in gkc is layered.
- The Wikibase datatype layer defines the primitive datatype contract, such as wikibase-item, url, time, or monolingualtext.
- The raw claim serialization layer defines how values appear in raw Wikibase JSON, including datavalue.type details such as wikibase-entityid, time, quantity, or string.
- The statement-level rule layer adds constraints attached to a particular statement concept, such as whether a value must resolve to an item, whether it must come from a specific value list, or whether it must match a fixed value.
- The profile-level rule layer adds scoped rules for how a statement behaves within a particular entity profile, including cardinality, reference expectations, qualifiers, and workflow-specific conformance behavior.
The package-owned datatype registry owns only the first two layers. It does not encode full validation or workflow behavior.
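To make the boundary between the first two layers concrete, the sketch below checks that a snak's declared datatype agrees with the datavalue.type found in its raw JSON. The pairings shown follow the standard Wikibase JSON serialization. The names `EXPECTED_DATAVALUE_TYPE` and `check_snak_serialization` are assumptions made for this example and are not taken from the gkc codebase.

```python
# Layer 1 token -> layer 2 datavalue.type, for the datatype tokens named above.
EXPECTED_DATAVALUE_TYPE = {
    "wikibase-item": "wikibase-entityid",
    "url": "string",
    "time": "time",
    "monolingualtext": "monolingualtext",
}


def check_snak_serialization(snak: dict) -> list[str]:
    """Return problems found at the datatype and serialization layers only.

    Statement-level and profile-level rules are deliberately out of scope.
    """
    datatype = snak.get("datatype")
    if datatype not in EXPECTED_DATAVALUE_TYPE:
        return [f"unknown datatype: {datatype!r}"]
    if snak.get("snaktype") != "value":
        return []  # somevalue / novalue snaks carry no datavalue to check
    expected = EXPECTED_DATAVALUE_TYPE[datatype]
    actual = snak.get("datavalue", {}).get("type")
    if actual != expected:
        return [f"{datatype} expects datavalue.type {expected!r}, got {actual!r}"]
    return []


good = {
    "snaktype": "value",
    "datatype": "url",
    "datavalue": {"type": "string", "value": "https://example.org"},
}
bad = {
    "snaktype": "value",
    "datatype": "url",
    "datavalue": {"type": "wikibase-entityid", "value": {"id": "Q5"}},
}
good_problems = check_snak_serialization(good)
bad_problems = check_snak_serialization(bad)
```

A check like this says nothing about whether the URL is acceptable for a given statement or profile. That judgment belongs to the two upper layers.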
Package-Owned Registry
gkc ships a package-owned Wikibase datatype registry under gkc/registry/.
The registry exists to give the codebase one stable internal reference for:
- canonical runtime datatype tokens
- authoritative Wikibase ontology URI mappings
- raw datavalue.type expectations used in Wikibase JSON processing
This registry is intentionally small. It is not a behavior engine and it is not meant to replace profile-driven validation.
What Belongs In The Registry
For each canonical datatype token, the current registry stores:
- ontology_uri
- datavalue_type
- optional entity_value_kind when the datatype resolves to a specific Wikibase entity kind such as item
This is enough to separate semantic reference from runtime processing without mixing registry concerns with validator or serializer behavior.
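One way to picture a registry entry with these three fields is a small immutable record keyed by the canonical token. This is a minimal sketch only: the class name `DatatypeEntry`, the `REGISTRY` dict, and the choice of a frozen dataclass are assumptions for illustration, and the actual storage format under gkc/registry/ may differ. The ontology URIs use the Wikibase RDF ontology namespace.

```python
from dataclasses import dataclass
from typing import Optional

WIKIBASE_ONTOLOGY = "http://wikiba.se/ontology#"


@dataclass(frozen=True)
class DatatypeEntry:
    """Hypothetical registry record: semantic reference plus raw JSON expectation."""

    ontology_uri: str
    datavalue_type: str
    entity_value_kind: Optional[str] = None  # set only for entity-valued datatypes


# Keyed by canonical runtime datatype token (a small illustrative subset).
REGISTRY = {
    "wikibase-item": DatatypeEntry(
        ontology_uri=WIKIBASE_ONTOLOGY + "WikibaseItem",
        datavalue_type="wikibase-entityid",
        entity_value_kind="item",
    ),
    "url": DatatypeEntry(WIKIBASE_ONTOLOGY + "Url", "string"),
    "time": DatatypeEntry(WIKIBASE_ONTOLOGY + "Time", "time"),
    "monolingualtext": DatatypeEntry(WIKIBASE_ONTOLOGY + "Monolingualtext", "monolingualtext"),
}

item_entry = REGISTRY["wikibase-item"]
url_entry = REGISTRY["url"]
```

Keeping the record frozen and free of methods reflects the intent described here: the entry carries reference data, while validation and serialization logic live elsewhere.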
What Does Not Belong Here
The registry does not decide:
- which validator function should be called
- which widget should be rendered
- which coercion behavior should be applied
- which shipping rules should be enforced
Those remain code responsibilities in the appropriate modules.
Module Relationship
- wikibase owns the package-facing access layer for the registry.
- fermenter consumes normalized datatype semantics when validating and coercing values.
- bottler consumes datatype semantics when shaping claim payloads.
- shipper depends on the canonical runtime datatype vocabulary for property creation and comparison.
- spirit_safe and later ontology-seed work may use the same registry to normalize authored datatype declarations into the runtime contract.
This separation keeps the datatype registry a stable guidepost rather than an overloaded processing layer.
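The normalization role mentioned for authored datatype declarations could look roughly like the following. It accepts either a canonical token or a Wikibase ontology URI and returns the canonical token. The function name `normalize_datatype` and both lookup tables are hypothetical and serve only to illustrate the idea. They do not describe the access layer that the wikibase module actually exposes.

```python
# Hypothetical lookup from authored ontology URI to canonical runtime token.
ONTOLOGY_URI_TO_TOKEN = {
    "http://wikiba.se/ontology#WikibaseItem": "wikibase-item",
    "http://wikiba.se/ontology#Url": "url",
    "http://wikiba.se/ontology#Time": "time",
    "http://wikiba.se/ontology#Monolingualtext": "monolingualtext",
}
CANONICAL_TOKENS = frozenset(ONTOLOGY_URI_TO_TOKEN.values())


def normalize_datatype(declared: str) -> str:
    """Map an authored declaration (token or ontology URI) to a canonical token."""
    value = declared.strip()
    if value in CANONICAL_TOKENS:
        return value
    if value in ONTOLOGY_URI_TO_TOKEN:
        return ONTOLOGY_URI_TO_TOKEN[value]
    # Unknown declarations are rejected instead of being guessed at.
    raise ValueError(f"unrecognized datatype declaration: {declared!r}")


from_token = normalize_datatype(" url ")
from_uri = normalize_datatype("http://wikiba.se/ontology#Time")
```

Because every consumer would resolve declarations through one lookup, modules can compare datatypes by canonical token without each maintaining its own alias table.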