Shipping: Submission & Receipt
Overview
Shipping is the stage that submits bottled and packaged data to components of the Global Knowledge Commons, such as Wikidata, OpenStreetMap, and Wikimedia Commons. Not all data gets shipped; some of it relies on hand delivery, such as filled templates in Wikipedia.
| Aspect | Details |
|---|---|
| Input | Bottled data transformed into the proper formats for target systems |
| What Happens | Validation against Barrel Profiles, interaction with write APIs, transmission of data, receipt of delivery |
| Output | Receipt notices suitable for annotation for source systems and users |
| Best For | Final distribution of finished products (until the next distillation) |
| Typical Duration | Seconds to hours, depending on receiving capacity and amount being shipped |
The Problem Shipping Solves
Delivery to consumers:
- Shipping docks are all different: write APIs vary widely
- Final distribution varies: Wikimedia, OSM, and others all have different write APIs
- API rate limits must be respected
- Receipts document point-in-time state: successful placement of content is recorded
The cost of skipping Shipping: Data isn't available in the Global Knowledge Commons until it has been shipped.
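A minimal sketch of how a shipper might respect rate limits while collecting receipts. The `RateLimited` exception, the `send` callable, and the `ship_all` function are hypothetical names used for illustration only; they are not part of the Shipper API. A real dock client would translate each target system's own throttling signals into something like `RateLimited`.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


class RateLimited(Exception):
    """Hypothetical signal that the target API asked the client to slow down."""

    def __init__(self, retry_after: float = 1.0):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def ship_all(
    records: Iterable[Dict[str, Any]],
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    min_interval: float = 1.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Send records one at a time, pausing between writes and backing off when throttled."""
    receipts = []
    for record in records:
        for attempt in range(max_retries + 1):
            try:
                # `send` is assumed to return a receipt dict on success.
                receipts.append(send(record))
                break
            except RateLimited as exc:
                if attempt == max_retries:
                    raise
                # Exponential backoff, never shorter than the server's hint.
                sleep(max(exc.retry_after, min_interval) * (2 ** attempt))
        # Fixed pause between successful writes.
        sleep(min_interval)
    return receipts
```

Passing `sleep` as a parameter keeps the function testable without real delays.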
Input Contract
Records entering Shipping should:
- Be bottled into the appropriate format: only records that pass structure/content validation
- Distinguish new records from updates: QIDs, OSM IDs, etc.
- Include provenance: references and source metadata
- Be validated: no known schema errors or conflicts
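The contract above can be sketched as a pre-flight check. The field names (`payload`, `action`, `target_id`, `references`, `validation_errors`) are assumptions made for this example, not the pipeline's actual record schema.

```python
from typing import Any, Dict, List


def contract_violations(record: Dict[str, Any]) -> List[str]:
    """Return the input-contract problems found in one record (empty list means shippable)."""
    problems = []
    # Bottled into the appropriate format: a non-empty payload must be present.
    if not record.get("payload"):
        problems.append("missing bottled payload")
    # New vs. update: an update must name the entity it targets (QID, OSM ID, ...).
    action = record.get("action")
    if action not in ("create", "update"):
        problems.append("action must be 'create' or 'update'")
    elif action == "update" and not record.get("target_id"):
        problems.append("update without a target identifier")
    # Provenance: at least one reference or source entry.
    if not record.get("references"):
        problems.append("missing provenance references")
    # Validated: no known schema errors or conflicts left over.
    if record.get("validation_errors"):
        problems.append("unresolved validation errors")
    return problems
```

A shipper could refuse any record for which this function returns a non-empty list and report the reasons back to the earlier stage.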
Supporting Systems
Barrel (Provenance)
Stores:
- History of actions suitable for annotation on shipped product
Spirit Safe (Validation)
Provides:
- Final output models to verify the shippable product
Relationship to Other Stages
After: Data are available across components of the Commons
Common Patterns
Pattern 1: "I need a dry-run first"
Generate export files and reports without uploading. Use diff or review to confirm before pushing to target systems.
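One way to produce a reviewable dry-run report is a unified diff between the current state of a target record and the proposed one. This sketch uses only the Python standard library; `dry_run_diff` is a hypothetical helper, and how the current state is fetched is left out.

```python
import difflib
import json
from typing import Any, Dict


def dry_run_diff(current: Dict[str, Any], proposed: Dict[str, Any], label: str = "record") -> str:
    """Return a unified diff of two JSON-serializable records without uploading anything."""
    # Sorted keys and fixed indentation keep the diff stable across runs.
    before = json.dumps(current, indent=2, sort_keys=True, ensure_ascii=False).splitlines()
    after = json.dumps(proposed, indent=2, sort_keys=True, ensure_ascii=False).splitlines()
    diff = difflib.unified_diff(
        before,
        after,
        fromfile=f"{label} (current)",
        tofile=f"{label} (proposed)",
        lineterm="",
    )
    return "\n".join(diff)
```

An empty string means the proposed record would change nothing, so the upload can be skipped.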
Pattern 2: "I need multiple outputs"
Configure Bottling to output in multiple formats at once (Wikidata JSON + Wikipedia infobox + OSM tags).
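As an illustration of producing several outputs from one record, the sketch below uses a small registry of formatter functions. The registry, the function names, and the single `name` field are assumptions for this example; the actual Bottling configuration may look quite different, and real outputs would carry far more than a name.

```python
from typing import Any, Callable, Dict, List


def to_wikidata_json(rec: Dict[str, Any]) -> Dict[str, Any]:
    # Label structure in the shape used by Wikibase entity JSON.
    return {"labels": {"en": {"language": "en", "value": rec["name"]}}}


def to_infobox(rec: Dict[str, Any]) -> str:
    # Generic template call; a real infobox name and parameters would differ.
    return "{{Infobox\n| name = " + rec["name"] + "\n}}"


def to_osm_tags(rec: Dict[str, Any]) -> Dict[str, str]:
    # OSM represents attributes as flat key/value tags.
    return {"name": rec["name"]}


FORMATTERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "wikidata_json": to_wikidata_json,
    "wikipedia_infobox": to_infobox,
    "osm_tags": to_osm_tags,
}


def bottle(rec: Dict[str, Any], formats: List[str]) -> Dict[str, Any]:
    """Render one record into every requested format."""
    unknown = [f for f in formats if f not in FORMATTERS]
    if unknown:
        raise ValueError(f"unknown formats: {unknown}")
    return {f: FORMATTERS[f](rec) for f in formats}
```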
Pattern 3: "I want to export only high-confidence records"
Filter by _proofing_status == "pass" or a minimum quality threshold.
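A minimal sketch of that filter. The `_proofing_status` field comes from the text above; the `_quality_score` field name and the `shippable` helper are assumptions for illustration. The two criteria are treated as alternatives: when a threshold is given it is used, otherwise the proofing status decides.

```python
from typing import Any, Dict, Iterable, List, Optional


def shippable(
    records: Iterable[Dict[str, Any]],
    min_quality: Optional[float] = None,
    score_field: str = "_quality_score",  # hypothetical field name
) -> List[Dict[str, Any]]:
    """Keep only records considered high-confidence enough to ship."""
    selected = []
    for rec in records:
        if min_quality is None:
            # Default criterion: the record passed proofing.
            keep = rec.get("_proofing_status") == "pass"
        else:
            # Alternative criterion: score meets the threshold; missing scores count as 0.
            keep = rec.get(score_field, 0.0) >= min_quality
        if keep:
            selected.append(rec)
    return selected
```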
Reference
- API Reference: Shipper API
- Target API Specs:
  - MediaWiki Action API; overall framework for interacting with any MediaWiki instance
  - MediaWiki Wikibase extension API documentation; includes several actions such as wbeditentity, wbsetclaim, etc.
  - OpenStreetMap API
  - OpenStreetMap API v0.6 Specifications
GitHub Issues & Development
Work on the Shipping stage is tracked under the ship label.
Other Workflow Stages:
- mash — Data Ingestion
- ferment — Cleaning & Normalization
- distill — Reconciliation & Linking
- refine — Deduplication & Enrichment
- proof — Quality Assurance
- blend — Multi-Source Merging
- bottle — Bottling & packaging