Mash Commands
Plain meaning: Load source data as ingredients for further actions.
Overview
The mash module in GKC loads source data, capturing both its content and its structure, so that it can be processed by data distillery workflows. The mash CLI loads Wikidata entities (items by QID, properties by PID, EntitySchemas by EID) as well as Wikipedia templates.
The name "mash" comes from the distillery metaphor: just as milled grain is steeped to extract fermentable sugars, mashing extracts the essential structure and content from source data, readying the ingredients for further processing.
Current implementations:
- Wikidata items (QID)
- Wikidata properties (PID)
- Wikidata EntitySchemas (EID)
- Wikipedia templates
Future implementations: CSV files, JSON APIs, dataframes
Load Wikidata Items by QID
gkc mash qid <QID> [options]
Load one or more Wikidata items by QID and output them in various formats.
Arguments
- qid: Positional argument for a single item ID (e.g., Q42)
- --qid <QID>: Repeatable flag for multiple items (e.g., --qid Q42 --qid Q5)
- --qid-list <file>: Path to file containing item IDs (one per line)
Output Options
- -o, --output <file>: Write output to file instead of stdout
- --raw: Output raw JSON to stdout (default behavior when no transform specified)
- --summary: Output a summary of the item(s) with labels, descriptions, and statement count
- --transform <type>: Transform the output
  - shell: Strip all identifiers for new item creation
  - qsv1: Convert to QuickStatements V1 format
  - gkc_entity_profile: Convert to GKC Entity Profile (not yet implemented)
Filtering Options
- --include-properties <P1,P2,...>: Comma-separated list of properties to include
- --exclude-properties <P1,P2,...>: Comma-separated list of properties to exclude
- --exclude-qualifiers: Omit all qualifiers from output
- --exclude-references: Omit all references from output
- --no-entity-labels: Skip fetching entity labels for QuickStatements comments (faster)
Examples
Load a single item and display summary
gkc mash qid Q42 --summary
Output: JSON summary with labels, descriptions, and statement count
Load a single item (raw JSON)
gkc mash qid Q42 --raw
Output: Raw Wikidata JSON for item Q42
Load multiple items
# Using repeatable --qid flags
gkc mash qid --qid Q42 --qid Q5 --qid Q30
# Using a file list
echo "Q42
Q5
Q30" > items.txt
gkc mash qid --qid-list items.txt
Transform to shell for new item creation
gkc mash qid Q42 --transform shell -o new_item_template.json
Strips all identifiers (id, pageid, ns, title, statement IDs, hashes) to create a clean template for submitting as a new item.
Transform to QuickStatements
# For editing existing item
gkc mash qid Q42 --transform qsv1
# Extract specific properties only
gkc mash qid Q42 --transform qsv1 --include-properties P31,P21,P569
Filter properties and save
gkc mash qid Q42 \
--exclude-properties P18,P373 \
--exclude-qualifiers \
--exclude-references \
-o filtered_item.json
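When the properties of interest are kept in a file, the comma-separated value that --include-properties and --exclude-properties expect can be built from it. The sketch below is a hypothetical helper (pids_to_csv is not part of GKC) and assumes a file with one property ID per line, where blank lines and lines starting with # are skipped. The gkc call uses only the flags documented above and runs only when gkc is installed.

```shell
#!/bin/sh
# pids_to_csv: hypothetical helper, not part of GKC.
# Joins the property IDs in a file (one per line) into a single
# comma-separated string such as "P31,P21,P569".
pids_to_csv() {
    # Drop comment and blank lines, then join the remaining lines with commas.
    awk '!/^[ \t]*(#|$)/' "$1" | paste -s -d , -
}

# Demo with a throwaway file.
props_file=$(mktemp)
printf '%s\n' '# core biographical properties' P31 P21 P569 > "$props_file"
pids_to_csv "$props_file"

# Only call gkc when it is installed.
if command -v gkc >/dev/null 2>&1; then
    gkc mash qid Q42 --transform qsv1 \
        --include-properties "$(pids_to_csv "$props_file")"
fi
rm -f "$props_file"
```

Keeping the property set in a file makes it easy to reuse the same selection across several items.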
Load Wikidata Properties by PID
gkc mash pid <PID> [options]
Load one or more Wikidata properties by PID and output them in various formats.
Arguments
- pid: Positional argument for a single property ID (e.g., P31)
- --pid <PID>: Repeatable flag for multiple properties (e.g., --pid P31 --pid P279)
- --pid-list <file>: Path to file containing property IDs (one per line)
Output Options
- -o, --output <file>: Write output to file instead of stdout
- --raw: Output raw JSON to stdout (default behavior)
- --summary: Output a summary of the property(ies) with labels, descriptions, and datatype
- --transform <type>: Transform the output
  - shell: Strip all identifiers for new property creation
  - gkc_entity_profile: Convert to GKC Entity Profile (not yet implemented)
Examples
Load a single property and display summary
gkc mash pid P31 --summary
Output: JSON summary with labels, descriptions, datatype, and formatter URL
Load a single property (raw JSON)
gkc mash pid P31 --raw
Output: Raw Wikidata JSON for property P31
Load multiple properties
# Using repeatable --pid flags
gkc mash pid --pid P31 --pid P279 --pid P21
# Using a file list
echo "P31
P279
P21" > properties.txt
gkc mash pid --pid-list properties.txt
Transform to shell for new property creation
gkc mash pid P31 --transform shell -o new_property_template.json
Load Wikidata EntitySchemas by EID
gkc mash eid <EID> [options]
Load a Wikidata EntitySchema by EID and output it in various formats.
Arguments
- eid: The EntitySchema ID (e.g., E502)
Output Options
- -o, --output <file>: Write output to file instead of stdout
- --raw: Output raw JSON to stdout (default behavior)
- --summary: Output a summary of the EntitySchema with labels and descriptions
- --transform <type>: Transform the output
  - shell: Strip all identifiers for new EntitySchema creation
  - gkc_entity_profile: Convert to GKC Entity Profile
Examples
Load an EntitySchema and display summary
gkc mash eid E502 --summary
Output: JSON summary with labels, descriptions, and schema text length
Load an EntitySchema (raw JSON)
gkc mash eid E502 --raw
Output: Raw EntitySchema JSON including labels, descriptions, and ShEx schema text
Transform to GKC Entity Profile
gkc mash eid E502 --transform gkc_entity_profile -o tribe_profile.json
Converts the EntitySchema's ShEx specification into a GKC Entity Profile that can be used for data validation and transformation.
Transform to shell for reuse
gkc mash eid E502 --transform shell -o new_schema_template.json
Load Wikipedia Templates
gkc mash wp_template <TEMPLATE_NAME> [options]
Load a Wikipedia template from en.wikipedia.org and output it in various formats.
Arguments
- template_name: The Wikipedia template name (e.g., Infobox_settlement)
Output Options
- -o, --output <file>: Write output to file instead of stdout
- --raw: Output raw JSON response from the Wikimedia API
- Default (no flags): Output summary of the template with title, description, and parameter count
Examples
Load a Wikipedia template and display summary
gkc mash wp_template Infobox_settlement
Output:
{
"title": "Infobox settlement",
"description": "An infobox used to summarize information about places or geographic entities",
"param_count": 47
}
Get the full template structure
gkc mash wp_template Infobox_settlement --raw -o settlement_template.json
Output: Full JSON structure including title, description (multilingual), params, and paramOrder
Explore template parameters
gkc mash wp_template Infobox_settlement --raw | jq '.paramOrder[:10]'
Lists the first 10 parameters in order, useful for understanding template structure.
Batch Processing Patterns
Load multiple items from a file
# Create a file with QIDs
cat > items.txt <<EOF_MARKER
Q42
Q5
Q30
# Comments are ignored
Q515
EOF_MARKER
# Process all items
gkc mash qid --qid-list items.txt -o batch_items.json
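A mistyped ID in a list file is easier to fix before a batch run than after. As a minimal sketch, the hypothetical check_id_list function below (not part of GKC) scans a list file and reports lines that do not look like an ID with the expected prefix. It assumes the layout shown above, with one ID per line and # comment lines ignored. Skipping blank lines is an additional assumption.

```shell
#!/bin/sh
# check_id_list: hypothetical pre-flight check, not part of GKC.
#   $1 = list file, $2 = expected prefix letter (Q, P, or E)
# Reports malformed lines on stderr; returns non-zero if any were found.
check_id_list() {
    awk -v prefix="$2" '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        {
            id = $0
            gsub(/^[ \t]+|[ \t\r]+$/, "", id)   # trim surrounding whitespace
            if (id !~ ("^" prefix "[1-9][0-9]*$")) {
                printf "line %d: not a valid %s-ID: %s\n", NR, prefix, id > "/dev/stderr"
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demo: the lowercase "q30" is reported, so the list is flagged.
list_file=$(mktemp)
printf '%s\n' Q42 Q5 '# Comments are ignored' q30 > "$list_file"
if check_id_list "$list_file" Q; then
    echo "list OK, ready for --qid-list"
else
    echo "fix the list before running gkc"
fi
rm -f "$list_file"
```

The same function can check a property list by passing P as the second argument.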
Transform multiple items to QuickStatements
# Load multiple items and convert to QS for batch editing
gkc mash qid --qid Q42 --qid Q5 --transform qsv1 -o batch_statements.qs
Property metadata extraction
# Extract metadata for a set of properties
cat > props.txt <<EOF_MARKER
P31
P279
P21
P569
P570
EOF_MARKER
gkc mash pid --pid-list props.txt -o property_metadata.json
Output Formats
Raw JSON (default)
The raw Wikidata entity JSON as returned by the API. This format preserves all structure and identifiers, suitable for:
- Round-trip processing
- Integration with other tools
- Detailed inspection
Shell (--transform shell)
Strips all system identifiers and metadata:
- Removes: id, pageid, lastrevid, modified, ns, title
- Removes: statement IDs (GUIDs)
- Removes: all hashes from snaks, qualifiers, and references
Use this when creating templates for new entity creation on Wikidata or Wikibase instances.
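One way to confirm that a template is free of system metadata is to search it for the stripped keys. The sketch below is a crude text-level check, not a JSON-aware one, and leftover_keys is a hypothetical helper that is not part of GKC. It looks only for pageid, lastrevid, modified, and hash, because key names such as id and title also occur elsewhere in Wikidata entity JSON (for example in item-valued data values and sitelinks) and may legitimately remain.

```shell
#!/bin/sh
# leftover_keys: hypothetical helper, not part of GKC.
# Counts occurrences of keys that the shell transform is documented to
# remove. Prints nothing when none of them are present.
leftover_keys() {
    grep -oE '"(pageid|lastrevid|modified|hash)"[[:space:]]*:' "$1" \
        | tr -d ' \t:' | sort | uniq -c
}

# Demo with an illustrative fragment (not actual gkc output):
# the stray "lastrevid" key is reported with its count.
template=$(mktemp)
printf '%s\n' '{"labels": {"en": {"language": "en", "value": "Example"}}, "lastrevid": 123}' > "$template"
leftover_keys "$template"
rm -f "$template"
```

An empty result is a reasonable signal that the template is ready for editing, though it does not replace inspecting the file.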
QuickStatements V1 (--transform qsv1, items only)
Converts item data to QuickStatements V1 format for bulk operations:
- Format: QID|property|value
- Includes property labels as comments for readability
- Paste the output into the QuickStatements tool for batch editing
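Before uploading a generated file, it can help to see how many statements it contains per property. The sketch below assumes the pipe-separated QID|property|value layout described above, with one statement per line. qs_property_counts is a hypothetical helper, not part of GKC, and the sample lines are illustrative rather than actual gkc output.

```shell
#!/bin/sh
# qs_property_counts: hypothetical helper, not part of GKC.
# Prints "<property> <count>" for every property found in a pipe-separated
# QuickStatements V1 file (assumed layout: QID|property|value).
qs_property_counts() {
    awk -F '[|]' '
        NF >= 3 && $2 ~ /^P[0-9]+$/ { count[$2]++ }   # statement lines only
        END { for (p in count) print p, count[p] }
    ' "$1" | sort
}

# Demo with illustrative statements (not actual gkc output).
qs_file=$(mktemp)
printf '%s\n' \
    'Q42|P31|Q5' \
    'Q42|P106|Q36180' \
    'Q42|P106|Q28389' > "$qs_file"
qs_property_counts "$qs_file"
rm -f "$qs_file"
```

Lines that do not have a property ID in the second field are ignored, so stray text in the file does not distort the counts. String values that themselves contain a pipe character are not handled by this sketch.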
GKC Entity Profile (--transform gkc_entity_profile, EntitySchemas only)
Converts EntitySchemas to GKC Entity Profiles:
- Extracts properties and constraints from ShEx
- Creates portable profiles for validation and transformation
- Currently only implemented for EntitySchemas
Common Workflows
Creating Similar Items
- Find an exemplar item on Wikidata (e.g., Q42)
- Load with shell transform:
gkc mash qid Q42 --transform shell -o template.json
- Edit the template JSON to modify labels/values
- Submit to Wikidata using the wbeditentity API or QuickStatements
Property Documentation
# Extract metadata for all properties in a domain
gkc mash pid --pid-list biological_properties.txt -o bio_props.json
EntitySchema Development
# Load existing schema as starting point
gkc mash eid E502 --transform shell -o new_schema.json
# Or convert to profile for analysis
gkc mash eid E502 --transform gkc_entity_profile -o tribe_profile.json
Migration from Previous CLI
The mash CLI has been refactored for consistency. Key changes:
Old:
gkc mash qid Q42 --output qsv1 --new
gkc mash qid Q42 --output json
New:
gkc mash qid Q42 --transform qsv1
gkc mash qid Q42 # raw JSON is default
gkc mash qid Q42 --transform shell # for new items
Changes:
- --output now means output file path, not format
- --transform specifies the transformation type
- --new flag removed (use --transform shell or --transform qsv1 with for_new_item)
- --save-profile replaced with -o, --output
- Added support for batch loading with --qid-list, --pid-list
- Added mash pid command for properties
- Simplified mash eid command
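Scripts written against the previous CLI can be located with a text search for the retired flag usage. This is a minimal sketch, assuming each invocation sits on a single line containing gkc mash. find_legacy_mash_calls is a hypothetical helper name, not part of GKC.

```shell
#!/bin/sh
# find_legacy_mash_calls: hypothetical helper, not part of GKC.
# Lists lines (with line numbers) that still use the old flag style:
#   --output qsv1 / --output json   (a format passed to --output)
#   --new, --save-profile           (flags removed or replaced)
find_legacy_mash_calls() {
    grep -nE -e 'gkc mash .*(--output[ =](qsv1|json)|--new|--save-profile)([[:space:]]|$)' "$@"
}

# Demo: only the first line uses the old style.
script=$(mktemp)
printf '%s\n' \
    'gkc mash qid Q42 --output qsv1 --new' \
    'gkc mash qid Q42 --transform qsv1 -o out.qs' > "$script"
find_legacy_mash_calls "$script"
rm -f "$script"
```

The function returns a non-zero status when nothing matches, so it can also serve as a check that a script directory has been fully migrated.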
Related Documentation
- Mash API - Python API for programmatic use
- QuickStatements Documentation - External tool for batch operations