Data Distillery

The Data Distillery project is an experimental framework for data cleaning, integration, and enhancement workflows geared toward data and information contributed to the Global Knowledge Commons. That's the term of art I've adopted for the suite of open knowledge assets maintained by all of us in the commons: Wikidata and other parts of the Wikiverse, OpenStreetMap, and a handful of other platforms and helper services like ORCID. I'm writing some new software that I'll post about here shortly.

Global Knowledge Commons Python package

I've started porting the work I used to do in Wikidata and Wikibase instances into a larger framework that supports a more robust set of data processing steps and contributions across Wikimedia projects and OpenStreetMap. I've called it gkc and started building its architecture on the data distillery theme. Read the docs here.

Data Curator Narratives

I've started an initial narrative about the bigger-picture work I'm trying to achieve and how I'd like the gkc Python package, and perhaps a larger Data Distillery capability built on it, to help me do work in the Commons. I'm using these narratives to help ground and guide the development process.

Federally recognized Tribal Governments in the U.S.