Methodology

Dataset versioning vs. editorial overlay

How Warconomy decides what belongs in the versioned, frozen dataset export and what stays an editorial overlay — with the freeze steps, the risks, and the current recommendation.

Warconomy has two layers. The versioned dataset export is a byte-frozen, diffable record (current version 1.187.0): the live build must match the committed frozen payload byte for byte, so any change to it requires a deliberate version bump and re-freeze. On top sits an editorial overlay of public pages (briefings, hubs, the reader glossary, and the human-capital, macro, population and minerals layers), which iterate frequently and carry their own on-page provenance and caveats. This report explains which content sits in which layer, the trade-offs of folding the overlay into the versioned export, and the current recommendation.

  • Current dataset export version: 1.187.0.
  • Versioned = byte-frozen, diffable, typed contract.
  • Editorial overlay = fast-iterating public pages with their own provenance.
  • Recommendation: keep the overlay editorial; cut a version only for stable structured records.

What is versioned (frozen export)

  • Countries, commodities, conflicts, chokepoints, sanctions subjects, metrics
  • Observations and source-linked facts, events
  • The source registry (src/data/sources.ts)
  • The citation-engine glossary (src/data/glossary.ts)
  • Per-version frozen payloads, diffs, provenance and checksums

What is an editorial overlay

  • Briefings (src/data/briefings.ts) and the briefings hub
  • Conflict and topic hubs
  • Conflict economics 101 / reader glossary (/learn)
  • Human-capital, conflict-economies and population-structure pages + their generated snapshots
  • Critical-minerals roadmap (/minerals)
  • The free-data roadmap and this decision report

Pros of folding overlay into the versioned export

  • One canonical, versioned, diffable record for consumers and AI search.
  • Reader glossary terms and new sources become part of the typed contract and CSV/JSON exports.
  • Provenance and change history extend to the new content.

Cons / risks of folding

  • frozen-payloads.test.ts asserts the live export byte-equals the frozen current-version payload, so ANY change requires a version bump + re-freeze.
  • A bump means: update DATASET_EXPORT_VERSION, regenerate and commit the frozen payload, refresh the manifest (hash/bytes/fieldCount), and run a blanket old→new version replace across the test fixtures.
  • Editorial copy changes frequently; folding it in would force a version cut for every wording tweak.
  • Higher risk of breaking the snapshot/diff/provenance machinery during fast content iteration.
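
To make the first risk concrete, here is a minimal sketch of a byte-equality check in the spirit of what frozen-payloads.test.ts is described as asserting. The helper names firstByteDifference and assertByteEqual are hypothetical and are not taken from the repository; the real test may compare payloads differently.

```typescript
// Hypothetical helper: returns the offset of the first differing byte,
// or -1 when the two strings encode to identical UTF-8 bytes.
function firstByteDifference(live: string, frozen: string): number {
  const encoder = new TextEncoder();
  const a = encoder.encode(live);
  const b = encoder.encode(frozen);
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  // Same prefix: equal only if the lengths also match.
  return a.length === b.length ? -1 : shared;
}

// Hypothetical assertion: fails on any difference, however small,
// which is why a one-word edit would force a version bump.
function assertByteEqual(live: string, frozen: string, version: string): void {
  const offset = firstByteDifference(live, frozen);
  if (offset !== -1) {
    throw new Error(
      `Live export differs from frozen payload ${version} at byte ${offset}; ` +
        "bump the version and re-freeze.",
    );
  }
}
```

Under this kind of check, a change of a single character in any exported string is enough to fail the suite, which is the cost the bullets above describe.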

When to cut a dataset version

  • A batch of stable, structured records is ready (e.g. a reviewed set of new sources or observations) — not prose that will keep changing.
  • You explicitly want the additions in the typed contract, CSV/JSONL and citation graph.
  • You can run the full freeze steps and npm run validate:release in one deliberate 'version-cut' train.
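
The three conditions above can be read as a simple gate. The sketch below is illustrative only: the VersionCutCandidate type, its fields and shouldCutVersion are hypothetical names, not part of the Warconomy codebase.

```typescript
// Hypothetical description of a proposed version cut.
interface CandidateRecord {
  kind: "structured-record" | "editorial-prose";
  reviewed: boolean; // has the record passed review and stopped changing?
}

interface VersionCutCandidate {
  records: CandidateRecord[];
  wantedInTypedContract: boolean; // additions should appear in the contract and exports
  canRunFullFreeze: boolean; // the full freeze and release validation fit in one train
}

// Returns true only when every condition from the list above holds.
function shouldCutVersion(candidate: VersionCutCandidate): boolean {
  if (candidate.records.length === 0) return false;
  const allStable = candidate.records.every(
    (record) => record.kind === "structured-record" && record.reviewed,
  );
  return allStable && candidate.wantedInTypedContract && candidate.canRunFullFreeze;
}
```

A batch that mixes reviewed sources with even one piece of editorial prose fails the gate, which matches the rule that prose changes alone never justify a bump.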

Freeze steps (the version-cut checklist)

  • Bump DATASET_EXPORT_VERSION in src/lib/dataset.ts.
  • Add the new version to VERSION_SNAPSHOTS / FROZEN_PAYLOAD_VERSIONS and write the frozen payload body.
  • Refresh the frozen manifest (payloadHash, byteLength, fieldCount).
  • Blanket-replace the old version string across src/__tests__ fixtures.
  • Run npm run validate:snapshots, then npm run validate:release (build type-checks + full suite).
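
As an illustration of the manifest-refresh step, the sketch below derives payloadHash, byteLength and fieldCount from a payload. It rests on several assumptions that the checklist does not confirm: that the hash is SHA-256 in hex, that the payload is serialized as JSON with sorted keys, and that fieldCount means the number of top-level keys. ManifestEntry, stableStringify and buildManifestEntry are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest entry; field names follow the checklist above,
// but the real types in the repository may differ.
interface ManifestEntry {
  version: string;
  payloadHash: string;
  byteLength: number;
  fieldCount: number;
}

// Serialize with sorted object keys so the same data always yields the
// same bytes. Assumes JSON-safe values (no undefined, functions or cycles).
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const body = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

function buildManifestEntry(
  version: string,
  payload: Record<string, unknown>,
): ManifestEntry {
  const bytes = new TextEncoder().encode(stableStringify(payload));
  return {
    version,
    // Assumption: SHA-256 hex digest; the real manifest may use another hash.
    payloadHash: createHash("sha256").update(bytes).digest("hex"),
    byteLength: bytes.byteLength,
    // Assumption: fieldCount counts top-level keys of the payload.
    fieldCount: Object.keys(payload).length,
  };
}
```

Sorting keys before hashing is a design choice for this sketch: it keeps the manifest stable when object key order changes but the data does not, so the hash only moves when the content does.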

Current recommendation

KEEP the new public layers (briefings, hubs, reader glossary, human-capital / macro / population / minerals overlays) as an EDITORIAL OVERLAY for now. They iterate frequently and carry their own provenance and caveats on-page, so the cost and risk of re-freezing the versioned export on every edit is not justified. Fold a curated batch of STABLE structured records (new registry sources, or promoted observations) into the versioned export only in a dedicated, low-risk 'version-cut' train that runs the full freeze steps. Do not bump the dataset version for prose or overlay changes.

Related

See the free data sources roadmap, the version changelog, the provenance surface, and /methodology/dataset-versioning/data.json.