Methodology

Manual transcription checklist

The fields a human maintainer records when transcribing a value from a binary/PDF/XLSX or HTML source — source URL, publisher, exact location, displayed value, unit, period, a mandatory second pass, and why the value is acceptable. No scraping, no automated parsing.

static reference · data June 5, 2026

Many authoritative sources publish values only in PDFs or spreadsheets. The only sanctioned way to use them is for a human maintainer to read the exact value and record the 15 fields below — 13 required — with a mandatory second pass. No code parses a binary file; there is no scraping and no automation. If a value cannot be read exactly and directly, it is not added.

  • 15 checklist fields (13 required); second pass mandatory.
  • Binary/PDF/XLSX values are read by a human, not scraped or parsed by code.
  • Machine-readable at /methodology/manual-transcription/data.json.

Checklist

Field | Requirement | Required
sourceUrl | The canonical URL of the source (page or downloadable file). |
publisher | The publishing body (e.g. SIPRI, World Bank, European Commission). |
reportTitle | The exact report or file title. |
location | Page / sheet / table / cell reference, when the value is in a document. |
displayedValue | The exact value as displayed by the source (no rounding beyond the source). |
unit | The unit exactly as the source states it. |
period | The reporting period / asOf date the value refers to. |
valueFormat | How the value was read: text | table | cell | direct-download. |
secondPass | A second, independent re-read confirming the transcription. |
sourceId | The Source id this value attaches to (existing or new). |
confidence | high | medium | low, per source authority and clarity. |
caveat | The applicable caveat(s): not real-time, associative not causal, etc. |
acceptableReason | Why the value is acceptable despite the source format (e.g. exact cell directly read). |
noAutomationReason | Confirmation that the value was read manually, with no scraping or automated parsing. |
fileNote | A note on the file/screenshot if relevant (no committed screenshots unless already project convention). |
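
The checklist above can be sketched as a small completeness check on a record that a maintainer has already filled in by hand. This is an illustrative sketch, not the project's actual tooling: it never opens or parses a source file, it only inspects the human-entered fields. The function name check_record, the EXAMPLE record and its placeholder values, and the boolean representation of secondPass are all hypothetical. The checklist states that 13 of the 15 fields are required but this sketch has to guess which two are optional; it assumes location and fileNote, because their descriptions are conditional ("when the value is in a document", "if relevant").

```python
from typing import Mapping

# The 15 checklist field names, in the order the checklist lists them.
CHECKLIST_FIELDS = (
    "sourceUrl", "publisher", "reportTitle", "location", "displayedValue",
    "unit", "period", "valueFormat", "secondPass", "sourceId",
    "confidence", "caveat", "acceptableReason", "noAutomationReason",
    "fileNote",
)

# ASSUMPTION: which two fields are optional is a guess based on their
# conditional wording; the checklist only says that 13 of 15 are required.
ASSUMED_OPTIONAL = frozenset({"location", "fileNote"})

# Allowed values, taken from the valueFormat and confidence descriptions.
VALUE_FORMATS = frozenset({"text", "table", "cell", "direct-download"})
CONFIDENCE_LEVELS = frozenset({"high", "medium", "low"})


def check_record(record: Mapping[str, object]) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in record:
        if name not in CHECKLIST_FIELDS:
            problems.append(f"unknown field: {name}")
    for name in CHECKLIST_FIELDS:
        value = record.get(name)
        empty = value is None or (isinstance(value, str) and not value.strip())
        if empty and name not in ASSUMED_OPTIONAL:
            problems.append(f"missing required field: {name}")
    fmt = record.get("valueFormat")
    if fmt and fmt not in VALUE_FORMATS:
        problems.append(f"invalid valueFormat: {fmt!r}")
    level = record.get("confidence")
    if level and level not in CONFIDENCE_LEVELS:
        problems.append(f"invalid confidence: {level!r}")
    # ASSUMPTION: secondPass is stored as a boolean; False means the
    # independent re-read has not been confirmed yet.
    if record.get("secondPass") is False:
        problems.append("secondPass not confirmed")
    return problems


# Hypothetical record with placeholder values, for illustration only.
EXAMPLE = {
    "sourceUrl": "https://example.org/report.pdf",
    "publisher": "Example Publisher",
    "reportTitle": "Example Report",
    "location": "p. 12, Table 3",
    "displayedValue": "1,234.5",
    "unit": "USD million",
    "period": "2024",
    "valueFormat": "table",
    "secondPass": True,
    "sourceId": "example-source",
    "confidence": "high",
    "caveat": "not real-time",
    "acceptableReason": "exact table cell directly read",
    "noAutomationReason": "read manually by a maintainer",
    "fileNote": "",
}

print(check_record(EXAMPLE))
print(check_record({k: v for k, v in EXAMPLE.items() if k != "unit"}))
```

Returning a list of problems rather than raising on the first one lets a maintainer see every gap in a record at once. If the actual set of required fields differs from the assumption here, only ASSUMED_OPTIONAL needs to change.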

Refresh harness: /methodology/refresh-harness · source hierarchy: /methodology/source-hierarchy · machine-readable: /methodology/manual-transcription/data.json.

Related Warconomy pages