harvard-lil
diff --git a/docs/adr/2026-06-22-vintage_audience_and_metadata_only_benchmark.md b/docs/adr/2026-06-22-vintage_audience_and_metadata_only_benchmark.md
Lines changed: 105 additions & 0 deletions
diff --git a/docs/adr/README.md b/docs/adr/README.md
Lines changed: 1 addition & 0 deletions
diff --git a/test-vectors/csv-vintage-benchmark/VINTAGE-IDEAL.md b/test-vectors/csv-vintage-benchmark/VINTAGE-IDEAL.md
Lines changed: 87 additions & 0 deletions
diff --git a/test-vectors/csv-vintage-benchmark/expected-output/changelog.snap b/test-vectors/csv-vintage-benchmark/expected-output/changelog.snap
Lines changed: 24 additions & 0 deletions
diff --git a/test-vectors/csv-vintage-benchmark/expected-output/changeset.snap b/test-vectors/csv-vintage-benchmark/expected-output/changeset.snap
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,105 @@
+# The Vintage Audience: a Kept Benchmark for Metadata-Over-Data Reading
+
+**Date:** 2026-06-22
+**Status:** Accepted (benchmark landed; features deliberately deferred)
+
+## Context
+
+Binoc is tuned for one audience today: readers comparing two snapshots of the
+*same* dataset who want every change explained — every edited cell, every added
+row. Call them the **same-data** audience.
+
+There is a second, latent audience: readers comparing two **vintages** — two
+editions of the same published dataset (a yearly facilities register, a
+re-released survey). A vintage reader cares about the *shape* of the data: did a
+column appear, did a categorical vocabulary shift. They deliberately do **not**
+want to read the bulk cell/row churn — for them it is noise, possibly millions
+of rows of it.
+
+We are not ready to serve the vintage audience. The same-data experience needs
+to be bulletproof first, and inviting vintage feedback now would split our
+attention and muddy our messaging before the core is solid. But we do need confidence
+that the *engine* does not foreclose the vintage audience — that when we choose
+to open that channel, it is a matter of configuration and plugins, not a
+re-architecture. The risk we want to retire is a silent one: that some
+assumption baked into the controller, the IR, or the correspondence engine
+quietly assumes the same-data stance.
+
+## Decision
+
+Land a **kept benchmark vector**, `test-vectors/csv-vintage-benchmark`, that
+exercises the vintage stance end to end and stays green, rather than building any
+vintage feature. The vector is a two-CSV "published dataset" across two editions:
+`facilities.csv` gains a `region` column and one row's `status` moves to a new
+category value (`decommissioned`); `inspections.csv` changes only in its data
+(edited scores, appended rows). A markdown `groups` config expresses the vintage
+stance as significance — schema/vocabulary tags are the high-priority group, bulk
+cell/row tags the low-priority group.
+
+The benchmark confirms what already works: because significance is a renderer
+concern (per [2026-03-09 renderer config](2026-03-09-renderer_config.md)) and
+`classify_tags` promotes a node to the highest-priority group among its tags, the
+schema-touching `facilities.csv` floats up to the structural section while the
+pure-data `inspections.csv` sinks to the bulk section. That file-granularity
+separation is the best vintage view binoc offers today, and it is pure
+config — the type-ignorant controller (AGENTS rule 1) never participates.
+
+The vector ships two renderings side by side: `expected-output/changelog.snap`
+is the real, harness-checked engine output; `VINTAGE-IDEAL.md` is a hand-authored
+target that is *not* harness-checked. The benchmark is kept green so the gap
+between the two stays visible and measurable. The ideal names three gaps, each
+reachable without touching the controller, the IR, or the correspondence engine:
+
+1. **Within-node keep/drop.** A CSV's `region` addition and its `status` cell
+   edit are edits on one node, so the renderer cannot surface the structural
+   change while holding the cell back. The fix is a config-driven, edit-level
+   keep/drop filter in the renderer. The data path already carries
+   `EditProjection.visible`; today only writers set it. This is the smallest
+   unlock and it lives entirely in the renderer.
+
+2. **Vocabulary as a first-class change.** The `active -> decommissioned` shift
+   is reported as an ordinary `binoc.cell-change`, not as "the `status`
+   vocabulary gained a value." The fix is a plugin `EditListWriter` over
+   `tabular_v1` that diffs the distinct-value set of each categorical column and
+   emits a `binoc.vocabulary-change` edit — a plugin pack, exactly like the
+   standard library is (AGENTS rule 2).
+
+3. **Summary statistics over enumeration.** The bulk section enumerates every
+   changed cell and added row; a vintage reader wants "4 -> 6 rows, 3 cells
+   changed." The fix is the same plugin emitting an aggregate via
+   `Edit::with_summary` or a dataset-level `GlobalClaim`. The seam already
+   carries such facts — stdlib uses `with_summary` for binary string-diffs — but
+   no rule emits a tabular roll-up yet.
+
+The conclusion we are recording: the vintage-vs-same-data distinction is a
+renderer-config + plugin-pack concern, which is the architecture's whole thesis.
+The minimum to open the channel is one renderer-local filter plus one plugin
+pack. No engine surgery. The channel is provably clear, and we are choosing not
+to walk through it yet.
+
+## Alternatives Considered
+
+**Build the metadata-only filter and a sample statistics plugin now.** This is
+the natural next step and the benchmark is designed to make it cheap. We are
+deferring it for social and focus reasons, not technical ones: shipping a vintage
+surface would invite vintage feedback before the same-data experience is solid.
+The benchmark captures the design so the work is shovel-ready when we choose it.
+
+**Write the rationale as prose only, with no vector.** A document can claim the
+engine is ready; a passing benchmark proves it and keeps proving it. Without an
+executable artifact, a future change could quietly regress the vintage stance
+(e.g., bake a same-data assumption into a writer) with nothing to catch it.
+
+**Make the benchmark aspirational — hand-author the ideal as the gold file.**
+A snapshot that encodes output the engine does not produce would fail CI, forcing
+us to either disable the test (dead weight) or special-case it (harness
+complexity). Instead the harness-checked snapshot tracks reality and a separate,
+unchecked `VINTAGE-IDEAL.md` holds the target. The benchmark stays honest and
+green, and the gap is documented rather than asserted.
+
+**Promote columns (and their vocabularies) to first-class IR nodes now.** This
+would make within-node significance and vocabulary diffing fall out naturally,
+but it is a substantial IR change in service of an audience we are deliberately
+not yet serving. The benchmark shows the same outcomes are reachable with a
+plugin writer emitting tagged edits, deferring any IR commitment until the
+vintage audience is real.
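A reading aid for the Decision section of this ADR: a minimal Rust sketch of the promotion rule it describes, where a node lands in the highest-priority group that matches any of its tags. `Group`, `classify`, and `vintage_groups` are hypothetical names, not binoc's `classify_tags` API, and treating list order as priority order is an assumption. The tag names are the ones that appear in `changeset.snap`, plus the proposed `binoc.vocabulary-change`.

```rust
/// Hypothetical stand-in for one entry of the renderer's `groups` config:
/// a section title plus the tags that place a node in that section.
pub struct Group<'a> {
    pub title: &'a str,
    pub tags: &'a [&'a str],
}

/// Return the title of the highest-priority group (earliest in `groups`)
/// that shares at least one tag with the node, or `None` if no group matches.
pub fn classify<'a>(groups: &[Group<'a>], node_tags: &[&str]) -> Option<&'a str> {
    groups
        .iter()
        .find(|g| g.tags.iter().any(|t| node_tags.contains(t)))
        .map(|g| g.title)
}

/// The vintage stance from this vector: schema and vocabulary tags outrank
/// bulk cell/row tags. Group order doubles as priority order (an assumption).
pub fn vintage_groups() -> Vec<Group<'static>> {
    vec![
        Group {
            title: "Schema & vocabulary changes",
            tags: &[
                "binoc.schema-change",
                "binoc.column-addition",
                "binoc.vocabulary-change",
            ],
        },
        Group {
            title: "Bulk data updates",
            tags: &["binoc.cell-change", "binoc.row-addition"],
        },
    ]
}
```

Under these assumptions, the `facilities.csv` tag set (`binoc.cell-change`, `binoc.column-addition`, `binoc.schema-change`) classifies into the schema section even though it also carries a bulk tag, while the `inspections.csv` tag set (`binoc.cell-change`, `binoc.row-addition`) falls through to the bulk section. That is the file-granularity separation the ADR describes.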
@@ -6,6 +6,7 @@ Newer entries appear first. Each entry shows its date and current status. Create
 
 | Date | Title | Status |
 |---|---|---|
+| 2026-06-22 | [The Vintage Audience: a Kept Benchmark for Metadata-Over-Data Reading](2026-06-22-vintage_audience_and_metadata_only_benchmark.md) | Accepted (benchmark landed; features deliberately deferred) |
 | 2026-06-15 | [Tiered Artifact Metadata: Column, Table, and a `parser_metadata_v1` Artifact](2026-06-15-tiered_artifact_metadata.md) | Implemented (channels + producers in CFM-80; rendering + significance in CFM-82) |
 | 2026-06-15 | [The Engine Overhaul, Told Whole: Single-Tree to Correspondence-First](2026-06-15-engine_overhaul_retrospective.md) | Retrospective |
 | 2026-06-15 | [Partition Identities: a JIT, Format-Owned Capability for N↔M Correspondence (CFM-72)](2026-06-15-partition_identities_jit_format_capability.md) | Implemented |
 
@@ -0,0 +1,87 @@
+# Vintage benchmark — target experience
+
+This file is the north star for the *vintage* (different-edition) audience. It is
+**not** checked by the harness; it is the hand-authored target that
+`expected-output/changelog.snap` should converge toward as the vintage story
+improves. Compare the two whenever you touch tabular significance, vocabulary
+detection, or summary statistics.
+
+A vintage reader is comparing two editions of the same published dataset. They
+care about the *shape* of the data — did a column appear, did a category
+vocabulary shift — and they deliberately do **not** want to read the bulk
+cell/row churn. (This is the opposite stance from the same-data-with-edits
+reader binoc is primarily tuned for today, who wants every cell.)
+
+## What binoc renders today
+
+See `expected-output/changelog.snap`. Abbreviated:
+
+```
+## Schema & vocabulary changes
+- facilities.csv: Column added: 'region'; 1 cell changed
+  - row 2, column 'status': 'active' -> 'decommissioned'
+  - Set Headers: ...; Add Column: 'region' ...
+## Bulk data updates
+- inspections.csv: 2 rows added; 3 cells changed
+  - row 1, column 'score': '82' -> '85'
+  - ... every changed cell and added row, in full ...
+```
+
+The file-level separation is right. Three things fall short.
+
+## What great looks like
+
+```
+# Changelog: 2021 edition → 2022 edition
+
+## Schema & vocabulary changes
+- facilities.csv
+  - Column added: 'region'  (4 values: north, east, south, west)
+  - Vocabulary 'status' gained a value: 'decommissioned'
+    (now: active, inactive, decommissioned)
+
+## Bulk data updates — summarized, not enumerated
+- facilities.csv:  4 rows, 1 cell changed
+- inspections.csv: 4 → 6 rows (+2), 3 cells changed
+```
+
+## The three gaps between today and the target
+
+1. **Within-node significance / edit-level keep-drop.**
+   `facilities.csv`'s `region` addition and its `status` cell edit are edits on
+   one node, so the renderer cannot put the structural change in the top section
+   and hold the cell back. The vintage reader still sees the cell bullet.
+   *Needs:* a config-driven, edit-level keep/drop filter in the renderer (the data path
+   already has `EditProjection.visible`, but only writers set it). This is the
+   single smallest unlock and it lives entirely in the renderer — no engine or
+   IR change.
+
+2. **Vocabulary as a first-class change.**
+   `active → decommissioned` is reported as `binoc.cell-change`, not "the
+   `status` vocabulary gained a value." Columns are not first-class nodes and
+   distinct-value-set diffing does not exist.
+   *Needs:* a plugin `EditListWriter` over `tabular_v1` that computes the set of
+   distinct values per categorical column on each side and emits the set-delta
+   as a tagged edit (`binoc.vocabulary-change`). No engine change — a plugin
+   pack, exactly like the standard library is.
+
+3. **Summary statistics instead of enumeration.**
+   The bulk section dumps every changed cell and added row. A vintage reader
+   wants "4 → 6 rows, 3 cells changed."
+   *Needs:* the same plugin writer emitting an aggregate via `Edit::with_summary`
+   (or `GlobalClaim` for a dataset-level roll-up). The seam already carries such
+   facts — binoc-stdlib uses `with_summary` for binary string-diffs today; no
+   rule emits a tabular roll-up yet.
+
+## Why this benchmark exists
+
+It demonstrates that the *engine* does not foreclose the vintage audience: the
+target above is reachable with (1) one renderer-local keep/drop filter and (2)
+one plugin pack that emits vocabulary + statistic facts — no change to the
+type-ignorant controller, the IR, or the correspondence engine. The vintage vs.
+same-data distinction is a renderer-config + plugin-pack concern, which is the
+architecture's whole thesis (AGENTS rules 1 and 3).
+
+It is kept as a passing benchmark so the gap stays visible and measurable. We
+are deliberately **not** building the unlocks yet (we want to nail the
+same-data audience first), but the channel is provably clear.
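Gap 2 in `VINTAGE-IDEAL.md` asks for a set-delta over each categorical column's distinct values. The core of that computation can be sketched in plain Rust, independent of any binoc trait. `VocabularyChange` and `vocabulary_delta` are hypothetical names; how a real `EditListWriter` would read columns out of `tabular_v1` and emit a tagged edit is not shown in this diff and is not modeled here.

```rust
use std::collections::BTreeSet;

/// Set-delta of one column's distinct values between two editions.
#[derive(Debug, PartialEq)]
pub struct VocabularyChange {
    pub gained: Vec<String>,
    pub lost: Vec<String>,
}

/// Diff the distinct values of one column. `None` means the vocabulary is
/// unchanged, even if individual cells moved between existing values.
pub fn vocabulary_delta<'a>(from: &[&'a str], to: &[&'a str]) -> Option<VocabularyChange> {
    let before: BTreeSet<&str> = from.iter().copied().collect();
    let after: BTreeSet<&str> = to.iter().copied().collect();

    // BTreeSet iterates in sorted order, so the output is deterministic.
    let gained: Vec<String> = after.difference(&before).map(|v| v.to_string()).collect();
    let lost: Vec<String> = before.difference(&after).map(|v| v.to_string()).collect();

    if gained.is_empty() && lost.is_empty() {
        None
    } else {
        Some(VocabularyChange { gained, lost })
    }
}
```

For illustration, take a made-up `status` column of `active, active, inactive, active` that becomes `active, decommissioned, inactive, active` (the vector's actual CSV contents are not part of this diff). The delta is one gained value, `decommissioned`, and nothing lost, which is the fact the target rendering wants to surface in place of a cell edit.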
@@ -0,0 +1,24 @@
+---
+source: binoc-stdlib/src/test_vectors.rs
+expression: "&md"
+---
+# Changelog: snapshot-a → snapshot-b
+
+## Schema & vocabulary changes
+
+- **facilities.csv**: Column added: 'region'; 1 cell changed
+  - Changed cells
+    - row 2, column 'status': 'active' -> 'decommissioned'
+  - Set Headers: from: ["facility_id","name","status"]; to: ["facility_id","name","status","region"]
+  - Add Column: name: 'region'; values: {"total_values":4,"truncated":false,"values":["north","east","west","south"]}
+
+## Bulk data updates
+
+- **inspections.csv**: 2 rows added; 3 cells changed
+  - Changed cells
+    - row 1, column 'score': '82' -> '85'
+    - row 3, column 'score': '90' -> '91'
+    - row 4, column 'score': '68' -> '70'
+  - Rows added
+    - row 5: 'I104', 'F001', '88'
+    - row 6: 'I105', 'F002', '73'
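The within-node gap is visible in this snapshot: the `facilities.csv` entry carries the `status` cell bullet alongside the structural edits. A rough sketch of the edit-level keep/drop filter the ADR proposes follows. `RenderedEdit` is a hypothetical stand-in loosely modeled on the `EditProjection.visible` field the ADR mentions (the real type's shape is not shown in this diff), and keying the filter on verb names is an assumption.

```rust
/// Hypothetical stand-in for a projected edit: the verb from the changeset
/// plus a visibility flag the renderer consults before printing a bullet.
#[derive(Debug, Clone, PartialEq)]
pub struct RenderedEdit {
    pub verb: String,
    pub visible: bool,
}

/// Hide every edit whose verb appears in `drop_verbs`; leave the rest untouched.
pub fn apply_drop_filter(edits: &mut [RenderedEdit], drop_verbs: &[&str]) {
    for edit in edits.iter_mut() {
        if drop_verbs.contains(&edit.verb.as_str()) {
            edit.visible = false;
        }
    }
}

/// Verbs that are still visible after filtering, in their original order.
pub fn visible_verbs(edits: &[RenderedEdit]) -> Vec<&str> {
    edits
        .iter()
        .filter(|e| e.visible)
        .map(|e| e.verb.as_str())
        .collect()
}
```

Applied to the three `facilities.csv` edits with a drop list of `tabular.edit_cell` and `tabular.append_rows`, only `tabular.set_headers` and `tabular.add_column` stay visible. The edits themselves are left in place and only flagged, so a same-data reader using a different config would still see every cell.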
@@ -0,0 +1,163 @@
+---
+source: binoc-stdlib/src/test_vectors.rs
+expression: "&stable_changeset"
+---
+{
+  "from_snapshot": "snapshot-a",
+  "to_snapshot": "snapshot-b",
+  "claims": [],
+  "root": {
+    "action": "modify",
+    "item_type": "directory",
+    "path": "",
+    "children": [
+      {
+        "action": "modify",
+        "item_type": "tabular",
+        "path": "facilities.csv",
+        "sources": [
+          {
+            "path": "facilities.csv",
+            "side": "from",
+            "evidence": "binoc.pair.name",
+            "action": "modify"
+          }
+        ],
+        "summary": [
+          {
+            "text": "Column added: 'region'; 1 cell changed"
+          }
+        ],
+        "tags": [
+          "binoc.cell-change",
+          "binoc.column-addition",
+          "binoc.schema-change"
+        ],
+        "details": {
+          "edits": [
+            {
+              "params": {
+                "from": [
+                  "facility_id",
+                  "name",
+                  "status"
+                ],
+                "to": [
+                  "facility_id",
+                  "name",
+                  "status",
+                  "region"
+                ]
+              },
+              "verb": "tabular.set_headers"
+            },
+            {
+              "params": {
+                "name": "region",
+                "values": {
+                  "total_values": 4,
+                  "truncated": false,
+                  "values": [
+                    "north",
+                    "east",
+                    "west",
+                    "south"
+                  ]
+                }
+              },
+              "verb": "tabular.add_column"
+            },
+            {
+              "params": {
+                "column": "status",
+                "from": "active",
+                "row": 1,
+                "to": "decommissioned"
+              },
+              "verb": "tabular.edit_cell"
+            }
+          ]
+        }
+      },
+      {
+        "action": "modify",
+        "item_type": "tabular",
+        "path": "inspections.csv",
+        "sources": [
+          {
+            "path": "inspections.csv",
+            "side": "from",
+            "evidence": "binoc.pair.name",
+            "action": "modify"
+          }
+        ],
+        "summary": [
+          {
+            "text": "2 rows added; 3 cells changed"
+          }
+        ],
+        "tags": [
+          "binoc.cell-change",
+          "binoc.row-addition"
+        ],
+        "details": {
+          "edits": [
+            {
+              "params": {
+                "column": "score",
+                "from": "82",
+                "row": 0,
+                "to": "85"
+              },
+              "verb": "tabular.edit_cell"
+            },
+            {
+              "params": {
+                "column": "score",
+                "from": "90",
+                "row": 2,
+                "to": "91"
+              },
+              "verb": "tabular.edit_cell"
+            },
+            {
+              "params": {
+                "column": "score",
+                "from": "68",
+                "row": 3,
+                "to": "70"
+              },
+              "verb": "tabular.edit_cell"
+            },
+            {
+              "params": {
+                "rows": [
+                  {
+                    "total_values": 3,
+                    "truncated": false,
+                    "values": [
+                      "I104",
+                      "F001",
+                      "88"
+                    ]
+                  },
+                  {
+                    "total_values": 3,
+                    "truncated": false,
+                    "values": [
+                      "I105",
+                      "F002",
+                      "73"
+                    ]
+                  }
+                ],
+                "start": 4
+              },
+              "verb": "tabular.append_rows"
+            }
+          ]
+        }
+      }
+    ]
+  }
+}
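The edit lists in this changeset already contain everything needed for the roll-up the ADR's third gap asks for. A minimal sketch, assuming a simplified `Edit` enum that models only the verbs seen above and a caller that supplies the prior row count. `Edit` and `summarize` are hypothetical names and do not use binoc's `Edit::with_summary` or `GlobalClaim`.

```rust
/// Simplified model of the tabular verbs that appear in this changeset.
pub enum Edit {
    /// One `tabular.edit_cell`.
    EditCell,
    /// One `tabular.append_rows`, carrying the number of rows appended.
    AppendRows(usize),
    /// Any other verb (`tabular.set_headers`, `tabular.add_column`, ...).
    Other,
}

/// Build a one-line roll-up such as "4 -> 6 rows (+2), 3 cells changed".
pub fn summarize(edits: &[Edit], rows_before: usize) -> String {
    let cells = edits.iter().filter(|e| matches!(e, Edit::EditCell)).count();
    let added: usize = edits
        .iter()
        .map(|e| match e {
            Edit::AppendRows(count) => *count,
            _ => 0,
        })
        .sum();

    let rows = if added == 0 {
        format!("{} rows", rows_before)
    } else {
        format!("{} -> {} rows (+{})", rows_before, rows_before + added, added)
    };
    let noun = if cells == 1 { "cell" } else { "cells" };
    format!("{}, {} {} changed", rows, cells, noun)
}
```

Fed the `inspections.csv` edits (three cell edits and one append of two rows, starting from 4 rows), this yields `4 -> 6 rows (+2), 3 cells changed`. The `facilities.csv` edits yield `4 rows, 1 cell changed`. Both match the summarized lines in `VINTAGE-IDEAL.md` apart from the arrow glyph.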