Post-merge trust fixes: .gitattributes for SHA pinning, cells.jsonl drift, README/AUDIT consistency #19
Merged
Lightheartdevs merged 1 commit on May 17, 2026
Five concrete reviewer nits flagged on PR #18 post-merge. Each is a real trust hit; all five are fixed here.

1. **.gitattributes added at repo root.** Without explicit rules, Windows reproducers checking out this repo with the default core.autocrlf=true silently converted JSONL line endings LF→CRLF on checkout, which broke the published SHA on workloads/prompts.jsonl ("the file the reviewer downloaded didn't match the SHA we published"). New .gitattributes pins *.jsonl, power.csv, thermals.csv, and *.log as `binary` (no transformation ever; SHAs match across platforms), forces LF on *.md/*.yaml/*.json/*.csv/*.sh/*.py and other text formats.

2. **prompts.jsonl.sha256 fixed to sha256sum --check format.** The file previously contained just the bare hash, which fails `sha256sum --check prompts.jsonl.sha256` with "no properly formatted checksum lines found". Updated to standard `<hash> <filename>` form so reproducers can verify directly. Applied to all four sha256 files: workloads/prompts.jsonl.sha256, harness/workloads/prompts.jsonl.sha256, harness/workloads/smoke-prompts.jsonl.sha256, and the bundle-level workloads/prompts.jsonl.sha256.

3. **cells.jsonl drift fixed.** The v1 aggregate had 35 rows for tower2/qwen3.6-27b/cuda while the filesystem (and manifest.json) said 36. The missing cell was ctx32768_gen2048_conc8, the canonical engine-bound timeout cell where only 1 of 8 slots completed per batch. The harness aggregator silently drops rows with per_slot_decode_tps_mean=null, which is exactly what that cell produced. Backfilled the row with the available aggregate numbers from cell.json (aggregate_decode_tps_mean=0.87, batch_wall_s_mean=1200.10, cold_start_decode_tps=1.52), power/thermal stats derived from the cell's power.csv (gpu0 filter, n=11419) and thermals.csv (gpu0 sensor, n=10861), per_slot fields explicitly null, and a `notes` field explaining why per_slot is null and how the row was reconstructed. manifest.json's 36-cell count now matches aggregate; filesystem reality preserved.

4. **README/AUDIT.md drift on llama-server log publication fixed.** README.md said llama-server debug logs are NOT included (correct); AUDIT.md said they ARE published in the reproducibility bundle and listed them as a per-cell artifact. Resolved to README's position throughout AUDIT.md, with a new explicit "NOT included in the bundle (regeneratable)" section listing the per-cell llama-server log and the per-host build-*.log files, both regeneratable from the pinned source SHA in harness/VENDORED-FROM-SHA.txt.

5. **harness/README.md + harness/AUDIT.md de-drifted.** harness/README.md still described Strix Halo as "ROCm canonical" with the grid "running twice (ROCm and Vulkan)", pre-bug-discovery framing that contradicted the bundle's findings.md (Vulkan is the canonical, working path; ROCm 6.4.4 segfaulted, see Finding 1). Updated the harness/README hosts table, the workload-size estimate, and the engines/ inventory comments to reflect Vulkan-canonical reality. harness/AUDIT.md was a full duplicate copy of the upstream bench-fleet AUDIT, with several pre-curation internal references (task #16, #20, #21, #37; "the user"; targets.json.broken_rocm_finding) that were sanitized in the bundle-level ../AUDIT.md during the curation pass but not propagated. Replaced harness/AUDIT.md with a one-paragraph pointer to ../AUDIT.md as single source of truth, so future curation passes have one file to update instead of two that can drift.

Net effect: the bundle is consistent (manifest ↔ aggregate ↔ filesystem), the SHA pin is platform-neutral (Linux + Windows + macOS reproducers all get the same bytes), the README and the AUDIT agree on what's in the bundle and what isn't, and the harness docs reflect the actually-running configuration instead of the pre-bug plan.

This commit adds no new bench data. The next round (MMBT Phase B Q8 quality companion, full sustained-thermal tier, PTX-JIT SOFT_MAX retry) ships as a separate PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why
Post-merge audit of #18 flagged five concrete trust hits. Each is a real reproducibility / consistency issue. All five fixed here.
The five fixes
1. `.gitattributes` added at repo root

Without explicit rules, Windows reproducers checking out this repo with `core.autocrlf=true` (the default in Git for Windows installs) silently convert JSONL line endings LF→CRLF on checkout, breaking the published SHA on `workloads/prompts.jsonl`. The reviewer's downloaded file genuinely didn't match the published hash. Now pinned:

- `*.jsonl`, `power.csv`, `thermals.csv`, `*.log` → `binary` (no transformation ever; SHAs match across platforms)
- `*.md`, `*.yaml`, `*.json`, `*.csv` (other than power/thermals), `*.sh`, `*.py`, `*.ts`, `*.sha256` → `text eol=lf`

2. `prompts.jsonl.sha256` in `sha256sum --check` format

Previously just the bare hash (`9a27e...`). Now standard `<hash> <filename>` form so `sha256sum --check prompts.jsonl.sha256` works directly. Applied to all four sha256 files in the bundle + harness.

3. `cells.jsonl` drift fixed

The v1 `aggregate/cells.jsonl` had 35 rows for `tower2/qwen3.6-27b/cuda` while filesystem + `manifest.json` both said 36. Missing: `ctx32768_gen2048_conc8`, the engine-bound timeout cell where only 1 of 8 slots completed per batch. The harness aggregator silently drops rows with `per_slot_decode_tps_mean=null`, which is exactly what that cell produced.

Row backfilled with:

- `aggregate_decode_tps_mean=0.87` and `batch_wall_s_mean=1200.10` from `cell.json`
- `cold_start_decode_tps=1.52`, `cold_start_wall_s=1200.22`
- `power_w_silicon_mean=183.15`, `power_w_silicon_max=390.79` derived from this cell's `power.csv` (gpu0 filter, n=11419 samples)
- `temp_c_max=57.0` from this cell's `thermals.csv` (gpu0 sensor)
- `per_slot_*` fields explicitly `null`
- `notes` field explaining the null + the reconstruction method

After fix: `cells.jsonl` 36 rows ↔ filesystem 36 dirs ↔ `manifest.json` 36 cells, internally consistent.

4. README/AUDIT.md drift on `llama-server-*.log` publication resolved

`README.md` said the per-cell `llama-server-*.log` debug logs are not included (correct: they're 40 MB across 251 files and excluded for bundle size). `AUDIT.md` said they are published in the reproducibility bundle and listed them as a per-cell artifact. Resolved AUDIT.md to README's position throughout, including a new explicit "NOT included in the bundle (regeneratable)" section.

5. Harness docs de-drifted

`harness/README.md` still described Strix Halo as "ROCm canonical" with the grid "running twice (ROCm and Vulkan)", pre-bug-discovery framing that contradicted the bundle's `findings.md` (Vulkan is canonical, ROCm 6.4.4 segfaulted; see Finding 1). Updated the hosts table, workload-size estimate, and `engines/` inventory comments to reflect Vulkan-canonical reality.

`harness/AUDIT.md` was a full duplicate copy of the upstream bench-fleet AUDIT, with several pre-curation internal references (task #16, #20, #21, #37; "the user"; `targets.json.broken_rocm_finding`) that were sanitized in the bundle-level `../AUDIT.md` during PR #18 but not propagated to the harness copy. Replaced with a single-paragraph pointer to `../AUDIT.md` as single source of truth, so future curation passes have one file to update instead of two that can drift.

What this PR is NOT
No new bench data. The next round (MMBT Phase B Q8 quality companion, full 30-min sustained-thermal tier, PTX-JIT SOFT_MAX retry on Tower2 35B-A3B native CUDA) ships as a separate PR.
Verification
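As a rough illustration of why fixes 1 and 2 matter, the sketch below shows that a CRLF-converted copy of a JSONL file no longer matches a digest taken over the LF bytes, and that a bare hash is not a parseable checksum line. The payload, the regex, and the helper names `parse_check_line` and `matches` are assumptions made for this example; they are not taken from the repo, and the regex covers only the default `sha256sum` output line, not its BSD-style or escaped variants.

```python
import hashlib
import re

# Default `sha256sum` line: 64 hex chars, a space, then a space (text mode)
# or "*" (binary mode), then the filename. Other variants are not handled here.
CHECK_LINE = re.compile(r"^([0-9a-f]{64}) [ *](.+)$")


def parse_check_line(line: str):
    """Return (digest, filename), or None for lines such as a bare hash."""
    m = CHECK_LINE.match(line.rstrip("\n"))
    return (m.group(1), m.group(2)) if m else None


def matches(data: bytes, line: str) -> bool:
    """True if the line parses and its digest equals sha256(data)."""
    parsed = parse_check_line(line)
    return parsed is not None and hashlib.sha256(data).hexdigest() == parsed[0]


# Hypothetical two-record JSONL payload, NOT the real prompts.jsonl.
lf_bytes = b'{"id": 1}\n{"id": 2}\n'
# What an autocrlf=true checkout would hand a reproducer instead.
crlf_bytes = lf_bytes.replace(b"\n", b"\r\n")

digest = hashlib.sha256(lf_bytes).hexdigest()
good_line = f"{digest}  prompts.jsonl"  # standard "<hash>  <filename>" form
bare_line = digest                      # the pre-fix file content shape

print(parse_check_line(bare_line))     # bare hash: not a checksum line
print(matches(lf_bytes, good_line))    # LF bytes verify
print(matches(crlf_bytes, good_line))  # CRLF bytes do not
```

On a real checkout, `sha256sum --check prompts.jsonl.sha256` remains the direct check; the sketch only makes the two failure modes visible without needing a Windows machine.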
Test plan
- `git clone` and verify `sha256sum --check prompts.jsonl.sha256` passes (no CRLF conversion)
- Check the `ctx32768_gen2048_conc8` row in `aggregate/cells.jsonl` against `tower2/qwen3.6-27b/cuda/ctx32768_gen2048_conc8/cell.json`
- `AUDIT.md` no longer claims llama-server logs are published
- `harness/README.md` describes Vulkan as Strix canonical
- `harness/AUDIT.md` is a pointer stub to `../AUDIT.md`

🤖 Generated with Claude Code
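The `cells.jsonl` drift in fix 3 comes down to a filter that silently drops rows with a null per-slot mean. The sketch below contrasts that behaviour with one that keeps such rows and annotates them, plus a count check across aggregate, manifest, and filesystem. It is not the harness aggregator's actual code: the function names, the cell name `ctx4096_gen512_conc1`, and the value 20.0 are hypothetical. Only the field names and `aggregate_decode_tps_mean=0.87` come from the PR text.

```python
import json


def aggregate_dropping_nulls(cells):
    """Mimics the described bug: rows with a null per-slot mean vanish silently."""
    return [c for c in cells if c.get("per_slot_decode_tps_mean") is not None]


def aggregate_keeping_nulls(cells):
    """Keeps every cell; rows with a null per-slot mean get a `notes` field."""
    rows = []
    for c in cells:
        row = dict(c)
        if row.get("per_slot_decode_tps_mean") is None:
            row.setdefault("notes", "per_slot fields null: not all slots completed")
        rows.append(row)
    return rows


def check_counts(rows, manifest_cells, cell_dirs):
    """Raise if aggregate rows, manifest count, and directory count disagree."""
    counts = {
        "aggregate": len(rows),
        "manifest": manifest_cells,
        "filesystem": len(cell_dirs),
    }
    if len(set(counts.values())) != 1:
        raise ValueError(f"cell count drift: {counts}")
    return counts


# Hypothetical two-cell grid standing in for the real 36-cell one.
cells = [
    {"cell": "ctx4096_gen512_conc1",  # hypothetical cell name and values
     "per_slot_decode_tps_mean": 20.0, "aggregate_decode_tps_mean": 20.0},
    {"cell": "ctx32768_gen2048_conc8",
     "per_slot_decode_tps_mean": None, "aggregate_decode_tps_mean": 0.87},
]
dirs = [c["cell"] for c in cells]

kept = aggregate_keeping_nulls(cells)
print(check_counts(kept, manifest_cells=2, cell_dirs=dirs))
try:
    check_counts(aggregate_dropping_nulls(cells), manifest_cells=2, cell_dirs=dirs)
except ValueError as err:
    print(err)  # the dropped row shows up as drift instead of passing silently
print(json.dumps(kept[1]))
```

Running a count check like this after aggregation would turn a silently missing row into a hard failure, which is the property the manifest ↔ aggregate ↔ filesystem comparison relies on.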