Commit 6f7c864
chore(release): 0.55.0 — convert/export runnable models + GPU parity reconciled + autograd training proven (PMAT-918..921) (#2218)
Version bump 0.54.0 → 0.55.0 across all workspace Cargo.toml (101 ecosystem
version pins) + Cargo.lock regen + CHANGELOG. The v0.55.0 wave = 6 merged beats:
- #2208 — degate the duckdb competitive bench behind a feature; cold merge_group
builds were intermittently failing the merge queue while PR heads were green.
- #2209 (PMAT-918) — apr convert --quantize q4k now synthesizes the tied lm_head
for tied-embedding models; the Q4K save path produced a non-runnable .apr that
failed at load with "tensor not found: lm_head.weight".
- #2210 (PMAT-919) — GPU/CPU parity gate reconciled against ground truth
(llama.cpp, per-position): fp32-Mwv is the correct Blackwell default; HwDp4a is
genuinely degraded. F2 gate now checks per-position argmax-match + min-cosine
over positions >=1, replacing the last-token-only check.
- #2211 — gate the coop_gemm_bench example behind opt-in cooperative-matrix;
wgpu 27 dropped the Vulkan cooperative-matrix path, breaking --all-targets.
- #2212 (PMAT-920) — apr export --format gguf now uses explicit head_dim for exact
num_heads, and hard-fails with an actionable error instead of silently stamping
a wrong num_heads into a valid-looking GGUF.
- #2213 (PMAT-921) — fix the transformer FFN gelu severing the autograd graph
(functional::gelu builds output via Tensor::from_vec with no grad_fn), plus a
new end-to-end train-to-loss proof that catches the severed-graph class.
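For reviewers, the per-position check described in #2210 can be sketched roughly as follows. This is a minimal illustration, not the actual F2 gate: the function names, the requirement that every position matches on argmax, and the cosine threshold are all assumptions made for the example.

```rust
/// Index of the largest logit (the first one wins on ties).
fn argmax(v: &[f32]) -> usize {
    let mut best = 0;
    for (i, &x) in v.iter().enumerate() {
        if x > v[best] {
            best = i;
        }
    }
    best
}

/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Hypothetical gate: for every position >= 1, the candidate logits must pick
/// the same argmax as the reference, and the minimum cosine over those
/// positions must reach `min_cosine` (an assumed threshold parameter).
fn parity_gate(reference: &[Vec<f32>], candidate: &[Vec<f32>], min_cosine: f32) -> bool {
    if reference.len() != candidate.len() || reference.len() < 2 {
        return false;
    }
    let mut worst = f32::INFINITY;
    // skip(1) drops position 0, so only positions >= 1 are compared.
    for (r, c) in reference.iter().zip(candidate).skip(1) {
        if r.len() != c.len() || r.is_empty() {
            return false;
        }
        if argmax(r) != argmax(c) {
            return false;
        }
        worst = worst.min(cosine(r, c));
    }
    worst >= min_cosine
}
```

A last-token-only check would pass a candidate that diverges at an earlier position and happens to agree at the end; comparing every position >= 1 rejects that case.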
Version-bump PR only. crates.io publish + git tag + GH release are deferred to a
separate human-gated step (do NOT run make publish / cargo publish from this PR).
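The hard-fail behaviour described in #2212 can be illustrated with a short sketch. It assumes num_heads is derived by dividing the query projection's output width by an explicit head_dim; the function name, parameters, and error wording are hypothetical and are not taken from the apr export code.

```rust
/// Hypothetical helper: derive num_heads from the query projection width and
/// an explicit head_dim, refusing to guess when the two are inconsistent.
fn derive_num_heads(q_proj_out_dim: usize, head_dim: usize) -> Result<usize, String> {
    if head_dim == 0 {
        return Err(
            "head_dim is 0; set an explicit head_dim in the model config before exporting"
                .to_string(),
        );
    }
    if q_proj_out_dim % head_dim != 0 {
        // Fail loudly rather than writing a rounded, wrong value into the output.
        return Err(format!(
            "q_proj width {} is not a multiple of head_dim {}; check head_dim in the model config",
            q_proj_out_dim, head_dim
        ));
    }
    Ok(q_proj_out_dim / head_dim)
}
```

Returning an error here, rather than falling back to an inferred value, means a bad head count cannot end up in a file that otherwise looks valid.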
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 53c9b54 commit 6f7c864
35 files changed
Lines changed: 234 additions & 179 deletions
File tree
- crates
- apr-cli
- aprender-bench-compute
- aprender-bench-tokenizer
- aprender-cbtop
- aprender-cgp
- aprender-compute
- aprender-contracts-cli
- aprender-contracts
- aprender-cuda-edge
- aprender-explain
- aprender-graph
- aprender-monte-carlo
- aprender-orchestrate
- aprender-profile
- aprender-qa-cli
- aprender-qa-report
- aprender-qa-runner
- aprender-rag
- aprender-registry
- aprender-serve
- aprender-shell
- aprender-simulate
- aprender-test-cli
- aprender-train-bench
- aprender-train-distill
- aprender-train-inspect
- aprender-train-lora
- aprender-train-shell
- aprender-train
- aprender-tsp
- aprender-verify-ml
- aprender-viz