Commit b52a15e

ruvnet and claude-flow committed
chore(release): bump workspace to 2.3.0
Covers the ruvllm GPU optimization sweep (ADR-258 + post-merge):

- RDT / OpenMythos model (PR #589)
- Vectorized ACT halting: 4-21× GPU prefill speedup
- candle 0.9 + cudarc 0.19 (CUDA 13.0 native, RTX 5080 / SM 12.0)
- KV cache pre-allocation (GqaPrealloc, MlaPrealloc, RdtKvCache::Prealloc)
- On-device argmax (128KB→4B), GPU top-k sort (128KB→320B)
- Fused ACT CUDA kernel via nvrtc + zero-copy tensor pointer path
- True per-token streaming, RDT generate_sampled

Co-Authored-By: claude-flow <ruv@ruv.net>
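
The transfer sizes quoted for on-device argmax and GPU top-k can be sanity-checked with a little arithmetic. The sketch below is illustrative only and does not come from the ruvllm code. It assumes a 32,768-entry vocabulary of f32 logits (which gives 128 KiB per row), a single u32 token index for argmax, and k = 40 pairs of (u32 index, f32 logit) for top-k. None of these values is stated in the commit message; they are one combination that reproduces the quoted figures. The function names are hypothetical.

```rust
use std::mem::size_of;

/// Bytes copied device-to-host when the whole logits row is sampled on the CPU.
/// Assumes f32 logits (an assumption, not stated in the commit).
fn full_logits_bytes(vocab_size: usize) -> usize {
    vocab_size * size_of::<f32>()
}

/// Bytes copied when argmax runs on the device and only the winning
/// token index (assumed to be a u32) is returned to the host.
fn argmax_bytes() -> usize {
    size_of::<u32>()
}

/// Bytes copied when top-k runs on the device and returns k
/// (u32 index, f32 logit) pairs. The pair layout is an assumption.
fn top_k_bytes(k: usize) -> usize {
    k * (size_of::<u32>() + size_of::<f32>())
}

fn main() {
    let vocab_size = 32 * 1024; // assumed vocabulary size
    let k = 40; // assumed top-k width

    println!("full logits row: {} bytes", full_logits_bytes(vocab_size));
    println!("on-device argmax: {} bytes", argmax_bytes());
    println!("on-device top-k: {} bytes", top_k_bytes(k));
}
```

Under these assumptions the per-token copy shrinks from 131,072 bytes to 4 bytes for greedy decoding and to 320 bytes for top-k sampling, matching the 128KB→4B and 128KB→320B figures above.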
1 parent 920d8cc commit b52a15e

2 files changed

Lines changed: 132 additions & 132 deletions
