Commit c065a95
feat: Implement BitNet inference stack — TL1 kernel, backend, GGUF export, RLM refiner
Phase 0 + 0.5 implementation (4,283 lines across 6 new files):
- tl1_kernel.rs (879L): TL1 ternary GEMV with NEON SIMD + scalar fallback,
  INT8 activation quantization (absmax), LUT generation, 17 tests
- backend.rs (1,179L): Full BitNetBackend implementing LlmBackend trait,
  GGUF model loading, MoE router (softmax gate + top-K), expert FFN
  (SwiGLU via TL1 GEMV), RMSNorm, embedding/LM head, 12 tests
- gguf_export.rs (662L): GGUF v3 writer for BITNET_T158, FP16 conversion,
  model export with BitNet metadata, validation, 8 tests
- rlm_refiner.rs (696L): Phase 0.5 orchestrator wiring MicroLoRA + EWC++ +
  GRPO + ContrastiveTrainer, SIMD-only mode (AD-20), checkpointing, 10 tests
- tl1_avx2.rs (414L): AVX2 SIMD kernel variant (x86_64 conditional)
- tl1_wasm.rs (453L): WASM SIMD128 kernel variant (wasm32 conditional)
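
For readers unfamiliar with the kernel bullet above, the following is a minimal scalar sketch of absmax INT8 activation quantization followed by a ternary matrix-vector product. It is not the code in tl1_kernel.rs: the function names `quantize_absmax` and `ternary_gemv`, the row-major `i8` weight layout, and the single per-tensor weight scale are all assumptions made for illustration, and the lookup-table (LUT) step that TL1 uses is replaced here by plain add/subtract.

```rust
/// Quantize activations to INT8 using absmax scaling (illustrative helper).
/// Returns the quantized values and a scale such that x[i] ≈ q[i] * scale.
fn quantize_absmax(x: &[f32]) -> (Vec<i8>, f32) {
    let absmax = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    if absmax == 0.0 {
        // All-zero input: any scale works, so pick 1.0.
        return (vec![0; x.len()], 1.0);
    }
    let scale = absmax / 127.0;
    let q = x
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Scalar ternary GEMV: y = W * x with W in {-1, 0, +1}.
/// `w` is row-major with shape rows x cols (an assumed layout).
/// Accumulation is done in i32, then rescaled to f32 once per row.
fn ternary_gemv(
    w: &[i8],
    rows: usize,
    cols: usize,
    q: &[i8],
    act_scale: f32,
    w_scale: f32,
) -> Vec<f32> {
    assert_eq!(w.len(), rows * cols);
    assert_eq!(q.len(), cols);
    (0..rows)
        .map(|r| {
            let mut acc: i32 = 0;
            for c in 0..cols {
                // Ternary weights need no multiply: add, subtract, or skip.
                match w[r * cols + c] {
                    1 => acc += q[c] as i32,
                    -1 => acc -= q[c] as i32,
                    _ => {}
                }
            }
            acc as f32 * act_scale * w_scale
        })
        .collect()
}
```

Keeping the accumulator in `i32` and applying both scales once per output row is the usual reason for quantizing activations first: the inner loop stays integer-only, which is the part a SIMD variant would vectorize.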
All 72 bitnet tests pass. Fixed 2 pre-existing compilation errors in
autodetect.rs and kernels/mod.rs.
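
The "softmax gate + top-K" router mentioned for backend.rs can be sketched as follows. This is an illustration under stated assumptions rather than the committed implementation: `softmax` and `route_top_k` are hypothetical names, and renormalizing the selected weights so they sum to 1 is a common choice that the commit message does not confirm.

```rust
/// Numerically stable softmax over router logits (illustrative helper).
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Pick the k experts with the highest gate probability.
/// Returns (expert_index, weight) pairs, highest weight first.
/// Assumption: weights are renormalized over the selected experts.
fn route_top_k(logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let probs = softmax(logits);
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    // Sort expert indices by descending probability.
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    idx.truncate(k.min(probs.len()));
    let total: f32 = idx.iter().map(|&i| probs[i]).sum();
    idx.into_iter().map(|i| (i, probs[i] / total)).collect()
}
```

With logits `[0.0, 2.0, 1.0, -1.0]` and `k = 2`, the sketch selects experts 1 and 2; each expert's FFN output would then be scaled by its returned weight and summed.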
https://claude.ai/code/session_011nTcGcn49b8YKJRVoh4TaK1
parent 2933904, commit c065a95
10 files changed
Lines changed: 4307 additions & 12 deletions
File tree
- crates/ruvllm
- src
- bitnet
- kernels
Diff hunk (file name and line content not preserved): @@ -115,6 +115,7 @@, one line added at new line 118.
Diff hunk (file name and line content not preserved): @@ -432,16 +432,8 @@, original lines 435-444 replaced by new lines 435-436.