Commit c065a95

feat: Implement BitNet inference stack — TL1 kernel, backend, GGUF export, RLM refiner
Phase 0 + 0.5 implementation (4,283 lines across 6 new files):

- tl1_kernel.rs (879L): TL1 ternary GEMV with NEON SIMD + scalar fallback, INT8 activation quantization (absmax), LUT generation, 17 tests
- backend.rs (1,179L): Full BitNetBackend implementing LlmBackend trait, GGUF model loading, MoE router (softmax gate + top-K), expert FFN (SwiGLU via TL1 GEMV), RMSNorm, embedding/LM head, 12 tests
- gguf_export.rs (662L): GGUF v3 writer for BITNET_T158, FP16 conversion, model export with BitNet metadata, validation, 8 tests
- rlm_refiner.rs (696L): Phase 0.5 orchestrator wiring MicroLoRA + EWC++ + GRPO + ContrastiveTrainer, SIMD-only mode (AD-20), checkpointing, 10 tests
- tl1_avx2.rs (414L): AVX2 SIMD kernel variant (x86_64 conditional)
- tl1_wasm.rs (453L): WASM SIMD128 kernel variant (wasm32 conditional)

All 72 bitnet tests pass. Fixed 2 pre-existing compilation errors in autodetect.rs and kernels/mod.rs.

https://claude.ai/code/session_011nTcGcn49b8YKJRVoh4TaK
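
For readers unfamiliar with the terms in the commit message, here is a minimal scalar sketch of absmax INT8 activation quantization followed by a ternary GEMV. It is not the code from tl1_kernel.rs: the function names, the row-major unpacked `i8` weight layout, and the single per-tensor `weight_scale` are assumptions made for illustration, and the TL1 lookup-table scheme and SIMD paths are deliberately left out.

```rust
/// Quantize activations to INT8 with absmax scaling (illustrative helper).
/// Returns the quantized values and a scale such that x[i] ≈ q[i] * scale.
pub fn quantize_absmax(x: &[f32]) -> (Vec<i8>, f32) {
    let absmax = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    if absmax == 0.0 {
        // All-zero input: any scale works, use 1.0 to avoid dividing by zero.
        return (vec![0; x.len()], 1.0);
    }
    let scale = absmax / 127.0;
    let q = x
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Scalar ternary GEMV (illustrative helper).
/// `weights` is row-major, rows x cols, with entries assumed to be -1, 0 or +1.
/// Because the weights are ternary, each product reduces to an add, a subtract,
/// or a skip; the integer accumulator is rescaled to f32 once per row.
pub fn ternary_gemv(
    weights: &[i8],
    rows: usize,
    cols: usize,
    q: &[i8],
    act_scale: f32,
    weight_scale: f32,
) -> Vec<f32> {
    assert!(cols > 0, "cols must be non-zero");
    assert_eq!(weights.len(), rows * cols);
    assert_eq!(q.len(), cols);
    weights
        .chunks_exact(cols)
        .map(|row| {
            let mut acc: i32 = 0;
            for (&w, &a) in row.iter().zip(q) {
                match w {
                    1 => acc += a as i32,
                    -1 => acc -= a as i32,
                    _ => {} // zero weight contributes nothing
                }
            }
            acc as f32 * act_scale * weight_scale
        })
        .collect()
}
```

Calling `quantize_absmax` on an activation vector and passing the result to `ternary_gemv` gives one output value per weight row. A production kernel would pack the ternary weights and vectorize the inner loop, which this sketch does not attempt.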
1 parent 2933904 commit c065a95

10 files changed

Lines changed: 4307 additions & 12 deletions

crates/ruvllm/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -115,6 +115,7 @@ async-runtime = ["tokio", "tokio-stream"]
 # Minimal build without inference (for embedding/library use only)
 minimal = ["async-runtime"]
 wasm = []
+wasm-simd = []
 
 # Ruvector integration features
 attention = ["dep:ruvector-attention"]
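
The new `wasm-simd` feature has no dependencies, so on its own it only sets a compile-time flag. The sketch below shows one way such a flag could be consulted. The `selected_kernel` function and its return labels are hypothetical, and how the crate actually gates tl1_wasm.rs is not visible in this hunk.

```rust
/// Hypothetical helper: reports which kernel variant this build would pick.
/// `cfg!` evaluates to a compile-time boolean, so the choice is fixed per build.
pub fn selected_kernel() -> &'static str {
    if cfg!(all(target_arch = "wasm32", feature = "wasm-simd")) {
        "wasm-simd128"
    } else if cfg!(target_arch = "aarch64") {
        "neon"
    } else if cfg!(target_arch = "x86_64") {
        "x86_64"
    } else {
        "scalar"
    }
}
```

Because `cfg!` keeps every branch in the compiled source, code that calls architecture-specific intrinsics would normally sit behind `#[cfg(...)]` attributes on modules or functions instead, so that it is removed entirely on other targets.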

crates/ruvllm/src/autodetect.rs

Lines changed: 2 additions & 10 deletions
@@ -432,16 +432,8 @@ impl GpuCapabilities {
             return Self::detect_webgpu();
         }
 
-        #[cfg(not(any(
-            target_os = "macos",
-            target_os = "ios",
-            target_os = "linux",
-            target_os = "windows",
-            target_arch = "wasm32"
-        )))]
-        {
-            None
-        }
+        #[allow(unreachable_code)]
+        None
     }
 
     /// Detect Metal GPU capabilities
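
The pattern this hunk moves to can be shown in isolation: each platform gets a `#[cfg]`-gated block that returns early, and the function ends with an unconditional `None` as its tail expression. The sketch below is a standalone illustration, not the body of `GpuCapabilities`. The `detect_backend` name, the string labels, and the grouping of platforms are assumptions.

```rust
/// Hypothetical stand-in for a platform-dispatching detection function.
pub fn detect_backend() -> Option<&'static str> {
    #[cfg(any(target_os = "macos", target_os = "ios"))]
    {
        return Some("metal");
    }

    #[cfg(any(target_os = "linux", target_os = "windows"))]
    {
        return Some("vulkan");
    }

    #[cfg(target_arch = "wasm32")]
    {
        return Some("webgpu");
    }

    // On any target matched above, this line is never reached; the attribute
    // silences the resulting lint. On every other target it is the result.
    #[allow(unreachable_code)]
    None
}
```

With this shape the function always has a tail expression of the right type, whichever `cfg` blocks survive, and the fallback no longer needs a hand-maintained list of excluded targets that must stay in sync with the branches above it.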