Commit 14ed07e
feat: Add AD-23 Phase-1 distillation, expert cache, and DDD updates
AD-23: Phase-1 Distillation via External GPU Teacher Artifacts
- One-time GPU job produces behavioral artifacts (routing traces,
  sparse logits, preference labels), not trained weights
- CPU-only refinement: router repair, LoRA correction, EWC++, policy
  optimization using teacher artifacts
- Acceptance criteria: 200-prompt suite, all 3 behavioral gates,
  stability under 10% corpus perturbation
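
The acceptance criteria above read as a conjunction: every condition must hold before Phase-1 is accepted. A minimal sketch of that check follows. The `AcceptanceReport` type, its field names, and the `SUITE_SIZE` constant are hypothetical illustrations, not taken from the commit, and the sketch assumes the full 200-prompt suite must have been run.

```rust
/// Size of the prompt suite named in the acceptance criteria.
const SUITE_SIZE: usize = 200;

/// Hypothetical summary of one Phase-1 evaluation run.
/// Field names are illustrative and not taken from the commit.
struct AcceptanceReport {
    /// Number of prompts actually evaluated.
    prompts_evaluated: usize,
    /// Pass/fail result for each of the three behavioral gates.
    gates_passed: [bool; 3],
    /// Whether results stayed stable under the 10% corpus perturbation.
    stable_under_perturbation: bool,
}

impl AcceptanceReport {
    /// Accepts only if the full suite ran, all gates passed,
    /// and the perturbation check held.
    fn accepted(&self) -> bool {
        self.prompts_evaluated == SUITE_SIZE
            && self.gates_passed.iter().all(|&gate| gate)
            && self.stable_under_perturbation
    }
}
```

Under these assumptions a single failed gate rejects the run, even when the suite completed and the perturbation check passed.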
expert_cache.rs: MoE expert hot-set caching (new file)
- ExpertCache with LRU/LFU/Adaptive eviction policies
- MoeBatchScheduler: reorder token execution by expert for cache reuse
- Prefetcher trait for future platform-specific prefetch intrinsics
- 12 tests (92/92 bitnet tests pass)
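
To illustrate why reordering token execution by expert helps cache reuse, here is a minimal sketch of an LRU hot set combined with grouping tokens by their routed expert. It is not the code in expert_cache.rs: `LruHotSet`, `group_tokens_by_expert`, `count_misses`, and the `ExpertId` alias are hypothetical names, and the sketch assumes one routed expert per token.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical expert identifier; the real type may differ.
type ExpertId = usize;

/// Minimal LRU hot set: the front of the queue is the least recently used expert.
struct LruHotSet {
    capacity: usize,
    order: VecDeque<ExpertId>,
}

impl LruHotSet {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, order: VecDeque::with_capacity(capacity) }
    }

    /// Records an access and returns true on a hit, false on a miss.
    /// On a miss with a full set, the least recently used expert is evicted.
    fn access(&mut self, expert: ExpertId) -> bool {
        if let Some(pos) = self.order.iter().position(|&e| e == expert) {
            // Hit: move the expert to the most recently used position.
            self.order.remove(pos);
            self.order.push_back(expert);
            return true;
        }
        if self.order.len() == self.capacity {
            self.order.pop_front();
        }
        self.order.push_back(expert);
        false
    }
}

/// Groups token indices by routed expert so that each expert's tokens
/// can be executed back to back. Groups are returned in ascending expert order.
fn group_tokens_by_expert(routing: &[ExpertId]) -> Vec<(ExpertId, Vec<usize>)> {
    let mut groups: BTreeMap<ExpertId, Vec<usize>> = BTreeMap::new();
    for (token, &expert) in routing.iter().enumerate() {
        groups.entry(expert).or_default().push(token);
    }
    groups.into_iter().collect()
}

/// Counts cache misses for a sequence of expert accesses on a fresh hot set.
fn count_misses(accesses: impl IntoIterator<Item = ExpertId>, capacity: usize) -> usize {
    let mut cache = LruHotSet::new(capacity);
    accesses.into_iter().filter(|&expert| !cache.access(expert)).count()
}
```

With the routing `[0, 1, 2, 0, 1, 2]` and a hot set of capacity 2, executing tokens in their original order misses on every access, because each expert is evicted just before it is needed again. Executing the grouped order (both tokens for expert 0, then expert 1, then expert 2) misses only once per expert.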
DDD v2.5: 6 new ubiquitous language terms (Teacher Artifact, Behavioral
Distillation, Router Repair, Sparse Logits, Corpus Perturbation) and
4 new open questions (#27-30) for Phase-1 operability.
https://claude.ai/code/session_011nTcGcn49b8YKJRVoh4TaK1
parent 7d8724a
4 files changed
Lines changed: 1153 additions & 0 deletions
File tree
- crates/ruvllm/src/bitnet
- docs
- adr
- research