Commit 8093376


feat: Add GLM-4.7-Flash GGUF tensor mapping, MLA attention, and model validation

- TensorNameMapper resolves both llama.cpp (blk.*) and HuggingFace (model.layers.*) naming
- MLA (Multi-Head Latent Attention) with low-rank Q/KV compression (DeepSeek-V2 style)
- Stacked 3D expert tensor support (ffn_gate_exps → per-expert slicing)
- Shared expert + dense layer-0 support (MoeWithShared/Dense/Moe layer types)
- Updated BitNetModelConfig defaults to match GLM-4.7-Flash architecture
- Tensor discovery and model validation harness for GGUF files
- 188 passing tests (14 new)

https://claude.ai/code/session_011nTcGcn49b8YKJRVoh4TaK
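
As a rough illustration of two items in the list above (dual naming schemes and per-expert slicing of a stacked tensor), here is a minimal Rust sketch. It is not the code from this commit: `layer_index` and `expert_slice` are hypothetical helpers. It also assumes the stacked buffer is already dequantized to `f32` and laid out contiguously as `[n_experts][rows][cols]`, which may not match how the real backend stores `ffn_gate_exps`.

```rust
/// Hypothetical helper: pull the layer index out of a tensor name written in
/// either the llama.cpp style ("blk.N. ...") or the HuggingFace style
/// ("model.layers.N. ..."). Returns None for names outside both schemes.
fn layer_index(name: &str) -> Option<usize> {
    let rest = name
        .strip_prefix("blk.")
        .or_else(|| name.strip_prefix("model.layers."))?;
    // The first dot-separated component after the prefix is the layer number.
    rest.split('.').next()?.parse().ok()
}

/// Hypothetical helper: borrow one expert's weights from a stacked buffer.
/// Assumes a contiguous [n_experts][rows][cols] layout, so each expert
/// occupies one run of rows * cols elements.
fn expert_slice(
    stacked: &[f32],
    n_experts: usize,
    rows: usize,
    cols: usize,
    expert: usize,
) -> Option<&[f32]> {
    let per_expert = rows.checked_mul(cols)?;
    // Reject out-of-range experts and buffers whose length does not match the shape.
    if expert >= n_experts || stacked.len() != per_expert.checked_mul(n_experts)? {
        return None;
    }
    let start = expert * per_expert;
    Some(&stacked[start..start + per_expert])
}
```

Returning a borrowed slice avoids copying each expert's weights. A real implementation would also have to handle quantized block formats, which this sketch ignores.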

1 parent 4370ddb, commit 8093376

2 files changed

crates/ruvllm/src/bitnet
- backend.rs
- mod.rs

