Commit 8093376

feat: Add GLM-4.7-Flash GGUF tensor mapping, MLA attention, and model validation
- TensorNameMapper resolves both llama.cpp (blk.*) and HuggingFace (model.layers.*) naming
- MLA (Multi-Head Latent Attention) with low-rank Q/KV compression (DeepSeek-V2 style)
- Stacked 3D expert tensor support (ffn_gate_exps → per-expert slicing)
- Shared expert + dense layer-0 support (MoeWithShared/Dense/Moe layer types)
- Updated BitNetModelConfig defaults to match GLM-4.7-Flash architecture
- Tensor discovery and model validation harness for GGUF files
- 188 passing tests (14 new)

https://claude.ai/code/session_011nTcGcn49b8YKJRVoh4TaK
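
The per-expert slicing of a stacked 3D tensor such as ffn_gate_exps can be illustrated with a minimal sketch. This is not the commit's code: the function name `slice_expert`, its parameters, and the layout are all assumptions made for illustration. In particular it assumes the expert index is the outermost (slowest-varying) dimension of a row-major buffer, so each expert occupies one contiguous block of rows * cols values.

```python
from typing import List, Sequence


def slice_expert(
    flat: Sequence[float],
    n_experts: int,
    rows: int,
    cols: int,
    expert: int,
) -> List[List[float]]:
    """Return one expert's 2D weight matrix from a stacked, flattened tensor.

    Hypothetical helper: assumes a row-major [n_experts, rows, cols] layout,
    which may differ from the layout the actual loader handles.
    """
    per_expert = rows * cols
    if len(flat) != n_experts * per_expert:
        raise ValueError("buffer length does not match n_experts * rows * cols")
    if not 0 <= expert < n_experts:
        raise IndexError("expert index out of range")

    # Each expert is a contiguous run of rows * cols values.
    start = expert * per_expert
    block = flat[start:start + per_expert]

    # Split the contiguous run back into rows.
    return [list(block[r * cols:(r + 1) * cols]) for r in range(rows)]


# Two experts, each a 2x3 matrix, stored back to back.
stacked = list(range(2 * 2 * 3))
print(slice_expert(stacked, n_experts=2, rows=2, cols=3, expert=1))
# [[6, 7, 8], [9, 10, 11]]
```

Because each slice is a contiguous range under this assumed layout, a real loader could hand out views into the mapped file rather than copies; the sketch copies into lists only to stay dependency-free.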
1 parent 4370ddb commit 8093376

2 files changed

Lines changed: 1582 additions & 227 deletions
