Commit 4bc89dd
Fix RoPE convention for NEOX-style models in quantized_llama.rs (huggingface#3411)
quantized_llama.rs always used rope_i (interleaved RoPE), which pairs
dimensions (2i, 2i+1). This is correct for standard Llama (rope_type=0)
but wrong for NEOX-style architectures like Qwen2, Falcon, Phi, etc.
(rope_type=2), which pair (i, i+d/2).
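To make the two pairings concrete, here is a minimal sketch in plain Rust. It is not the candle kernel: `rope_pairs`, `apply_rope` and `theta_base` are hypothetical names, and the frequency `theta_base^(-2i/d)` is the usual RoPE choice, assumed here for illustration.

```rust
/// Index pairs rotated together for a head of dimension `d` (assumed even).
/// `interleaved = true` gives (2i, 2i+1); `false` gives (i, i + d/2).
fn rope_pairs(d: usize, interleaved: bool) -> Vec<(usize, usize)> {
    (0..d / 2)
        .map(|i| if interleaved { (2 * i, 2 * i + 1) } else { (i, i + d / 2) })
        .collect()
}

/// Rotate one head vector in place at sequence position `pos`.
/// Illustrative only: a real kernel works on batched tensors with cached sin/cos.
fn apply_rope(x: &mut [f32], pos: usize, theta_base: f32, interleaved: bool) {
    let d = x.len();
    for (i, (a, b)) in rope_pairs(d, interleaved).into_iter().enumerate() {
        // Pair i shares one rotation angle in both conventions.
        let freq = theta_base.powf(-2.0 * i as f32 / d as f32);
        let (sin, cos) = (pos as f32 * freq).sin_cos();
        let (xa, xb) = (x[a], x[b]);
        x[a] = xa * cos - xb * sin;
        x[b] = xa * sin + xb * cos;
    }
}
```

Both conventions use the same angles and differ only in which elements are rotated together, so applying the wrong one to the same weights gives different query and key vectors at every position except 0.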
The wrong dimension pairing corrupts attention patterns in every layer.
Over 48 layers this compounds to +11.7 logit inflation on special tokens
vs llama.cpp reference output, causing repetition loops and degenerate
text.
The fix reads general.architecture from GGUF metadata and dispatches to
rope (non-interleaved) for NEOX-style models and rope_i (interleaved)
for NORM-style models, matching llama.cpp's llama_model_rope_type().
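The dispatch could look roughly like the sketch below. It is not the code from this commit: `RopeStyle` and `rope_style_for_arch` are hypothetical names, the architecture strings are assumed GGUF identifiers for the model families named above, and falling back to NORM for unknown architectures is an assumption.

```rust
/// Which RoPE pairing a model expects (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RopeStyle {
    /// Interleaved pairs (2i, 2i+1); would call rope_i.
    Norm,
    /// Split-half pairs (i, i + d/2); would call rope.
    Neox,
}

/// Map the `general.architecture` string to a RoPE style.
/// The list is illustrative and incomplete; llama.cpp's
/// llama_model_rope_type() is the reference for the full mapping.
fn rope_style_for_arch(arch: &str) -> RopeStyle {
    match arch {
        "qwen2" | "falcon" | "phi2" | "phi3" => RopeStyle::Neox,
        // Assumption: anything else keeps the previous behaviour.
        _ => RopeStyle::Norm,
    }
}
```

Resolving the style once at model load time, rather than per forward pass, keeps the attention code path to a single branch on a stored value.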
After the fix, logits match llama.cpp to within 0.01 across all
152K vocab tokens after 48 layers, with identical top-20 rankings.
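A comparison of this kind can be sketched as follows. These are hypothetical helpers, not the script used for the numbers above: `max_abs_diff` and `top_k` are assumed names, and the logit slices are assumed to have been dumped from both implementations for the same prompt.

```rust
/// Largest absolute element-wise difference between two logit vectors.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "logit vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

/// Token indices of the `k` largest logits, highest first.
/// Ties keep the lower index first because the sort is stable.
fn top_k(logits: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..logits.len()).collect();
    idx.sort_by(|&i, &j| logits[j].total_cmp(&logits[i]));
    idx.truncate(k);
    idx
}
```

With these, the check described above amounts to `max_abs_diff(ours, reference) < 0.01` together with `top_k(ours, 20) == top_k(reference, 20)`.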
Fixes huggingface#3410
1 parent aff7c10, commit 4bc89dd
1 file changed
Lines changed: 35 additions & 3 deletions
| Original file lines | New file lines | Change |
|---|---|---|
| | 157-161 | 5 lines added |
| 178-180 | 183-188 | 3 lines removed, 6 lines added |
| | 344 | 1 line added |
| | 395-416 | 22 lines added |
| | 490 | 1 line added |