Commit 20e4392
Add PER_TOKEN_HEAD FP8 quant and P-scale to batch_prefill
Add a new FP8 quantization scheme (PER_TOKEN_HEAD, enum value 5) for the
batch_prefill FMHA kernel. Unlike PERTENSOR (single scale for all of Q/K/V)
or KV_BLOCKSCALE (per-page K/V scales), PER_TOKEN_HEAD applies fine-grained
descales:
- Q descale: per-token, per-head [total_q, nhead_q]
- K descale: per-token, per-head [num_total_pages, page_block_size, nhead_k]
- V descale: per-head [nhead_k]
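
As a rough illustration of the Q and K descales listed above, the following sketch dequantizes one head's QK scores on the host. It is not the kernel code: the function name, the plain-list layout, and the flattening of K's [page, slot] indices into a single token index are all assumptions made for the example.

```python
def dequant_qk_scores(q_vals, k_vals, q_descale, k_descale):
    """Hypothetical single-head reference, not the kernel's code path.

    q_vals:    [num_q][head_dim] quantized Q values (held here as floats)
    k_vals:    [num_k][head_dim] quantized K values (held here as floats)
    q_descale: [num_q] one descale per query token for this head
    k_descale: [num_k] one descale per key token for this head
    """
    scores = []
    for i, q_row in enumerate(q_vals):
        row = []
        for j, k_row in enumerate(k_vals):
            # Raw dot product of the quantized values.
            acc = sum(a * b for a, b in zip(q_row, k_row))
            # Row i carries the Q descale, column j carries the K descale.
            row.append(acc * q_descale[i] * k_descale[j])
        scores.append(row)
    return scores
```

Because the scale differs for every row and every column, it cannot be folded into one scalar as with PERTENSOR; each score element needs its own pair of factors.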
The dequantization of the QK dot product is staged through LDS to avoid
inflating the inner-loop instruction footprint. Cross-page tiles
(page_block_size < kN0) are supported via per-column physical page lookup,
unlike KV_BLOCKSCALE which requires page_block_size >= kN0.
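
The per-column lookup can be pictured with the sketch below, which gathers the K descale for each column of a kN0-wide tile when a tile may span several pages. The page table argument and the function name are hypothetical; only the [num_total_pages, page_block_size, nhead_k] descale layout comes from the description above.

```python
def k_descale_for_tile(k_descale, page_table, tile_start, kN0, page_block_size, head):
    """Hypothetical host-side sketch of a per-column physical page lookup.

    k_descale:  nested list shaped [num_total_pages][page_block_size][nhead_k]
    page_table: assumed mapping from logical page index to physical page index
    tile_start: logical token index of the tile's first column
    """
    scales = []
    for col in range(kN0):
        token = tile_start + col
        # Each column resolves its own page, so a tile may cross page boundaries.
        logical_page, slot = divmod(token, page_block_size)
        physical_page = page_table[logical_page]
        scales.append(k_descale[physical_page][slot][head])
    return scales
```

With page_block_size >= kN0 and page-aligned tiles, every column would resolve to the same physical page, which is the case KV_BLOCKSCALE is limited to.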
Additionally, an optional per-q-head P-scale [num_head_q] is supported.
The kernel folds log2(p_scale) into the exp2 row-max shift, so the scale
factor appears in both P and the rowsum l, cancelling in O = sum(P*V) / l
with no separate V-descale fixup needed.
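
The cancellation can be checked numerically. The sketch below handles one softmax row with scores already in the log2 domain; the function name and the example p_scale of 256 are assumptions for illustration, not values taken from the kernel.

```python
import math

def softmax_pv_row(scores_log2, v, p_scale=1.0):
    """Hypothetical one-row reference for the exp2-shift fold."""
    row_max = max(scores_log2)
    # Fold log2(p_scale) into the shift: exp2(s - row_max + log2(p_scale))
    # equals p_scale * exp2(s - row_max).
    shift = row_max - math.log2(p_scale)
    p = [2.0 ** (s - shift) for s in scores_log2]
    l = sum(p)  # the rowsum carries the same p_scale factor as P
    o = sum(pj * vj for pj, vj in zip(p, v)) / l  # p_scale cancels here
    return p, l, o

p1, l1, o1 = softmax_pv_row([0.0, 1.0, 2.0], [1.0, 2.0, 4.0])
p2, l2, o2 = softmax_pv_row([0.0, 1.0, 2.0], [1.0, 2.0, 4.0], p_scale=256.0)
assert math.isclose(o1, o2)        # output is independent of p_scale
assert math.isclose(l2, 256 * l1)  # P and l are both scaled by p_scale
```

Both P and l grow by the factor p_scale, so the ratio that forms O is unchanged and no correction is needed after the PV product.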
Also adds page_size=64 to the codegen page size list, and includes SRD
same-page-skip optimizations for K/V window rebasing.
Changes:
- block_attention_quant_scale_enum.hpp: PER_TOKEN_HEAD = 5
- quant.hpp: enum, serialize ("pth"), decode
- cpp_symbol_map.py: codegen symbol mappings
- fmha_batch_prefill.py: page_size=64, per_token_head qscale, filter update
- fmha_fwd.hpp: args struct (stride fields, p_scale_ptr), kargs forwarding
- fmha_batch_prefill_kernel.hpp: kargs struct, MakeKargs, get_scale_s,
  pipeline dispatch
- block_fmha_batch_prefill_pipeline_qr_ks_vs_async.hpp: LDS-staged dequant,
  p_scale_log2 exp2-shift fold, cross-page support, SRD same-page skip,
  PER_TOKEN_HEAD convenience overload
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent ca4930e commit 20e4392
7 files changed
Lines changed: 460 additions & 30 deletions
File tree
- projects/composablekernel
  - example/ck_tile/01_fmha
    - codegen
      - ops
  - include/ck_tile/ops/fmha
    - block
    - kernel
    - pipeline
Lines changed: 2 additions & 0 deletions
Lines changed: 7 additions & 5 deletions
Lines changed: 27 additions & 2 deletions
Lines changed: 12 additions & 5 deletions
Lines changed: 11 additions & 5 deletions
Lines changed: 119 additions & 2 deletions