Commit 7b2682d
2CTA Block Scale MMA with tcgen05.cp (#9460)
# 2CTA Block Scale MMA with tcgen05.cp
* **New Features**
  * Functional 2-CTA Block Scale MMA with tcgen05.cp: full path (TMA → cp → MMA → commit) for two CTAs
* **Bug Fixes**
  * TMA barrier (2CTA hang): arrive on lead CTA barrier and use .dst shared::cluster
  * TCGen05.cp (2CTA hang): use Lead CTA predicate for tcgen05.cp instruction
  * Scaled MMA 2CTA: per-CTA M/N; double M in the scale descriptor
  * TCGen05.cp: only copy per-CTA scales (block=0)
* **Tests**
  * test_mma_scaled_tcgen05_copy: 48 cases, parameters: num_ctas, multicast, block size, dtype
  * tritongpu_to_llvm_blackwell.mlir: tmem_copy cta_group::2, lead-CTA predicate
# New contributor declaration
- [x] I am not making a trivial change, such as fixing a typo in a comment.
- [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how).
- [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`.
- Select one of the following.
  - [x] I have added tests.
    - `/test` for `lit` tests
    - `/unittest` for C++ tests
    - `/python/test` for end-to-end tests
  - [ ] This PR does not need a test because `FILL THIS IN`.
- Select one of the following.
  - [ ] I have not added any `lit` tests.
  - [x] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

1 parent cad7253 commit 7b2682d
13 files changed
Lines changed: 343 additions & 84 deletions
File tree
- include/triton/Dialect/TritonNvidiaGPU/IR
- lib/Dialect/TritonNvidiaGPU
  - IR
  - Transforms
- python
  - src
  - test/gluon
  - triton/experimental/gluon/language/nvidia/blackwell
- test/Conversion
- third_party/nvidia/lib/TritonNVIDIAGPUToLLVM
  - DotOpToLLVM