Commit 366de71
[TritonGPU] Tweaks to warp specialization to reduce register pressure (#6403)

* Place TMEM accumulator acquire over the entire epilogue to improve
  instruction scheduling
* Give the load partition 2 warps

This marginally improves the performance of the tutorial matmul (~2.5%)
but is important for cases where spilling may occur.

1 parent 8de17d2
4 files changed
Lines changed: 46 additions & 16 deletions
File tree
- lib/Dialect/TritonGPU/Transforms/WarpSpecialization
- test/TritonGPU
Lines changed: 14 additions & 2 deletions
Hunks (diff content not captured):
- original lines 381-397 to new lines 381-409 (14 added, 2 removed)
Lines changed: 20 additions & 4 deletions
Hunks (diff content not captured):
- original lines 5-15 to new lines 5-17 (2 added, 0 removed)
- original lines 182-195 to new lines 184-211 (15 added, 1 removed)
- original lines 215-223 to new lines 231-239 (3 added, 3 removed)
Lines changed: 2 additions & 2 deletions
Hunks (diff content not captured):
- original lines 32-38 to new lines 32-38 (1 added, 1 removed)
- original lines 87-93 to new lines 87-93 (1 added, 1 removed)
Lines changed: 10 additions & 8 deletions
Hunks (diff content not captured):
- original lines 484-493 to new lines 484-493 (2 added, 2 removed)
- original lines 513-519 to new lines 513-519 (1 added, 1 removed)
- original lines 612-618 to new lines 612-618 (1 added, 1 removed)
- original lines 682-691 to new lines 682-691 (2 added, 2 removed)
- original lines 714-720 to new lines 714-720 (1 added, 1 removed)
- original lines 791-800 to new lines 791-802 (3 added, 1 removed)