Commit bebe030
authored
add xpu tuning to CE (#645)
## Summary
Tuning on XPU: In cross-entropy, if device is `xpu`, set
`MAX_FUSED_SIZE` to `4096` instead of default `65536 // 2`. This gives
slightly better performance on xpu.
## Testing Done
<!--- This is a required section; please describe how this change was
tested. --->
<!--
Replace BLANK with your device type. For example, A100-80G-PCIe
Complete the following tasks before sending your PR, and replace `[ ]`
with
`[x]` to indicate you have done them.
-->
- Hardware Type: Intel(R) Data Center GPU Max 1550
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence1 parent bf2c67b commit bebe030
1 file changed
Lines changed: 3 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| 12 | + | |
12 | 13 | | |
13 | 14 | | |
14 | 15 | | |
| |||
59 | 60 | | |
60 | 61 | | |
61 | 62 | | |
62 | | - | |
| 63 | + | |
63 | 64 | | |
64 | 65 | | |
65 | 66 | | |
| |||
258 | 259 | | |
259 | 260 | | |
260 | 261 | | |
261 | | - | |
| 262 | + | |
262 | 263 | | |
263 | 264 | | |
264 | 265 | | |
| |||
0 commit comments