[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #8715
| Job | Run time |
|---|---|
| | 7m 25s |
| | 10m 6s |
| | 17m 4s |
| | 10m 29s |
| | 8m 15s |
| | 12m 12s |
| | 7m 30s |
| | 8m 18s |
| | 8m 58s |
| | 8m 17s |
| Total | 1h 38m 34s |