[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #8704
| Job | Run time |
|---|---|
| | 17m 9s |
| | 10m 20s |
| | 7m 19s |
| | 7m 45s |
| | 12m 29s |
| | 8m 19s |
| | 9m 59s |
| | 10m 6s |
| | 8m 51s |
| | 8m 39s |
| Total | 1h 40m 56s |