[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #20665
| Job | Run time |
|---|---|
| | 3m 1s |
| | 36m 11s |
| | 52m 36s |
| | 59m 58s |
| | 1h 43m 14s |
| | 51m 14s |
| | 1h 13m 48s |
| | 58m 39s |
| | 44m 36s |
| | 59m 20s |
| | 40m 43s |
| | 49m 49s |
| Total | 10h 33m 9s |