[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #8663
| Job | Run time |
|---|---|
| | 17m 26s |
| | 7m 38s |
| | 10m 53s |
| | 12m 38s |
| | 7m 43s |
| | 8m 20s |
| | 10m 13s |
| | 8m 50s |
| | 9m 3s |
| | 10m 9s |
| Total | 1h 42m 53s |