[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #10427

Triggered via pull request February 12, 2026 07:16
Status Cancelled
Total duration 5m 4s
build_x64_release_xnnpack
2m 32s

Annotations

3 errors
build_x64_release_xnnpack
Canceling since a higher priority waiting request for windows_x64_release_xnnpack-refs/pull/27321/merge exists
build_x64_release_xnnpack
The operation was canceled.
windows_x64_release_xnnpack
Canceling since a higher priority waiting request for windows_x64_release_xnnpack-refs/pull/27321/merge exists