[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #9339

Annotations

4 warnings

Build and Test OpenVINO EP (AlamLinux8, Py3.12)  /  build_test_pipeline

succeeded Feb 13, 2026 in 31m 7s