[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #9343

Annotations

4 warnings

Build and Test OpenVINO EP (AlamLinux8, Py3.12)  /  build_test_pipeline

succeeded Feb 13, 2026 in 21m 4s