[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #9354

Annotations

4 warnings

Build and Test OpenVINO EP (AlamLinux8, Py3.12) / build_test_pipeline

succeeded Feb 14, 2026 in 20m 56s