[CUDA] GroupQueryAttention with XQA and Quantized KV Cache Support #10369

Job Run time
1h 0m 45s