[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #50649
Triggered via pull request: February 14, 2026 04:12
Status: Success
Total duration: 31m 31s
Artifacts: –
lint.yml
on: pull_request
Optional Lint: 40s
Python format: 2m 2s
Optional Lint C++: 31m 23s
Annotations
1 error and 15 warnings
|
Optional Lint C++
reviewdog: Too many results (annotations) in diff.
You may miss some annotations due to GitHub limitation for annotation created by logging command.
Please check GitHub Actions log console to see all results.
Limitation:
- 10 warning annotations and 10 error annotations per step
- 50 annotations per job (sum of annotations from all the steps)
- 50 annotations per run (separate from the job annotations, these annotations aren't created by users)
Source: https://github.com/orgs/community/discussions/26680#discussioncomment-3252835
|
|
Python format
CodeQL Action v3 will be deprecated in December 2026. Please update all occurrences of the CodeQL Action in your workflow files to v4. For more information, see https://github.blog/changelog/2025-10-28-upcoming-deprecation-of-codeql-action-v3/
|
|
Python format (same warning reported 4 times)
The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/xqa/xqa_loader_bf16_fp8_128.cu#L8
[cpplint] reported by reviewdog 🐶
Include the directory when naming header files [build/include_subdir] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/xqa/xqa_loader_bf16_fp8_128.cu:8: Include the directory when naming header files [build/include_subdir] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh#L355
[cpplint] reported by reviewdog 🐶
If an else has a brace on one side, it should have it on both [readability/braces] [5]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh:355: If an else has a brace on one side, it should have it on both [readability/braces] [5]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh#L222
[cpplint] reported by reviewdog 🐶
If/else bodies with multiple statements require braces [readability/braces] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh:222: If/else bodies with multiple statements require braces [readability/braces] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh#L222
[cpplint] reported by reviewdog 🐶
If an else has a brace on one side, it should have it on both [readability/braces] [5]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qkv.cuh:222: If an else has a brace on one side, it should have it on both [readability/braces] [5]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L290
[cpplint] reported by reviewdog 🐶
If/else bodies with multiple statements require braces [readability/braces] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:290: If/else bodies with multiple statements require braces [readability/braces] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L290
[cpplint] reported by reviewdog 🐶
If an else has a brace on one side, it should have it on both [readability/braces] [5]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:290: If an else has a brace on one side, it should have it on both [readability/braces] [5]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L283
[cpplint] reported by reviewdog 🐶
Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:283: Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L282
[cpplint] reported by reviewdog 🐶
Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:282: Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L281
[cpplint] reported by reviewdog 🐶
Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:281: Using C-style cast. Use static_cast<int64_t>(...) instead [readability/casting] [4]
|
|
Optional Lint C++:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh#L158
[cpplint] reported by reviewdog 🐶
If an else has a brace on one side, it should have it on both [readability/braces] [5]
Raw Output:
onnxruntime/contrib_ops/cuda/bert/group_query_attention_qdq.cuh:158: If an else has a brace on one side, it should have it on both [readability/braces] [5]
|
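The [readability/casting] and [readability/braces] warnings above usually have mechanical fixes. The sketch below is illustrative only: `FlatOffset` and `ClampSymmetric` are hypothetical helpers, not code from the PR, and the flagged lines in group_query_attention_qdq.cuh and group_query_attention_qkv.cuh may look quite different. It shows the two patterns cpplint asks for: `static_cast<int64_t>(...)` in place of a C-style cast, and braces on every branch of an if/else chain.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): flat element offset into a
// [batch, num_heads, seq_len, head_size] buffer, computed in 64 bits.
inline int64_t FlatOffset(int batch, int head, int seq,
                          int num_heads, int seq_len, int head_size) {
  // cpplint [readability/casting]: write static_cast<int64_t>(x),
  // not (int64_t)x. Widening first avoids 32-bit overflow in the products.
  const int64_t b = static_cast<int64_t>(batch);
  const int64_t h = static_cast<int64_t>(head);
  const int64_t s = static_cast<int64_t>(seq);
  return ((b * num_heads + h) * seq_len + s) * head_size;
}

// Hypothetical helper (not from the PR): clamp x to [-limit, limit].
inline float ClampSymmetric(float x, float limit) {
  // cpplint [readability/braces]: if one branch of an if/else chain has
  // braces, every branch should, and multi-statement bodies always need them.
  if (x > limit) {
    x = limit;
  } else if (x < -limit) {
    x = -limit;
  }
  return x;
}
```

For example, `FlatOffset(1, 2, 3, 4, 5, 6)` evaluates to 198, and `ClampSymmetric(500.0f, 448.0f)` returns 448.0f (448 being the largest finite E4M3 value, used here only as a sample limit).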