
[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #8715

Annotations

3 warnings

6c. Build Extended Minimal (No Optional Features)

succeeded Feb 14, 2026 in 7m 30s
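For context on the feature named in the PR title, below is a minimal, hedged sketch of writing an FP8 (E4M3) KV cache from FP16 inputs with a per-tensor scale, using the conversion intrinsics from CUDA's `cuda_fp8.h`. The kernel name, the scaling scheme, and the flat layout are illustrative assumptions, not the PR's actual implementation.

```cuda
// Hypothetical sketch (not the PR's code): quantize FP16 key/value entries
// into FP8 E4M3 storage with a precomputed per-tensor inverse scale.
#include <cuda_fp16.h>
#include <cuda_fp8.h>

__global__ void quantize_kv_to_fp8_e4m3(const __half* kv_in,        // FP16 keys or values
                                         __nv_fp8_storage_t* kv_out, // E4M3 bytes of the KV cache
                                         float inv_scale,            // 1 / scale, chosen ahead of time
                                         int n)                      // number of elements
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Rescale into the E4M3 representable range, then convert with saturation.
        float x = __half2float(kv_in[i]) * inv_scale;
        kv_out[i] = __nv_cvt_float_to_fp8(x, __NV_SATFINITE, __NV_E4M3);
    }
}
```

At attention time, the cached E4M3 bytes would be dequantized (multiplied back by the scale) before or during the group query attention score computation; how that is fused into the kernel is specific to the PR and not shown here.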