[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #9934
Annotations
3 warnings
|
Test ONNX Runtime
stderr: + PATH=/opt/python/cp310-cp310/bin:/usr/local/dotnet:/opt/rh/gcc-toolset-14/root/usr/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/lib/jvm/msopenjdk-17/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+ python3 tools/ci_build/build.py --build_dir build/Release --config Release --cmake_generator Ninja --skip_submodule_sync --build_shared_lib --parallel --use_vcpkg --use_vcpkg_ms_internal_asset_cache --enable_onnx_tests --use_cuda --use_binskim_compliant_compile_flags --cuda_version=12.8 --cuda_home=/usr/local/cuda-12.8 --cudnn_home=/usr/local/cuda-12.8 --enable_cuda_profiling --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=90 onnxruntime_BUILD_UNIT_TESTS=ON onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON --test
2026-02-14 05:27:48,268 tools_python_utils [INFO] - flatbuffers module is not installed. parse_config will not be available
2026-02-14 05:27:48,304 build [DEBUG] - Command line arguments:
--build_dir build/Release --config Release --cmake_generator Ninja --skip_submodule_sync --build_shared_lib --parallel --use_vcpkg --use_vcpkg_ms_internal_asset_cache --enable_onnx_tests --use_cuda --use_binskim_compliant_compile_flags --cuda_version=12.8 --cuda_home=/usr/local/cuda-12.8 --cudnn_home=/usr/local/cuda-12.8 --enable_cuda_profiling --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=90 onnxruntime_BUILD_UNIT_TESTS=ON onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON --test
2026-02-14 05:27:48,307 build [INFO] - Build started
2026-02-14 05:27:48,308 build [DEBUG] - create symlink /data/models -> build/Release/models
2026-02-14 05:27:48,308 build [INFO] - Running tests for Release configuration
2026-02-14 05:27:48,308 build [INFO] - /usr/bin/ctest --build-config Release --verbose --timeout 10800
2026-02-14 05:37:04,889 build [INFO] - Build complete
|
Run microsoft/onnxruntime-github-actions/build-docker-image@v0.0.9
stderr: #0 building with "default" instance using docker driver
#1 [internal] load build definition from Dockerfile.manylinux2_28_cuda
#1 transferring dockerfile:
#1 transferring dockerfile: 1.88kB done
#1 DONE 0.4s
#2 [auth] internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:pull token for onnxruntimebuildcache.azurecr.io
#2 DONE 0.0s
#3 [internal] load metadata for onnxruntimebuildcache.azurecr.io/internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:20251017.1
#3 DONE 1.0s
#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s
#5 [1/6] FROM onnxruntimebuildcache.azurecr.io/internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:20251017.1@sha256:f9faa2397d114b46b5c281353e2d50ccba0ffce77fde89753bedc07217f7eff2
#5 resolve onnxruntimebuildcache.azurecr.io/internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:20251017.1@sha256:f9faa2397d114b46b5c281353e2d50ccba0ffce77fde89753bedc07217f7eff2 0.0s done
#5 ...
#6 [internal] load build context
#6 transferring context: 31.83kB 0.0s done
#6 DONE 0.2s
#5 [1/6] FROM onnxruntimebuildcache.azurecr.io/internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:20251017.1@sha256:f9faa2397d114b46b5c281353e2d50ccba0ffce77fde89753bedc07217f7eff2
#5 DONE 1.1s
#5 [1/6] FROM onnxruntimebuildcache.azurecr.io/internal/azureml/onnxruntime/build/cuda12_x64_almalinux8_gcc14:20251017.1@sha256:f9faa2397d114b46b5c281353e2d50ccba0ffce77fde89753bedc07217f7eff2
#5 sha256:695e22347fb7c112af8a97154f75ec253a6a704707330387906a0c962eaca183 0B / 322B 0.2s
#5 sha256:695e22347fb7c112af8a97154f75ec253a6a704707330387906a0c962eaca183 322B / 322B 0.3s
#5 sha256:7899b1f065e89b1c38de5889fdb6bed5d7cefc983af58530b5082c64add78991 0B / 12.53MB 0.2s
#5 sha256:4e48097af40f49731459ab5dce3e3a4cd6a1c6662a9a4661218f9117f38ecd4a 0B / 13.03MB 0.2s
#5 sha256:f32bacc35bda8f176e88329303032804f6dd0a57ec19505d32965e7f8daa5d1b 0B / 341.57kB 0.2s
#5 sha256:7899b1f065e89b1c38de5889fdb6bed5d7cefc983af58530b5082c64add78991 1.05MB / 12.53MB 0.5s
#5 sha256:f32bacc35bda8f176e88329303032804f6dd0a57ec19505d32965e7f8daa5d1b 341.57kB / 341.57kB 0.5s
#5 sha256:695e22347fb7c112af8a97154f75ec253a6a704707330387906a0c962eaca183 322B / 322B 0.7s done
#5 sha256:7899b1f065e89b1c38de5889fdb6bed5d7cefc983af58530b5082c64add78991 12.53MB / 12.53MB 0.8s
#5 sha256:f32bacc35bda8f176e88329303032804f6dd0a57ec19505d32965e7f8daa5d1b 341.57kB / 341.57kB 0.6s done
#5 sha256:7899b1f065e89b1c38de5889fdb6bed5d7cefc983af58530b5082c64add78991 12.53MB / 12.53MB 0.8s done
#5 sha256:4e48097af40f49731459ab5dce3e3a4cd6a1c6662a9a4661218f9117f38ecd4a 5.24MB / 13.03MB 0.9s
#5 sha256:cdc8085a23f343004fa5874223972150c5e86217a37e04b6f89f8b59645d8cb3 0B / 311B 0.2s
#5 sha256:4564c69d9d5fab85b558f5c755765aef0e47996f677c8dc1b893dc2121029a86 0B / 56.55MB 0.2s
#5 sha256:6f491630317942bc72b01405eba4e4fd9bafbd89da5a0c8a3344522f9e7b1b26 634B / 634B 0.2s done
#5 sha256:4e48097af40f49731459ab5dce3e3a4cd6a1c6662a9a4661218f9117f38ecd4a 8.39MB / 13.03MB 1.1s
#5 sha256:cdc8085a23f343004fa5874223972150c5e86217a37e04b6f89f8b59645d8cb3 311B / 311B 0.4s done
#5 sha256:f0763b7a2035ebd7aaa92325e7a6a26b84938cbfaf625eae80f3e248a5edbd4d 0B / 105.51MB 0.2s
#5 sha256:4e48097af40f49731459ab5dce3e3a4cd6a1c6662a9a4661218f9117f38ecd4a 10.49MB / 13.03MB 1.2s
#5 sha256:4564c69d9d5fab85b558f5c755765aef0e47996f677c8dc1b893dc2121029a86 4.19MB / 56.55MB 0.6s
#5 sha256:75541d4d3243d3f4e1155f7746a13f86c7f28d74ecdd702d07899dc9ea6e2e31 0B / 767.93kB 0.2s
#5 sha256:4e48097af40f49731459ab5dce3e3a4cd6a1c6662a9a4661218f9117f38ecd4a 13.03MB / 13.03MB 1.4s done
#5 sha256:4564c69d9d5fab85b558f5c755765aef0e47996f677c8dc1b893dc2121029a86 12.58MB / 56.55MB 0.8s
#5 sha256:4564c69d9d5fab85b558f5c755765aef0e47996f677c8dc1b893dc2121029a86 20.97MB / 56.55MB 0.9s
#5 sha256:75541d4d3243d3f4e1155f7746a13f86c7f28d74ecdd702d07899dc9ea6e2e31 767.93kB / 767.93kB 0.4s done
#5 sha256:2cd82f876e5f0cfebcbb33a2e32b78cd5c4cd6f9209b5f79d30f94a330652e77 0B / 12.36MB 0.2s
#5 sha256:4564c69d9d5fab85b558f5c755765aef0e47996f677c8dc1b893dc2121029a86 29.36MB / 56.55MB 1.1s
#5 sha25
|
Run microsoft/onnxruntime-github-actions/build-docker-image@v0.0.9
stderr: WARNING! Your credentials are stored unencrypted in '/home/cloudtest/.docker/config.json'.
Configure a credential helper to remove this warning. See
https://docs.docker.com/go/credential-store/
|