CUDA RT IB fails to build onnxruntime pkg #8571

Closed
@aandvalenzuela

Hello,

The dedicated IB for the CUDA runtime fails to build the onnxruntime package. From the latest CUDART IB log:

FAILED: CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu.o 
/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/cuda/11.5.2-2a1e4dd2237c71998d9badc1052421af/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_CUDA=1 -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/gsl-src/include 
-I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/protobuf-src/src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/cudnn/8.8.0.121-b294749e5f0cb76cc4ef362a1d43fd69/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/cuda/11.5.2-2a1e4dd2237c71998d9badc1052421af/targets/x86_64-linux/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/build/_deps/mp11-src/include -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads "" -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT 
CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/onnxruntime/1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3/onnxruntime-1.14.1/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu.o
/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/bits/std_function.h:435:145: error: parameter packs not expanded with '...':
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^ 
/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/bits/std_function.h:435:145: note:         '_ArgTypes'
/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/bits/std_function.h:530:146: error: parameter packs not expanded with '...':
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^ 
/data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/bits/std_function.h:530:146: note:         '_ArgTypes'
ninja: build stopped: subcommand failed.
error: Bad exit status from /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/rpm-tmp.NqgYhJ (%build)


RPM build errors:
    line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+onnxruntime+1.14.1-e4f32e7ee87ac6022c8ef3aa4cfda0b3
    Macro expanded in comment on line 350: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}

    Bad exit status from /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/rpm-tmp.NqgYhJ (%build)

This issue seems to have appeared with the gcc update to 11.4.1 (#8545).
I think it is the same issue as the one reported in NVIDIA/nccl#650 (comment). I will apply the patch proposed there to see if it solves the problem.
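
To check whether a given nvcc/gcc pair is affected without rebuilding onnxruntime, a small stand-alone file might be enough. This is only a sketch, under the assumption that parsing `<functional>` is what triggers the error; the log suggests this, since both errors point into `bits/std_function.h` rather than into onnxruntime sources. The file name `repro.cu` and the helper `apply_twice` are hypothetical and not part of any package.

```cpp
// repro.cu (hypothetical file name): a minimal translation unit that pulls in
// libstdc++'s <functional>, which includes the bits/std_function.h header
// named in the error messages above.
#include <functional>

// Hypothetical helper, present only so that std::function<int(int)> is
// actually instantiated in this translation unit.
int apply_twice(const std::function<int(int)>& f, int x) {
    // Call the wrapped callable twice: f(f(x)).
    return f(f(x));
}
```

Compiling this with the same nvcc and host gcc as in the log (for example `nvcc -c repro.cu`, with the host compiler chosen via `-ccbin`) should show whether the `parameter packs not expanded` error appears outside the onnxruntime build, assuming the assumption above holds.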

FYI, @smuzaffar @fwyzard
