Description
#2192 works around an issue observed in cuDF builds. See logs:
cuDF build logs
From https://github.com/rapidsai/cudf/actions/runs/20384141895/job/58581416851?pr=20929
FAILED: tests/CMakeFiles/SPAN_TEST_DEVICE_VECTOR.dir/utilities_tests/span_tests.cu.o
/usr/bin/sccache /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++ -DBS_THREAD_POOL_ENABLE_PAUSE=1 -DCCCL_AVOID_SORT_UNROLL=1 -DCCCL_DISABLE_PDL -DCUB_DISABLE_NAMESPACE_MAGIC -DCUB_IGNORE_NAMESPACE_MAGIC_ERROR -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_FORCE_32_BIT_OFFSET_TYPE=1 -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/home/coder/cudf/cpp -I/home/coder/cudf/cpp/src -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/cuco-src/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/cccl-src/lib/cmake/thrust/../../../thrust -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/cccl-src/lib/cmake/libcudacxx/../../../libcudacxx/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/cccl-src/lib/cmake/cub/../../../cub -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/dlpack-src/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/jitify-src -I/home/coder/cudf/cpp/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rapids_logger-src/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-build/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/nvtx3-src/c/include -I/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/bs_thread_pool-src/include -isystem /usr/local/cuda/targets/x86_64-linux/include -isystem /usr/local/cuda/targets/x86_64-linux/include/cccl -isystem /home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/gtest-src/googlemock/include -isystem /home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/gtest-src/googlemock -isystem /home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/gtest-src/googletest/include -isystem 
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/gtest-src/googletest -t=7 -O3 -DNDEBUG -std=c++20 "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_90a,code=[sm_90a]" "--generate-code=arch=compute_100f,code=[sm_100f]" "--generate-code=arch=compute_120a,code=[sm_120a]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -Xcompiler=-fPIE --expt-extended-lambda --expt-relaxed-constexpr -Werror=all-warnings -Xcompiler=-Wall,-Werror,-Wno-error=deprecated-declarations -diag-suppress=1407 -Xfatbin=-compress-all --compress-mode=size -MD -MT tests/CMakeFiles/SPAN_TEST_DEVICE_VECTOR.dir/utilities_tests/span_tests.cu.o -MF tests/CMakeFiles/SPAN_TEST_DEVICE_VECTOR.dir/utilities_tests/span_tests.cu.o.d -x cu -c /home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu -o tests/CMakeFiles/SPAN_TEST_DEVICE_VECTOR.dir/utilities_tests/span_tests.cu.o
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
/home/coder/cudf/cpp/build/pip/cuda-13.0/release/_deps/rmm-src/cpp/include/rmm/detail/runtime_capabilities.hpp(95): error #20011-D: calling a __host__ function("rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > ::cccl_async_resource_ref(const rmm::detail::cccl_async_resource_ref< ::cuda::mr::__4::resource_ref< ::cuda::mr::__4::device_accessible > > &)") from a __host__ __device__ function("rmm::mr::thrust_allocator<int> ::thrust_allocator") is not allowed
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
1 error detected in the compilation of "/home/coder/cudf/cpp/tests/utilities_tests/span_tests.cu".
Details from that PR:
The thrust allocator class needs to be reworked eventually anyway due to the upstream allocator being deprecated, so this workaround is acceptable for now. I'm entirely confused as to why this line is where the suppression is needed and I suspect that this is a completely false diagnostic from nvcc, but for now I think it's acceptable to add this as a workaround to get builds passing. I verified that this fixes cudf builds in a local devcontainer reproducing the issue.
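The core issue is that RMM's rmm::mr::thrust_allocator inherits from thrust::device_malloc_allocator, which has __host__ __device__ annotations. As the log above shows, nvcc therefore treats the rmm::mr::thrust_allocator constructor as __host__ __device__ and flags its call to the host-only copy constructor of rmm::detail::cccl_async_resource_ref. The Thrust docs indicate that device_malloc_allocator is deprecated in favor of the thrust::mr memory-resource-based allocators.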
We should investigate whether we can or should switch. The replacement class, thrust::mr::allocator, also appears to have host-device constructors, so it is unclear whether switching would help.
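To make the shape of the problem concrete, here is a minimal sketch of the pattern involved. Every name in it (host_only_ref, hd_base_allocator, sketch_allocator, copy_roundtrip) is a hypothetical stand-in, not an RMM or Thrust type, and it is not the change made in #2192. It only illustrates one conceivable direction: keeping the constructors that copy a host-only member explicitly __host__, so that no host-device function calls a host-only one. The -D suffix on #20011-D and the "can be suppressed" remark in the log suggest a suppressible diagnostic that -Werror=all-warnings promotes to an error, but that reading is an assumption as well.

```cpp
// Sketch only: hypothetical stand-ins, not RMM or Thrust types.
#include <cassert>

// Let the file build with a plain host compiler as well as with nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Stand-in for a resource reference whose copy constructor is host-only.
struct host_only_ref {
  int id;
  __host__ explicit host_only_ref(int i) : id(i) {}
  __host__ host_only_ref(host_only_ref const& other) : id(other.id) {}
};

// Stand-in for a base allocator with host-device constructors.
struct hd_base_allocator {
  __host__ __device__ hd_base_allocator() {}
  __host__ __device__ hd_base_allocator(hd_base_allocator const&) {}
};

// Derived allocator holding the host-only member. If these constructors were
// __host__ __device__, copying ref_ would be a host-only call made from a
// host-device function, which is the pattern the log's diagnostic describes.
// Marking them __host__ keeps every call host-to-host.
struct sketch_allocator : hd_base_allocator {
  __host__ explicit sketch_allocator(host_only_ref ref) : ref_(ref) {}
  __host__ sketch_allocator(sketch_allocator const& other)
    : hd_base_allocator(other), ref_(other.ref_) {}

  host_only_ref ref_;
};

// Host-side helper: construct an allocator, copy it, and return the copied id.
int copy_roundtrip(int id) {
  sketch_allocator a{host_only_ref{id}};
  sketch_allocator b{a};
  assert(b.ref_.id == a.ref_.id);
  return b.ref_.id;
}
```

Whether this direction is viable for RMM depends on whether Thrust ever needs to construct or copy the allocator from device code, which is part of what the investigation would need to establish.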