Describe the bug
First seen in spark-rapids-jni_nightly-dev, run 1045.
The failure currently occurs only in the cuda12.8-arm64 test (a CUDA 12.8 runtime image on an ARM instance that has a 535 driver).
cudf sha: rapidsai/cudf@cf5edd0
rmm sha: rapidsai/rmm@7f0cead
[2025-03-19T04:46:13.327Z] [ERROR] testCreateAdaptors Time elapsed: 0.027 s <<< ERROR!
[2025-03-19T04:46:13.327Z] ai.rapids.cudf.CudfException: CUDA error at: /home/jenkins/agent/workspace/spark-rapids-jni_nightly-dev/target/libcudf/cmake-build/_deps/rmm-src/include/rmm/mr/device/cuda_async_memory_resource.hpp:120: cudaErrorInvalidValue invalid argument
[2025-03-19T04:46:13.327Z] at ai.rapids.cudf.Rmm.newCudaAsyncMemoryResource(Native Method)
[2025-03-19T04:46:13.327Z] at ai.rapids.cudf.RmmCudaAsyncMemoryResource.<init>(RmmCudaAsyncMemoryResource.java:46)
[2025-03-19T04:46:13.327Z] at ai.rapids.cudf.RmmCudaAsyncMemoryResource.<init>(RmmCudaAsyncMemoryResource.java:33)
[2025-03-19T04:46:13.327Z] at ai.rapids.cudf.RmmTest.testCreateAdaptors(RmmTest.java:61)
[2025-03-19T04:46:13.327Z] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2025-03-19T04:46:13.327Z] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2025-03-19T04:46:13.327Z] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2025-03-19T04:46:13.327Z] at java.lang.reflect.Method.invoke(Method.java:498)
[2025-03-19T04:46:13.327Z] at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)