
Error While Creating ONNX Session with CUDA Execution Provider #22980

Open
@akashGEHC

Description

Describe the issue

We are encountering an issue while creating an ONNX session using the CUDA Execution Provider in a Kubernetes (k8s) environment.

Context

Our C++ application performs GPU-based inferencing with ONNX Runtime using the CUDA and TensorRT execution providers, linking against the shared ONNX Runtime (ORT) libraries.

Code Snippet

Here’s the relevant code:

// Initialize the ONNX Runtime environment
auto env = std::make_unique<Ort::Env>(ORT_LOGGING_LEVEL_VERBOSE, "InferenceUtil");

// Set up Ort session options
Ort::SessionOptions session_options;

// Register the CUDA execution provider with default options (device 0)
OrtCUDAProviderOptions cuda_options = {};
session_options.AppendExecutionProvider_CUDA(cuda_options);

session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);

// Load the ONNX model from the specified path
const std::string model_path_str = details::model_path(); // Fetch the model path

session = std::make_unique<Ort::Session>(*env, model_path_str.c_str(), session_options);
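
For reference, here is a minimal self-contained sketch (not our production code) of how the same session creation can surface the provider failure earlier: Ort::GetAvailableProviders() lists the providers compiled into the linked library, and wrapping construction in a try/catch over Ort::Exception captures the CUDA error before inference starts. The main() wrapper and the "model.onnx" path are placeholders for illustration.

#include <iostream>
#include <string>
#include <vector>
#include <onnxruntime_cxx_api.h>

int main() {
    // List the execution providers compiled into the linked ORT library.
    // If "CUDAExecutionProvider" is missing here, the CPU-only package is linked.
    for (const std::string& provider : Ort::GetAvailableProviders()) {
        std::cout << "Available provider: " << provider << "\n";
    }

    Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "InferenceUtil");
    Ort::SessionOptions session_options;
    OrtCUDAProviderOptions cuda_options{};

    try {
        session_options.AppendExecutionProvider_CUDA(cuda_options);
        // "model.onnx" is a placeholder path for this sketch.
        Ort::Session session(env, "model.onnx", session_options);
    } catch (const Ort::Exception& e) {
        // CUDA failure 100 surfaces here when no device is visible to the process.
        std::cerr << "ORT error " << e.GetOrtErrorCode() << ": " << e.what() << "\n";
        return 1;
    }
    return 0;
}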

CMake file:

target_link_libraries(${PROJECT_NAME}
  -L${IMPORT_DIR_ONNX}/cuda/lib64 -lcudart
  -L${IMPORT_DIR_ONNX}/cuda/lib64 -lcudnn
  -L${IMPORT_DIR_ONNX}/onnxruntime-linux-x64-gpu-1.18.0/lib -lonnxruntime
)

Error Logs

CUDA failure 100: no CUDA-capable device is detected ;

initInfer() :: Exception caught while initializing inference:
/tmp/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
/tmp/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
CUDA failure 100: no CUDA-capable device is detected ; GPU=0 ; hostname=algorithm-runner-6c467684d-wqrnb ; file=/tmp/onnxruntime/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info_.device_id);
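
Since the failure comes from cudaSetDevice, a standalone CUDA runtime check (a sketch, independent of ONNX Runtime; the file name device_check.cu is ours) can confirm whether the pod sees any GPU at all. Error 100 is cudaErrorNoDevice, and in Kubernetes the usual suspects are a missing nvidia.com/gpu resource request on the pod, an absent NVIDIA device plugin, or NVIDIA_VISIBLE_DEVICES not exposing the device to the container.

#include <cstdio>
#include <cuda_runtime.h>

// Build: nvcc device_check.cu -o device_check (or g++ with -lcudart)
int main() {
    int device_count = 0;
    cudaError_t status = cudaGetDeviceCount(&device_count);
    if (status != cudaSuccess) {
        // Error 100 (cudaErrorNoDevice) here matches the ORT failure above.
        std::printf("cudaGetDeviceCount failed: %d (%s)\n",
                    static_cast<int>(status), cudaGetErrorString(status));
        return 1;
    }
    std::printf("Visible CUDA devices: %d\n", device_count);

    for (int i = 0; i < device_count; ++i) {
        cudaDeviceProp prop{};
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            std::printf("Device %d: %s\n", i, prop.name);
        }
    }
    return 0;
}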

Environment Details

  • ONNX Runtime Version: 1.18.0

  • CUDA Version: 12.2

  • cuDNN Version: 8.9.7

  • OS: SLES 15.5, running in a Kubernetes environment

  • Hardware:

    • GPU: Tesla T4
    • Driver Version: 550.107.02
    • CUDA Version from nvidia-smi: 12.4

Diagnostics

Output of nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

Output of nvidia-smi -L:

GPU 0: Tesla T4 (UUID: GPU-06b02b25-f8bb-b475-f628-805b3984d63f)

Metadata

Labels

ep:CUDA (issues related to the CUDA execution provider)
stale (issues that have not been addressed in a while; categorized by a bot)
