RAG mode is using CUDA on a ROCm system #2089

@jalberto

Issue Description

When running ramalama on Fedora Silverblue on AMD hardware (Strix Halo), it tries to use CUDA and crashes.

Steps to reproduce the issue

ramalama rag ./foo.pdf local/myrag

➜ ramalama rag ./foo.pdf local/myrag
Converting foo.pdf .   2025-10-31 12:24:24,154 - INFO - detected formats: [<InputFormat.PDF: 'pdf'>]
2025-10-31 12:24:24,162 - INFO - Going to convert document batch...
2025-10-31 12:24:24,162 - INFO - Initializing pipeline for StandardPdfPipeline with options hash 75463f421d05cb4304e1f714cf00d35d
2025-10-31 12:24:24,166 - INFO - Loading plugin 'docling_defaults'
2025-10-31 12:24:24,166 - INFO - Registered picture descriptions: ['vlm', 'api']
2025-10-31 12:24:24,170 - INFO - Loading plugin 'docling_defaults'
2025-10-31 12:24:24,170 - INFO - Registered ocr engines: ['auto', 'easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
Converting foo.pdf ..   2025-10-31 12:24:27,612 - INFO - Accelerator device: 'cuda:0'
Converting foo.pdf ...   2025-10-31 12:24:41,667 - INFO - Accelerator device: 'cuda:0'
Converting foo.pdf .   2025-10-31 12:24:42,157 - INFO - Processing document foo.pdf
2025-10-31 12:24:42,317 - WARNING - Encountered an error during conversion of document c33927cdfcd4b22717a49932168d476cda8d350d6348494cff58ac69822272b7:
Traceback (most recent call last):
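For reference, the `Accelerator device: 'cuda:0'` line is ambiguous on its own: ROCm builds of PyTorch also expose AMD GPUs through the `torch.cuda` API, so the crash suggests a CUDA-only build rather than ROCm misdetecting the device name. A quick diagnostic sketch (my own, not part of ramalama) to check which backend the PyTorch inside the RAG container was built for:

```python
# Diagnostic only: report whether this PyTorch is a CUDA or ROCm build.
# On ROCm wheels torch.version.hip is set and AMD GPUs still appear
# through the torch.cuda API, so "cuda:0" does not imply NVIDIA.
import torch

print("torch version :", torch.__version__)
print("CUDA build    :", torch.version.cuda)                   # None on ROCm/CPU wheels
print("HIP build     :", getattr(torch.version, "hip", None))  # None on CUDA/CPU wheels
print("GPU available :", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device 0      :", torch.cuda.get_device_name(0))
```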

Describe the results you received

It crashes with a Python stack trace.

Describe the results you expected

I expected it to use ROCm to access the GPU for processing.
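As a stopgap until ramalama selects the ROCm backend on AMD hardware, the docling conversion itself can be pinned to CPU so it never initializes CUDA. This is a sketch assuming docling's documented AcceleratorOptions API, not ramalama's actual code path:

```python
# Stopgap sketch: force docling onto CPU so conversion never touches
# the CUDA path. Assumes docling's documented AcceleratorOptions API;
# the proper fix is for ramalama to pick the ROCm backend on AMD GPUs.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    AcceleratorDevice,
    AcceleratorOptions,
    PdfPipelineOptions,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

pipeline_options = PdfPipelineOptions()
pipeline_options.accelerator_options = AcceleratorOptions(device=AcceleratorDevice.CPU)

converter = DocumentConverter(
    format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)}
)
result = converter.convert("./foo.pdf")
print(result.document.export_to_markdown()[:500])
```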

ramalama info output

"mistral": "hf://lmstudio-community/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3-Q4_K_M.gguf",
            "mistral-small3.1": "hf://bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/mistralai_Mistral-Small-3.1-24B-Instruct-2503-IQ2_M.gguf",
            "mistral-small3.1:24b": "hf://bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/mistralai_Mistral-Small-3.1-24B-Instruct-2503-IQ2_M.gguf",
            "mistral:7b": "hf://lmstudio-community/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3-Q4_K_M.gguf",
            "mistral:7b-v1": "huggingface://TheBloke/Mistral-7B-Instruct-v0.1-GGUF/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
            "mistral:7b-v2": "huggingface://TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
            "mistral:7b-v3": "hf://lmstudio-community/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3-Q4_K_M.gguf",
            "mistral_code_16k": "huggingface://TheBloke/Mistral-7B-Code-16K-qlora-GGUF/mistral-7b-code-16k-qlora.Q4_K_M.gguf",
            "mistral_codealpaca": "huggingface://TheBloke/Mistral-7B-codealpaca-lora-GGUF/mistral-7b-codealpaca-lora.Q4_K_M.gguf",
            "mixtao": "huggingface://MaziyarPanahi/MixTAO-7Bx2-MoE-Instruct-v7.0-GGUF/MixTAO-7Bx2-MoE-Instruct-v7.0.Q4_K_M.gguf",
            "openchat": "huggingface://TheBloke/openchat-3.5-0106-GGUF/openchat-3.5-0106.Q4_K_M.gguf",
            "openorca": "huggingface://TheBloke/Mistral-7B-OpenOrca-GGUF/mistral-7b-openorca.Q4_K_M.gguf",
            "phi2": "huggingface://MaziyarPanahi/phi-2-GGUF/phi-2.Q4_K_M.gguf",
            "qwen2.5vl": "hf://ggml-org/Qwen2.5-VL-32B-Instruct-GGUF",
            "qwen2.5vl:2b": "hf://ggml-org/Qwen2.5-VL-2B-Instruct-GGUF",
            "qwen2.5vl:32b": "hf://ggml-org/Qwen2.5-VL-32B-Instruct-GGUF",
            "qwen2.5vl:3b": "hf://ggml-org/Qwen2.5-VL-3B-Instruct-GGUF",
            "qwen2.5vl:7b": "hf://ggml-org/Qwen2.5-VL-7B-Instruct-GGUF",
            "smollm:135m": "hf://HuggingFaceTB/smollm-135M-instruct-v0.2-Q8_0-GGUF",
            "smolvlm": "hf://ggml-org/SmolVLM-500M-Instruct-GGUF",
            "smolvlm:256m": "hf://ggml-org/SmolVLM-256M-Instruct-GGUF",
            "smolvlm:2b": "hf://ggml-org/SmolVLM-Instruct-GGUF",
            "smolvlm:500m": "hf://ggml-org/SmolVLM-500M-Instruct-GGUF",
            "stories-be:260k": "hf://taronaeo/tinyllamas-BE/stories260K-be.gguf",
            "tiny": "hf://TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",
            "tinyllama": "hf://TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
        }
    },
    "Store": "/var/home/vl/.local/share/ramalama",
    "UseContainer": true,
    "Version": "0.13.0"
}

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response
