
gguf_init_from_file: failed to open GGUF file './models/phi3-mini.gguf' #200

@tiansiyuan

Description

πŸ› Bug Description

```
./shimmy-macos-intel serve &
[1] 14955
➜  Downloads 🎯 Shimmy v1.9.0
🔧 Backend: CPU (no GPU acceleration)
📦 Models: 0 available
🚀 Starting server on 127.0.0.1:11435
📦 Models: 1 available
✅ Ready to serve requests
   • POST /api/generate (streaming + non-streaming)
   • GET  /health (health check + metrics)
   • GET  /v1/models (OpenAI-compatible)

./shimmy-macos-intel list
📋 Registered Models:
  phi3-lora => "./models/phi3-mini.gguf"

✅ Total available models: 1
```

```
curl -s http://127.0.0.1:11435/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model":"REPLACE_WITH_MODEL_FROM_list",
        "messages":[{"role":"user","content":"Say hi in 5 words."}],
        "max_tokens":32
      }' | jq -r '.choices[0].message.content'
null
➜  Downloads curl -s http://127.0.0.1:11435/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model":"phi3-lora",
        "messages":[{"role":"user","content":"Say hi in 5 words."}],
        "max_tokens":32
      }' | jq -r '.choices[0].message.content'
```
```
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 16.154 sec
ggml_metal_device_init: GPU name:   Intel(R) Iris(TM) Graphics 6100
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon2 (3002)
ggml_metal_device_init: simdgroup reduction   = false
ggml_metal_device_init: simdgroup matrix mul. = false
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = false
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  =  1610.61 MB
llama_model_load_from_file_impl: using device Metal (Intel(R) Iris(TM) Graphics 6100) (unknown id) - 1536 MiB free
gguf_init_from_file: failed to open GGUF file './models/phi3-mini.gguf'
llama_model_load: error loading model: llama_model_loader: failed to load model from ./models/phi3-mini.gguf
llama_model_load_from_file_impl: failed to load model
2026-05-06T02:20:40.093140Z ERROR shimmy::openai_compat: Failed to load model 'phi3-lora': null result from llama cpp
```
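As an aside, the first request above returned `null` because the placeholder model name `REPLACE_WITH_MODEL_FROM_list` was sent verbatim; the exact id can be read from the `/v1/models` endpoint advertised in the startup banner. A minimal sketch of extracting the ids with `jq` (the JSON payload below is a hypothetical example of the OpenAI-compatible list shape, not captured shimmy output):

```shell
# Hypothetical /v1/models response in the OpenAI-compatible list shape;
# pull out the model ids with jq, the same tool the curl commands use.
printf '%s' '{"object":"list","data":[{"id":"phi3-lora","object":"model"}]}' \
  | jq -r '.data[].id'
# → phi3-lora
```

Against a running server this would be `curl -s http://127.0.0.1:11435/v1/models | jq -r '.data[].id'`.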

πŸ”„ Steps to Reproduce

1. Start the server: `./shimmy-macos-intel serve &`.
2. Confirm the model is registered: `./shimmy-macos-intel list` shows `phi3-lora => "./models/phi3-mini.gguf"`.
3. Send a chat completion request to `/v1/chat/completions` with `"model":"phi3-lora"` (full commands and output above).

βœ… Expected Behavior

The model loads and returns a short completion for the prompt ("Say hi in 5 words.").

❌ Actual Behavior

The server cannot open the model file: `gguf_init_from_file` fails on `./models/phi3-mini.gguf` even though `shimmy list` reports that same path as registered, and the request fails with `null result from llama cpp`.
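A possible cause, sketched under the assumption that shimmy hands the registered path to llama.cpp unchanged: `./models/phi3-mini.gguf` is a relative path, so it is resolved against the server process's current working directory and only opens if the server was started from the directory that contains `models/`. The demo below uses hypothetical paths under `/tmp`:

```shell
# Demo (hypothetical paths): a relative path resolves against the caller's
# current working directory, not against where the binary or registry lives.
mkdir -p /tmp/shimmy-demo/models
touch /tmp/shimmy-demo/models/phi3-mini.gguf

cd /tmp/shimmy-demo
ls ./models/phi3-mini.gguf >/dev/null 2>&1 && echo "opened from /tmp/shimmy-demo"

cd /tmp
ls ./models/phi3-mini.gguf >/dev/null 2>&1 || echo "failed from /tmp"
```

If this is the cause, registering the model with an absolute path (e.g. the output of `realpath ./models/phi3-mini.gguf`) or starting the server from the `Downloads` directory that contains `models/` should work around it.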

πŸ“¦ Shimmy Version

Latest (main branch)

πŸ’» Operating System

macOS

πŸ“₯ Installation Method

Pre-built binary from releases

🌍 Environment Details

No response

πŸ“‹ Logs/Error Messages


πŸ“ Additional Context

No response

Labels: bug (Something isn't working)