- Common Issues
- Diagnosis Steps
- Performance Problems
- Database Issues
- Docker Issues
- GPU Issues (All Vendors)
- Debug Mode
GPU Support: SmarterRouter supports NVIDIA, AMD (ROCm), Intel Arc, and Apple Silicon GPUs. If your GPU isn't detected, check the GPU issues section below for vendor-specific troubleshooting.
Symptom: SmarterRouter cannot reach your LLM backend (Ollama, llama.cpp, etc.).
Diagnosis:
```bash
curl http://localhost:11434/api/tags  # adjust port if needed
```

Should return:

```json
{
  "models": [...]
}
```

Solutions:
- Verify the backend is running: `systemctl status ollama` or `docker ps`
- Check `ROUTER_OLLAMA_URL` in `.env` matches the backend URL
- Test connectivity from the SmarterRouter container:

  ```bash
  docker exec smarterrouter curl http://<backend-ip>:<port>/api/tags
  ```

- For Docker networking:
  - Host-only: use `host.docker.internal` (Mac/Windows) or `172.17.0.1` (Linux)
  - Docker Compose: use the service name (e.g., `ollama:11434`)
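The networking rules above can be collected into one small helper. This is an illustrative sketch, not part of SmarterRouter; the `ollama` service name and port 11434 are taken from the examples above.

```shell
#!/bin/sh
# Illustrative: derive the value for ROUTER_OLLAMA_URL from the deployment mode.
backend_url() {
  mode="$1"          # host | docker-linux | docker-desktop | compose
  port="${2:-11434}" # backend port, defaults to Ollama's
  case "$mode" in
    host)           echo "http://localhost:${port}" ;;
    docker-linux)   echo "http://172.17.0.1:${port}" ;;          # default bridge gateway
    docker-desktop) echo "http://host.docker.internal:${port}" ;;
    compose)        echo "http://ollama:${port}" ;;              # Compose service name
    *)              return 1 ;;
  esac
}
```

For example, `backend_url docker-linux` prints `http://172.17.0.1:11434`, the value to use when only the router runs in Docker on Linux.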
Symptom: Container fails to start; logs show "address already in use"
Solution:
```bash
# Check what's using the port
lsof -i :11436
# or
netstat -tulpn | grep 11436

# Option 1: Stop the existing container
docker stop smarterrouter && docker rm smarterrouter

# Option 2: Change the port in .env or docker-compose.yml
ROUTER_PORT=11437
```

Symptom: Error "CRITICAL: Database path is a directory"
Cause: Docker created a directory instead of a file because the path didn't exist before mounting.
Solution:
```bash
# Stop the container
docker-compose down

# Remove the erroneous directory
rm -rf router.db  # or data/router.db depending on your setup

# Ensure your docker-compose.yml has the correct volume mapping:
#   - ./router.db:/app/router.db  (single file)
# OR
#   - ./data:/app/data            (directory, database inside)

# Restart
docker-compose up -d
```

SmarterRouter auto-detects GPUs from all vendors on startup. Check the logs for detection messages.
Symptom: GPU monitoring disabled; VRAM not tracking.
Solution:
- Install NVIDIA drivers on the host
- Install the NVIDIA Container Toolkit:

  ```bash
  # Ubuntu/Debian
  distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
  curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
  curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
  sudo apt-get update && sudo apt-get install -y nvidia-docker2
  sudo systemctl restart docker
  ```

- Test:

  ```bash
  docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
  ```

- Restart SmarterRouter with `--gpus all` or `--compatibility`
Symptom: AMD APU (Ryzen AI, Radeon 800M series) detected with small VRAM instead of full unified memory.
Diagnosis:
```bash
# Check what's being reported
cat /sys/class/drm/card*/device/mem_info_vram_total
cat /sys/class/drm/card*/device/mem_info_gtt_total

# VRAM total should be small (< 8GB) for APUs
# GTT total should be large (~system RAM) for APUs
```

Explanation: APUs have two memory pools:

- VRAM (`mem_info_vram_*`): small BIOS carve-out (512MB-8GB)
- GTT (`mem_info_gtt_*`): unified memory pool (the actual usable GPU memory)

SmarterRouter auto-detects APUs (VRAM < 4GB) and uses GTT for total memory.
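The heuristic reads as follows in shell form. This is a sketch of the rule described above, not SmarterRouter's actual code; the 4 GB threshold comes from the text.

```shell
#!/bin/sh
# Sketch of the APU detection heuristic: a tiny VRAM carve-out means the
# device is an APU, so the GTT (unified) pool is the real usable GPU memory.
effective_gpu_mem_gb() {
  vram_gb="$1"  # mem_info_vram_total, in GB
  gtt_gb="$2"   # mem_info_gtt_total, in GB
  if [ "$vram_gb" -lt 4 ]; then
    echo "$gtt_gb"   # APU: report the unified memory pool
  else
    echo "$vram_gb"  # discrete GPU: report dedicated VRAM
  fi
}
```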
Solutions:

- Check the BIOS UMA Buffer - it should be set to the minimum (512MB-2GB):
  - A large UMA buffer wastes RAM and confuses detection
  - The GTT pool is the real usable memory, not VRAM
- Manual override if detection still fails:

  ```bash
  # In .env
  ROUTER_AMD_UNIFIED_MEMORY_GB=58  # ~90% of your RAM
  ```

- Verify the ROCm architecture (for gfx1150/gfx1151):

  ```bash
  export HSA_OVERRIDE_GFX_VERSION=11.5.1
  ```
Symptom: AMD GPU present but VRAM monitoring disabled.
Diagnosis:
```bash
# Check for rocm-smi
rocm-smi

# Check sysfs entries
ls /sys/class/drm/card*/device/mem_info_vram_total
```

Solutions:
- Install the ROCm runtime:

  ```bash
  # Ubuntu 22.04
  sudo amdgpu-install --usecase=rocm,graphics --no-dkms
  sudo usermod -a -G render,video $USER
  ```

- For Docker, ensure device passthrough:

  ```yaml
  devices:
    - /dev/kfd
    - /dev/dri
  ```

- Verify the AMD vendor ID in sysfs:

  ```bash
  cat /sys/class/drm/card*/device/vendor  # Should show: 0x1002 for AMD
  ```
Symptom: Intel Arc GPU present but VRAM monitoring disabled.
Diagnosis:
```bash
# Check for an Intel GPU with dedicated memory
ls /sys/class/drm/card*/device/lmem_total

# Check the driver is loaded
lsmod | grep i915
```

Solutions:
- Ensure Intel GPU drivers are installed (kernel 5.19+ recommended)
- For Docker, ensure device passthrough:

  ```yaml
  devices:
    - /dev/dri
  ```

- Verify the Intel vendor ID:

  ```bash
  cat /sys/class/drm/card*/device/vendor  # Should show: 0x8086 for Intel
  ```
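The vendor-ID checks for AMD and Intel above follow one pattern, sketched here as a lookup. The AMD (0x1002) and Intel (0x8086) IDs come from the text; 0x10de is NVIDIA's PCI vendor ID, added for completeness.

```shell
#!/bin/sh
# Illustrative: map a PCI vendor ID (as read from
# /sys/class/drm/card*/device/vendor) to a GPU vendor name.
gpu_vendor() {
  case "$1" in
    0x10de) echo "NVIDIA" ;;
    0x1002) echo "AMD" ;;
    0x8086) echo "Intel" ;;
    *)      echo "unknown" ;;
  esac
}
```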
Note: Only Intel Arc dedicated GPUs with local memory (lmem) are supported. Integrated Intel UHD/Iris GPUs use shared system memory and are not monitored.
Symptom: Apple Silicon detected but VRAM estimate is wrong.
Solutions:
- Manually set unified memory:

  ```bash
  # In .env
  ROUTER_APPLE_UNIFIED_MEMORY_GB=16  # for a 16GB Mac
  ```

- Apple Silicon VRAM is estimated as 75% of total RAM by default
- Run SmarterRouter natively on the macOS host (not in Docker) for accurate detection:

  ```bash
  system_profiler SPHardwareDataType | grep "Memory:"
  ```
Important: Docker Desktop on macOS cannot pass GPU to containers. You must run SmarterRouter on the host directly.
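The 75% default mentioned above works out as simple arithmetic (integer GB here, for illustration only):

```shell
#!/bin/sh
# Sketch of the default Apple Silicon estimate: 75% of total RAM is treated
# as usable GPU memory (truncated to whole GB in this sketch).
apple_vram_estimate_gb() {
  ram_gb="$1"
  echo $(( ram_gb * 75 / 100 ))
}
```

So a 16GB Mac is estimated at 12GB; set `ROUTER_APPLE_UNIFIED_MEMORY_GB` explicitly if that is too conservative or too generous for your workload.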
Causes:
- Model cold-start: First time loading a model into GPU memory
- VRAM pressure triggering model unload/reload
Solutions:
- Pin a small model for fast responses:

  ```bash
  ROUTER_PINNED_MODEL=phi3:mini
  ```

- Increase the VRAM allocation if possible:

  ```bash
  ROUTER_VRAM_MAX_TOTAL_GB=<your-gpu-total-minus-2>
  ```

- Accept that the first request is always slower; subsequent requests to the same model will be fast
Diagnosis:
- Check backend health: `curl http://localhost:11434/api/tags`
- Check SmarterRouter logs: `docker logs smarterrouter`
- Look for:
  - VRAM OOM errors
  - Model loading failures
  - Backend connection timeouts
- Check VRAM usage: `curl http://localhost:11436/admin/vram`
Common fixes:

- Reduce `ROUTER_VRAM_MAX_TOTAL_GB` if set too high
- Enable auto-unload: `ROUTER_VRAM_AUTO_UNLOAD_ENABLED=true`
- Free up GPU memory (stop other containers/processes)
- Use smaller models
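The auto-unload trigger amounts to a threshold comparison. This is a sketch of the behavior, not SmarterRouter's internal code; the default of 80 mirrors the `ROUTER_VRAM_UNLOAD_THRESHOLD_PCT` examples elsewhere in this guide.

```shell
#!/bin/sh
# Illustrative: decide whether VRAM pressure warrants unloading a model,
# given utilization percent and the configured threshold.
should_unload() {
  util_pct="$1"
  threshold="${2:-80}"  # assumed default, cf. ROUTER_VRAM_UNLOAD_THRESHOLD_PCT
  [ "$util_pct" -ge "$threshold" ] && echo "unload" || echo "keep"
}
```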
Diagnosis:
- Check model profiles:

  ```bash
  curl http://localhost:11436/admin/profiles
  ```

  - Are scores reasonable (0.0-1.0)?
  - Are all models 0.0? → Profiling failed

- Reprofile if needed:

  ```bash
  curl -X POST "http://localhost:11436/admin/reprofile?force=true" \
    -H "Authorization: Bearer your-admin-key"
  ```

- Use the explain endpoint to understand a routing decision:

  ```bash
  curl "http://localhost:11436/admin/explain?prompt=Your prompt here"
  ```

- Check complexity detection:
  - Simple prompts → smaller models preferred
  - Complex prompts → larger models required
- Adjust `ROUTER_QUALITY_PREFERENCE` if routing is always too aggressive or too conservative
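As a toy illustration of the complexity rule above (simple prompts → smaller models, complex prompts → larger models), and emphatically not the actual scoring algorithm:

```shell
#!/bin/sh
# Toy router: read "name size_b" lines on stdin and pick the smallest model
# for simple prompts or the largest for complex ones.
route_by_complexity() {
  complexity="$1"  # simple | complex
  if [ "$complexity" = "simple" ]; then
    sort -k2 -n | head -n1 | cut -d' ' -f1
  else
    sort -k2 -rn | head -n1 | cut -d' ' -f1
  fi
}
```

The real router also weighs profile scores and `ROUTER_QUALITY_PREFERENCE`; use `/admin/explain` to see the actual decision.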
Cause: `ROUTER_PROFILE_TIMEOUT` is too small for large models.

Solution:

- Increase the timeout in `.env`:

  ```bash
  ROUTER_PROFILE_TIMEOUT=180  # 3 minutes per prompt
  ```

- For very large models (14B+), consider:

  ```bash
  ROUTER_PROFILE_TIMEOUT=300  # 5 minutes per prompt
  ```

- Adaptive profiling should handle this automatically; check the logs for timeout errors
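The two suggested values above can be expressed as a size-based rule. The 14B breakpoint is taken from the text; treat this as a starting point, not a prescription.

```shell
#!/bin/sh
# Illustrative: pick a ROUTER_PROFILE_TIMEOUT (seconds) from the model's
# parameter count in billions, per the guidance above.
profile_timeout_s() {
  params_b="$1"
  if [ "$params_b" -ge 14 ]; then
    echo 300   # very large models: 5 minutes per prompt
  else
    echo 180   # everything else: 3 minutes per prompt
  fi
}
```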
Solutions:
- Unload unused models:

  ```bash
  # Manual trigger
  curl -X POST "http://localhost:11436/admin/cache/invalidate?all=true"
  ```

- Enable aggressive auto-unload:

  ```bash
  ROUTER_VRAM_AUTO_UNLOAD_ENABLED=true
  ROUTER_VRAM_UNLOAD_THRESHOLD_PCT=75
  ROUTER_VRAM_UNLOAD_STRATEGY=largest  # free the most memory first
  ```

- Reduce `ROUTER_VRAM_MAX_TOTAL_GB` to be more conservative
- Use smaller models
- Increase system swap (not ideal, but prevents crashes)
Causes:
- Model not discovered by backend
- Backend not running or accessible
- Profiling not complete
Check:
```bash
# 1. Backend connection
curl http://localhost:11434/api/tags

# 2. SmarterRouter logs for discovery
docker logs smarterrouter | grep -i "discover\|profile"

# 3. Router health
curl http://localhost:11436/health
```

Fix: Ensure the backend is running and `ROUTER_OLLAMA_URL` is correct.
Symptom: Response says "Model: llama3:70b" but you expected a different model.
Explanation: This is correct behavior! SmarterRouter selected that model via its routing algorithm. To see why:

```bash
curl "http://localhost:11436/admin/explain?prompt=your prompt"
```

Cause: The routing cache is preventing re-evaluation.

Solution: Invalidate the cache:

```bash
curl -X POST "http://localhost:11436/admin/cache/invalidate?all=true"
```

Or disable caching temporarily:

```bash
ROUTER_CACHE_ENABLED=false
```

Checklist:
- Cold start? The first request to a model is slow while it loads; subsequent requests are faster.
- VRAM pressure? Check `/admin/vram` - if utilization is >90%, models may be unloading/reloading.
- Cache enabled? `ROUTER_CACHE_ENABLED=true` speeds up repeat requests.
- Pinned model? Set `ROUTER_PINNED_MODEL=phi3:mini` for instant responses to simple queries.
- Profiling complete? If profiles are missing or all scores are 0.0, the router may be making poor decisions.
Quick wins:
```bash
# Pin a small model
ROUTER_PINNED_MODEL=phi3:mini

# Increase VRAM limit
ROUTER_VRAM_MAX_TOTAL_GB=22.0

# Enable caching
ROUTER_CACHE_ENABLED=true
ROUTER_CACHE_MAX_SIZE=1000
```

Diagnose:

```bash
curl http://localhost:11436/admin/vram | jq .
```

Look for:

- `loaded_models` array showing which models are in memory
- `utilization_pct` close to 100%
- `warnings` array
Fix:
- Enable auto-unload (if not already):

  ```bash
  ROUTER_VRAM_AUTO_UNLOAD_ENABLED=true
  ROUTER_VRAM_UNLOAD_THRESHOLD_PCT=80
  ROUTER_VRAM_UNLOAD_STRATEGY=largest
  ```

- Manually unload specific models via the admin API (coming soon) or restart
- Reduce `ROUTER_VRAM_MAX_TOTAL_GB` to be more conservative
- Use smaller models (7B instead of 70B)
- Use quantized models (Q4_K_M, Q5_K_S)
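Quantization helps because weight size scales with bits per parameter: roughly params × bits / 8 bytes for the weights alone. This back-of-the-envelope sketch (truncated to whole GB, ignoring KV cache and activation overhead) is an approximation, not SmarterRouter's internal accounting.

```shell
#!/bin/sh
# Rough weight-size estimate: a 7B model at Q4 needs about 3.5 GB of weights
# (this integer sketch truncates to 3), versus 14 GB at 16-bit.
model_weight_gb() {
  params_b="$1"  # parameters, in billions
  bits="$2"      # bits per weight (4 for Q4, 5 for Q5, 16 for fp16)
  echo $(( params_b * bits / 8 ))
}
```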
Check the logs:

```bash
docker logs smarterrouter
```

Common causes:

- Invalid `.env` syntax (use quotes for values with spaces)
- Missing required files
- Port conflict (11436 already in use)
- Backend not accessible at startup

Fix: Read the error message from the logs and adjust the configuration accordingly.
Symptoms: "Failed to list models: All connection attempts failed"
Solutions:
- Both on the host (not Docker):

  ```bash
  ROUTER_OLLAMA_URL=http://localhost:11434
  ```

- SmarterRouter in Docker, Ollama on the host (Linux):

  ```bash
  ROUTER_OLLAMA_URL=http://172.17.0.1:11434
  ```

- Both in Docker Compose:

  ```yaml
  services:
    ollama:
      image: ollama/ollama
      ports:
        - "11434:11434"
    smarterrouter:
      build: .
      environment:
        - ROUTER_OLLAMA_URL=http://ollama:11434  # use the service name
  ```

- Test connectivity from inside the container:

  ```bash
  docker exec smarterrouter curl http://172.17.0.1:11434/api/tags
  ```
Symptom: Database resets on container restart.
Check:
```bash
# Verify the volume is mounted
docker inspect smarterrouter | grep -A 10 "Mounts"
```

Ensure docker-compose.yml has:

```yaml
services:
  smarterrouter:
    volumes:
      - ./router.db:/app/router.db  # or ./data:/app/data
```

Note: If router.db doesn't exist on the host before the first start, Docker will create it as a directory and the container will fail (see the "Database path is a directory" error above). Create an empty file first (`touch router.db`), or use a directory mapping like `./data:/app/data`.
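A small pre-flight check along the lines of the note above (an illustrative sketch, not shipped with SmarterRouter):

```shell
#!/bin/sh
# Create router.db as an empty file before the first `docker-compose up`,
# so Docker bind-mounts a file instead of creating a directory.
ensure_db_file() {
  db="$1"
  [ -e "$db" ] || : > "$db"  # create an empty file if nothing exists yet
  [ -f "$db" ]               # fail if the path is a directory (bad mount leftover)
}
```

Run it as `ensure_db_file ./router.db` from the compose directory; a nonzero exit means a directory is already in the way and must be removed first.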
Enable verbose logging for investigation:
```bash
# In .env
ROUTER_LOG_LEVEL=DEBUG
ROUTER_LOG_FORMAT=json  # easier to parse programmatically
```

View logs with request tracking:

```bash
# Make a request with a tracking ID
curl -H "X-Request-ID: debug-123" http://localhost:11436/v1/models

# Follow the logs
docker logs -f smarterrouter | grep "debug-123"

# Or, for JSON logs, use jq
docker logs smarterrouter 2>&1 | jq 'select(.request_id=="debug-123")'
```

Check the detailed request logs (if enabled with the --enable-request-logging flag):

```
logs/detailed_logs/<timestamp>/
├── request.json            # Full request payload
├── response.json           # Full response or error
├── streaming_chunks.jsonl  # SSE chunks if streaming
└── metadata.json           # Timing, model selected, scores
```
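If `jq` isn't available, a single JSON string field can be pulled out with `sed`. This is a convenience sketch; the `model` key is an assumption about the metadata.json schema, so check your actual log files.

```shell
#!/bin/sh
# Illustrative: extract the value of a string-valued JSON key from stdin
# without jq (handles simple, one-line JSON only).
json_field() {
  key="$1"
  sed -n "s/.*\"${key}\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}
```

Usage sketch: `json_field model < logs/detailed_logs/<timestamp>/metadata.json`.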
- Check the logs - most issues are clearly explained there
- Run a health check: `curl http://localhost:11436/health`
- Verify backend connectivity: `curl http://localhost:11434/api/tags`
- Open an issue - include:
  - SmarterRouter version (from the git commit or image tag)
  - Full error messages
  - Relevant log snippets
  - Steps to reproduce
  - Your configuration (redact API keys!)