@blefo blefo commented Nov 28, 2025

New AI model services and orchestration:

  • Added qwen3_coder_30b_gpu service to docker-compose.nilai-router-1.yml for deploying the Qwen3-Coder-30B-A3B-Instruct model with GPU support and health checks.
  • Added gpt_oss_20b_gpu and qwen3_thinking_4b_gpu services to docker-compose.nilai-router-2.yml for deploying the GPT-OSS-20B and Qwen3-4B-Thinking-2507 models, including inter-service health dependencies.
  • Added arch_router_1_5b_gpu and qwen3_vl_4b_gpu services to docker-compose.nilai-router-3.yml for deploying the Arch-Router-1.5B and Qwen3-VL-4B-Instruct models, with multimodal support and health dependencies.
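As a rough illustration of the pattern described above (a GPU-backed service with a health check, plus a second service that waits on it), a Compose fragment might look like the sketch below. Only the two service names come from this PR. The image, the `MODEL_NAME` variable, the port, the `/health` endpoint, the timing values, and the direction of the dependency are all assumptions made for illustration, not the contents of the actual compose files.

```yaml
# Illustrative sketch only. Service names are from the PR description;
# everything else (image, env var, port, endpoint, timings) is hypothetical.
services:
  gpt_oss_20b_gpu:
    image: example/model-server:latest      # hypothetical image
    environment:
      MODEL_NAME: "GPT-OSS-20B"             # hypothetical variable name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia                # reserve one NVIDIA GPU
              count: 1
              capabilities: [gpu]
    healthcheck:
      # Assumes the server exposes an HTTP health endpoint on port 8000
      # and that curl is available inside the image.
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s                     # allow time for model loading

  qwen3_thinking_4b_gpu:
    image: example/model-server:latest      # hypothetical image
    environment:
      MODEL_NAME: "Qwen3-4B-Thinking-2507"  # hypothetical variable name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    depends_on:
      gpt_oss_20b_gpu:
        condition: service_healthy          # start only once the other service is healthy
```

With `condition: service_healthy`, Compose delays starting the dependent container until the other service's health check passes, which is one way to stagger model loading on a shared GPU host.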

@blefo blefo marked this pull request as ready for review November 28, 2025 12:58
