feat: add a standalone vision model configuration with fallback to the main LLM by whyovo · Pull Request #2 · open-vela/packages_ai_agent

whyovo · 2026-04-14T16:27:53Z

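Overview

Adds a set_vision_llm command that lets you configure a dedicated vision model for image analysis while keeping the text model for conversation. For example, text can use mimo-v2-flash and only images use omni, which lowers cost. Following miloco's approach, a vision model such as miloco-vl-7b can also be deployed on your own server, which keeps images private while text still goes to a faster, higher-performance API. When no vision (multimodal) model is configured, or after it is removed with set_vision_llm clear, image understanding continues to use the original text interface, so existing behavior is unchanged.
Fixes the mimo preset model name, changing MiMo-v2-Flash to mimo-v2-flash (all lowercase) as required by api.xiaomimimo.com.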

Changed files

| File | Changes |
| --- | --- |
| `include/agent_config.h` | Adds the `AGENT_CFG_KEY_VISION_MODEL/HOST/API_KEY` config keys |
| `src/llm/llm_proxy.c` | Adds vision static variables, `llm_snapshot_vision_config()` (with fallback), and the `llm_set_vision_model()` setter |
| `src/llm/llm_proxy.h` | Declares `llm_snapshot_vision_config` and `llm_set_vision_model` |
| `src/llm/llm_vision.c` | `llm_chat_vision` and `llm_chat_vision_raw` now use the vision-specific config |
| `src/channels/cmd_llm.c` | Adds `cmd_set_vision_llm` (with four presets: mimo/openai/qwen/glm) and fixes the mimo model name |
| `src/channels/cmd_llm.h` | Declares `cmd_set_vision_llm` |
| `src/channels/nsh_commands.c` | Registers the command, updates the help text and the config_show output |

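How it works

When vision_model is not configured: vision calls (analyze_image, camera_capture) automatically fall back to the main LLM configuration, so the change is fully backward compatible.
When vision_model is configured: vision calls use the independent configuration (a different host, model, and API key can be specified).
Conversation always uses the main LLM and is not affected.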
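
The fallback rule above can be sketched roughly as follows. This is only an illustration, not the code from `llm_proxy.c`: the `llm_config_t` struct, its field sizes, the function names, the placeholder key and host strings, and the choice to fall back on the whole config (rather than field by field) when the vision model is empty are all assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical config record; field names and sizes are assumptions. */
typedef struct {
    char model[64];
    char host[128];
    char api_key[128];
} llm_config_t;

static pthread_mutex_t s_lock = PTHREAD_MUTEX_INITIALIZER;

/* Main (text) LLM config; the key is a placeholder. */
static llm_config_t s_main = { "mimo-v2-flash", "api.xiaomimimo.com", "main-key" };

/* Zero-initialized: an empty model means "vision not configured". */
static llm_config_t s_vision;

/* Store a vision config; NULL arguments clear the corresponding field. */
static void set_vision_config(const char *model, const char *host,
                              const char *api_key)
{
    pthread_mutex_lock(&s_lock);
    snprintf(s_vision.model, sizeof(s_vision.model), "%s", model ? model : "");
    snprintf(s_vision.host, sizeof(s_vision.host), "%s", host ? host : "");
    snprintf(s_vision.api_key, sizeof(s_vision.api_key), "%s",
             api_key ? api_key : "");
    pthread_mutex_unlock(&s_lock);
}

/* Copy the config that vision calls should use into *out.
 * Returns 1 if the dedicated vision config was used, 0 on fallback. */
static int snapshot_vision_config(llm_config_t *out)
{
    assert(out != NULL);
    pthread_mutex_lock(&s_lock);
    int dedicated = s_vision.model[0] != '\0';
    *out = dedicated ? s_vision : s_main; /* copy taken under the lock */
    pthread_mutex_unlock(&s_lock);
    return dedicated;
}
```

Taking a copy under the lock means the caller can build its HTTP request from a consistent snapshot even if another thread changes the configuration in the meantime.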

Usage

vela> set_vision_llm mimo <api_key>         # use Xiaomi mimo-v2-omni for vision
vela> set_vision_llm openai <api_key>       # use OpenAI gpt-4o for vision
vela> set_vision_llm qwen <api_key>         # use Tongyi Qianwen qwen-vl-max for vision
vela> set_vision_llm clear                  # clear the config and return to the main LLM
vela> config_show                           # shows Vision Model/Host/Key
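
One way a preset name could be resolved to a model is a small lookup table, sketched below. The struct and function names are hypothetical, and only the three model names shown in the usage lines are listed; the glm preset exists in the PR, but its model name is not given here, so it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical preset entry: command-line name -> vision model name. */
typedef struct {
    const char *name;
    const char *model;
} vision_preset_t;

/* Model names taken from the usage lines above. */
static const vision_preset_t k_presets[] = {
    { "mimo",   "mimo-v2-omni" },
    { "openai", "gpt-4o" },
    { "qwen",   "qwen-vl-max" },
};

/* Return the model for a preset name, or NULL if the name is unknown. */
static const char *vision_preset_model(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(k_presets) / sizeof(k_presets[0]); i++) {
        if (strcmp(k_presets[i].name, name) == 0) {
            return k_presets[i].model;
        }
    }
    return NULL;
}
```

A command handler built this way would treat `clear` as a separate case before the lookup and print the help text when the lookup returns NULL.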

github-actions · 2026-04-14T16:28:09Z

❌ CLA Signature Required

@whyovo Some contributors need to sign the CLA:

1914457309@qq.com ❌ Needs to sign CLA

Please:

Sign the CLA at: https://www.openvela.com/#/community/cla
After signing, comment /check-cla to recheck

📋 View detailed check results: Action Run #24410717863

💡 Tip: All contributors must sign the CLA before the PR can be merged.

whyovo · 2026-04-14T16:35:22Z

/check-cla

github-actions · 2026-04-14T16:35:38Z

✅ CLA Verification Complete

@whyovo All contributors have signed the CLA!

1914457309@qq.com ✅

📋 View detailed check results: Action Run #24411063506

Your pull request can now proceed with the review process! 🎉

Add set_vision_llm command allowing users to configure a separate vision-capable model (e.g. mimo-v2-omni, gpt-4o, qwen-vl-max) for image analysis while keeping a cheaper text model for chat. When vision_model is not configured, vision calls automatically fall back to the main LLM config, maintaining full backward compatibility.

Changes:
- agent_config.h: add AGENT_CFG_KEY_VISION_* config keys
- llm_proxy.c: add vision static vars, snapshot, setter with fallback
- llm_proxy.h: declare llm_snapshot_vision_config, llm_set_vision_model
- llm_vision.c: use vision-specific config in both vision entry points
- cmd_llm.c: add cmd_set_vision_llm with 4 presets (mimo/openai/qwen/glm)
- nsh_commands.c: register command, update help text and config_show

Also fix mimo preset model name from MiMo-v2-Flash to mimo-v2-flash (lowercase) as required by api.xiaomimimo.com.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

skyxiaobai · 2026-04-15T02:52:10Z

+        strncpy(s_vision_host, host, sizeof(s_vision_host) - 1);
+        s_vision_host[sizeof(s_vision_host) - 1] = '\0';
+    } else {
+        config_del(AGENT_CFG_KEY_VISION_HOST);


config_del is not a public API; please use claw_config_set instead.

skyxiaobai · 2026-04-15T02:53:47Z

+
+    pthread_mutex_unlock(&s_llm_lock);
+
+    syslog(LOG_INFO, "[%s] Vision LLM config updated: model=%s host=%s\n",


The syslog call should be moved before the unlock.

- Replace config_del with claw_config_set(key, "") per review feedback, using the public API instead of the non-public config_del function
- Move syslog before pthread_mutex_unlock to avoid reading s_vision_model/s_vision_host after lock release

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
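
The two review fixes can be illustrated with a minimal sketch. It is not the merged code: `demo_config_set` is a hypothetical stand-in for `claw_config_set` (whose real signature is not shown in this thread), the `"vision_host"` key string is made up, and the log line is written into a buffer instead of calling `syslog` so the example stays self-contained.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t s_llm_lock = PTHREAD_MUTEX_INITIALIZER;
static char s_vision_model[64];
static char s_vision_host[128];

/* Stand-ins for the persistent store and for the syslog output. */
static char s_stored_host[128] = "stale-host";
static char s_last_log[256];

/* Hypothetical stand-in for claw_config_set(key, value). */
static int demo_config_set(const char *key, const char *value)
{
    (void)key; /* a single stored value is enough for this sketch */
    snprintf(s_stored_host, sizeof(s_stored_host), "%s", value);
    return 0;
}

/* Update the vision model/host; an empty or NULL host clears the host. */
static void demo_set_vision_model(const char *model, const char *host)
{
    assert(model != NULL);
    pthread_mutex_lock(&s_llm_lock);

    snprintf(s_vision_model, sizeof(s_vision_model), "%s", model);
    if (host != NULL && host[0] != '\0') {
        snprintf(s_vision_host, sizeof(s_vision_host), "%s", host);
        demo_config_set("vision_host", host);
    } else {
        s_vision_host[0] = '\0';
        /* Clear by writing an empty value through the public setter. */
        demo_config_set("vision_host", "");
    }

    /* Format the log line while the lock is still held, so the shared
     * buffers cannot change between the update and the read. The real
     * code would call syslog(LOG_INFO, ...) at this point. */
    snprintf(s_last_log, sizeof(s_last_log),
             "Vision LLM config updated: model=%s host=%s",
             s_vision_model, s_vision_host);

    pthread_mutex_unlock(&s_llm_lock);
}
```

Logging under the lock keeps the mutex held slightly longer; an alternative with the same safety would be to copy the two strings into locals under the lock and log after unlocking.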

whyovo · 2026-04-15T15:25:28Z

Thank you for your feedback! I've updated the code accordingly.

whyovo requested review from TangMeng12 and skyxiaobai as code owners April 14, 2026 16:27

whyovo force-pushed the dev branch from 4667bb9 to f5bd168 Compare April 14, 2026 16:38

whyovo force-pushed the dev branch from f5bd168 to 433bd0b Compare April 14, 2026 16:39

skyxiaobai reviewed Apr 15, 2026

whyovo force-pushed the dev branch from 2f6a0a3 to 433bd0b Compare April 15, 2026 15:17

skyxiaobai approved these changes Apr 16, 2026

TangMeng12 approved these changes Apr 16, 2026

TangMeng12 merged commit 2b48a9d into open-vela:dev Apr 16, 2026
5 of 8 checks passed

This was referenced Apr 16, 2026

[BUG] <title>Incorrect mimo model name at line 50 of cmd_llm.c #1

Closed

[Build/LLM] Incorrect build command in documentation, CMake conflict, and LLM unable to respond properly #3

Closed

TangMeng12 merged 2 commits into open-vela:dev from whyovo:dev
