feat(shared): add moonshot model family support#2430

Open
narcilee7 wants to merge 1 commit into web-infra-dev:main from narcilee7:feat/moonshot-model-support

Conversation

@narcilee7
Contributor

Summary

Add support for Moonshot AI (Kimi) vision models in Midscene.js.

Changes

  • Add moonshot to TModelFamily type and MODEL_FAMILY_VALUES array
  • Add MIDSCENE_USE_MOONSHOT legacy environment variable for quick configuration
  • Add moonshot mapping in legacyConfigToModelFamily
  • Add unit tests for model family validation and legacy config parsing
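
The changes above can be pictured with a minimal sketch. This is not the actual `@midscene/shared` source: the real `MODEL_FAMILY_VALUES` array lists more families, and the helper names `isModelFamily` and `resolveLegacyFamily`, the flag table, and the rule for which flag values count as enabled are all assumptions made for illustration (`resolveLegacyFamily` is a hypothetical stand-in for `legacyConfigToModelFamily`, whose real signature is not shown in this PR).

```typescript
// Illustrative sketch only; the real definitions contain more families
// and may be structured differently.
const MODEL_FAMILY_VALUES = ['moonshot' /* , ...existing families omitted */] as const;
type TModelFamily = (typeof MODEL_FAMILY_VALUES)[number];

// Assumed shape: legacy flag name -> model family it selects.
const LEGACY_FLAG_TO_FAMILY: Record<string, TModelFamily> = {
  MIDSCENE_USE_MOONSHOT: 'moonshot',
};

// Hypothetical validator: narrows an arbitrary string to a known family.
function isModelFamily(value: string): value is TModelFamily {
  return (MODEL_FAMILY_VALUES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical stand-in for legacyConfigToModelFamily.
// Assumption: a flag is "on" when set to anything other than '', '0' or 'false'.
function resolveLegacyFamily(
  env: Record<string, string | undefined>,
): TModelFamily | undefined {
  for (const flag of Object.keys(LEGACY_FLAG_TO_FAMILY)) {
    const raw = env[flag];
    if (raw !== undefined && raw !== '' && raw !== '0' && raw !== 'false') {
      return LEGACY_FLAG_TO_FAMILY[flag];
    }
  }
  return undefined;
}
```

Under these assumptions, `resolveLegacyFamily({ MIDSCENE_USE_MOONSHOT: '1' })` yields `'moonshot'`, and an environment without the flag yields `undefined`.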

Why

The Moonshot API is fully OpenAI-compatible, so it can reuse Midscene's existing VLM calling chain. Registering it as a first-class modelFamily still has three benefits:

  1. Planning prompt includes bbox: When modelFamily is set, the planning prompt correctly describes the locate param as {bbox: [xmin, ymin, xmax, ymax], prompt: string}, allowing the model to return coordinates directly in the planning phase.
  2. Avoids extra locate API call: Without a model family, the planning prompt only asks for {prompt: string}, forcing a separate locate call afterward.
  3. Clean user configuration: Users can explicitly set MIDSCENE_MODEL_FAMILY=moonshot or MIDSCENE_USE_MOONSHOT=1.
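
The two locate shapes described in points 1 and 2 can be sketched as types. The type names and the `needsSeparateLocateCall` helper are hypothetical and exist only to illustrate the difference; Midscene's real planning types are not shown in this PR.

```typescript
// Shape described for the case where modelFamily is set:
// the planner returns coordinates together with the prompt.
type LocateWithBbox = {
  bbox: [number, number, number, number]; // [xmin, ymin, xmax, ymax]
  prompt: string;
};

// Shape described for the case where no model family is configured.
type LocatePromptOnly = { prompt: string };

type LocateParam = LocateWithBbox | LocatePromptOnly;

// Hypothetical helper: a follow-up locate request is only needed
// when the planning output carries no bbox.
function needsSeparateLocateCall(locate: LocateParam): boolean {
  return !('bbox' in locate);
}
```

With a bbox present the helper returns `false`, which corresponds to skipping the extra locate API call.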

Usage

export MIDSCENE_MODEL_FAMILY="moonshot"
export MIDSCENE_MODEL_NAME="kimi-k2.5"
export MIDSCENE_MODEL_BASE_URL="https://api.moonshot.cn/v1"
export MIDSCENE_MODEL_API_KEY="your-api-key"

Or via legacy flag:

export MIDSCENE_USE_MOONSHOT=1
export MIDSCENE_MODEL_NAME="kimi-k2.5"
export MIDSCENE_MODEL_BASE_URL="https://api.moonshot.cn/v1"
export MIDSCENE_MODEL_API_KEY="your-api-key"

Validation

  • pnpm run lint
  • npx nx test @midscene/shared (325 passed)
  • npx nx test @midscene/core (817 passed)

Add support for Moonshot AI (Kimi) vision models:
- Add 'moonshot' to TModelFamily and MODEL_FAMILY_VALUES
- Add MIDSCENE_USE_MOONSHOT legacy env variable
- Add unit tests for model family validation and config parsing

Moonshot API is OpenAI-compatible, so it reuses the existing VLM
calling chain with standard 0-1000 normalized bbox coordinates.

Validation:
- pnpm run lint
- npx nx test @midscene/shared
- npx nx test @midscene/core
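
The commit message refers to 0-1000 normalized bbox coordinates. As a rough sketch of what that convention implies, the function below maps such a box onto screenshot pixels. The function name and the rounding choice are assumptions, and this is not Midscene's actual conversion code.

```typescript
type Bbox = [number, number, number, number]; // [xmin, ymin, xmax, ymax]

// Hypothetical helper: scale a 0-1000 normalized box to pixel coordinates
// for a screenshot of the given width and height.
function normalizedBboxToPixels(bbox: Bbox, width: number, height: number): Bbox {
  const [xmin, ymin, xmax, ymax] = bbox;
  return [
    Math.round((xmin / 1000) * width),
    Math.round((ymin / 1000) * height),
    Math.round((xmax / 1000) * width),
    Math.round((ymax / 1000) * height),
  ];
}

// Example: a box over the lower middle of a 1920x1080 screenshot.
const pixels = normalizedBboxToPixels([250, 500, 750, 1000], 1920, 1080);
// pixels is [480, 540, 1440, 1080]
```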
@hukz37

hukz37 commented May 17, 2026

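@narcilee7 Could I ask how well this works in practice once the configuration is added? I gave it a test run and it recognized almost nothing. Is there anything else that needs to be configured?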
