ci(model): validate GLM provider via real API calls and add mock unit tests #1472

Open

Jinhaooo wants to merge 1 commit into agentscope-ai:v2_dev from Jinhaooo:ci/test-glm-chat-model

Conversation

@Jinhaooo Jinhaooo commented Apr 13, 2026

AgentScope Version

2.0.0

Description

Background

Partial contribution to #1447. The chat model interfaces have been adjusted in 2.0.0 and need to be validated via real API calls before building mock-based unit tests. This PR covers the GLM (Zhipu AI) provider, which is accessed through OpenAIChatModel as an OpenAI-compatible endpoint.

Validation Results (Real API Calls)

| Dimension | Model | Result | Notes |
| --- | --- | --- | --- |
| Non-streaming | glm-4-flash | ✅ PASSED | Response structure and usage tokens correct |
| Streaming | glm-4-flash | ✅ PASSED | Chunks iterated and assembled correctly |
| Reasoning model | glm-4.7-flash | ✅ PASSED | `enable_thinking` passthrough works; `reasoning_content` parsed as `ThinkingBlock` |
| Multimodal input | glm-4v-flash | ✅ PASSED | Image URL input accepted, text description returned |
| Multimodal output | — | ⏭ SKIPPED | GLM does not support multimodal output |

Conclusion: OpenAIChatModel works out of the box with GLM. No framework-level issues found. Other GLM models (glm-5, glm-5.1, etc.) share the same OpenAI-compatible protocol and are expected to behave identically.
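To illustrate the reasoning-model behavior in the table above: GLM reasoning models return the chain of thought in a `reasoning_content` field alongside the usual `content`. The sketch below shows how such a message could be split into thinking and text blocks. It is illustrative only, assuming the field names from the GLM response schema; `message_to_blocks` and `Block` are hypothetical helpers, not AgentScope code.

```python
# Hypothetical helper: split a GLM (OpenAI-compatible) assistant message
# into typed content blocks. Field names "reasoning_content" and "content"
# follow the GLM response schema; the helper itself is a sketch.
from typing import TypedDict


class Block(TypedDict):
    type: str  # "thinking" or "text"
    text: str


def message_to_blocks(message: dict) -> list[Block]:
    """Convert a raw GLM assistant message into content blocks."""
    blocks: list[Block] = []
    # Reasoning models put their chain of thought in reasoning_content.
    reasoning = message.get("reasoning_content")
    if reasoning:
        blocks.append({"type": "thinking", "text": reasoning})
    content = message.get("content")
    if content:
        blocks.append({"type": "text", "text": content})
    return blocks


# Example: a message shaped like a glm-4.7-flash response.
msg = {
    "role": "assistant",
    "reasoning_content": "The user asked for 2 + 2, which is 4.",
    "content": "The answer is 4.",
}
print(message_to_blocks(msg))
```

A plain-text message (no `reasoning_content`) yields a single text block, which matches the non-streaming rows in the table.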

Changes

  • scripts/smoke_test/test_glm_chat_model.py — real API validation script (requires ZAI_API_KEY)
  • scripts/smoke_test/fixtures/glm_*.json — captured real API responses as mock data source
  • tests/unit/test_glm_chat_model_mock.py — mock unit tests for CI (no API key needed)
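The mock-test approach described above can be sketched as follows: replay a captured response through a mocked OpenAI-compatible client so the test needs no API key or network access. The fixture dict and client shape here are assumptions for illustration, not the PR's actual fixture files or test code.

```python
# Illustrative mock unit test in the spirit of
# tests/unit/test_glm_chat_model_mock.py: a mocked client returns a
# captured GLM response, so no API key or network access is needed.
import json
from unittest import mock

# Stand-in for a captured fixture like scripts/smoke_test/fixtures/glm_*.json.
FIXTURE = json.loads("""
{
  "id": "chatcmpl-123",
  "model": "glm-4-flash",
  "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
  "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12}
}
""")


def test_non_streaming_reply_from_fixture():
    client = mock.Mock()
    # The mock replays the fixture instead of calling the real API.
    client.chat.completions.create.return_value = FIXTURE

    response = client.chat.completions.create(
        model="glm-4-flash",
        messages=[{"role": "user", "content": "Hi"}],
    )

    assert response["choices"][0]["message"]["content"] == "Hello!"
    assert response["usage"]["total_tokens"] == 12
    client.chat.completions.create.assert_called_once()


test_non_streaming_reply_from_fixture()
```

Because the fixture is a verbatim capture of a real response, the mock test exercises the same parsing paths as the live smoke test without spending tokens.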

How to Test

```shell
# Smoke tests (requires real API key)
ZAI_API_KEY=your-key pytest scripts/smoke_test/test_glm_chat_model.py -v

# Unit tests (no API key needed)
pytest tests/unit/test_glm_chat_model_mock.py -v
```

Checklist

  • Code has been formatted with the pre-commit run --all-files command
  • All tests are passing
  • Docstrings are in Google style
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

@DavdGao
Member

DavdGao commented Apr 17, 2026

@Jinhaooo Thank you for your contribution to 2.0! However, please note that our active development branch for 2.0 is v2_dev rather than main. Could you rebase and redirect this PR to v2_dev?

Additionally, regarding the test implementation — would you consider placing it under a new scripts/smoke_test/ directory in the project root, rather than in the tests/ directory?

The reason is that tests placed in tests/ may be unintentionally triggered by developers who have the required environment variables set locally, without realizing that running these tests would consume API tokens and potentially incur unexpected costs. By moving it to scripts/smoke_test/, it becomes much clearer to anyone running the script that it is intended for live API verification rather than routine unit testing, which helps avoid accidental and unnecessary token usage.

@DavdGao DavdGao self-requested a review April 17, 2026 03:54
Force-pushed commit message: ci(model): validate GLM provider via real API calls and add mock unit tests (Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>)
@Jinhaooo Jinhaooo force-pushed the ci/test-glm-chat-model branch from 10fa5d6 to af14646 Compare April 20, 2026 02:54
@Jinhaooo Jinhaooo changed the base branch from main to v2_dev April 20, 2026 03:01
@Jinhaooo
Author

@DavdGao Thanks for the review! I've rebased onto v2_dev and moved the integration tests and fixtures to scripts/smoke_test/. The mock unit tests remain in tests/ since they don't require API keys or consume any tokens — they're built from captured fixtures and are safe for routine CI runs. Let me know if you'd prefer those moved as well.

