Release v1.22.0 #3204
Co-authored-by: openhands <openhands@all-hands.dev>
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
all-hands-bot
left a comment
🟡 Acceptable with required fixes
Version Bumps
The version bumps are mechanically correct and consistent across all four packages (openhands-sdk, openhands-tools, openhands-workspace, openhands-agent-server) from 1.21.1 → 1.22.0. The eval workflow default is also updated correctly.
[CRITICAL ISSUES]
Deprecation Deadlines
The deprecation checker fails with 4 features that have passed their removal deadline:
$ python .github/scripts/check_deprecations.py
- [openhands-sdk] 'AgentSettings' (warn_call)
deprecated in: 1.17.0, removed in: 1.22.0
- [openhands-sdk] 'VerificationSettings.confirmation_mode' (warn_call)
deprecated in: 1.17.0, removed in: 1.22.0
- [openhands-sdk] 'VerificationSettings.security_analyzer' (warn_call)
deprecated in: 1.17.0, removed in: 1.22.0
- [openhands-sdk] f'Importing {name!r} from openhands.sdk.settings' (warn_call)
deprecated in: 1.19.0, removed in: 1.22.0
Required action: These deprecations must be addressed before merging. The checklist item "Fix any deprecation deadlines if they exist" is currently unchecked.
SDK Policy Violation
The LLMAgentSettings import deprecation (deprecated in 1.19.0, removed in 1.22.0) only spans 3 minor releases, but the SDK policy requires at least 5 minor releases between deprecation and removal.
Recommendation: Either:
- Update the removed_in target to "1.24.0" (which would be 5 releases: 1.19→1.20→1.21→1.22→1.23→1.24), or
- Remove the deprecation from this release and address it in a future version
The other three deprecations (1.17.0 → 1.22.0) correctly span 5 minor releases and are valid for removal.
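The span arithmetic behind this recommendation can be sketched in a few lines. This is an illustration only, not the logic of check_deprecations.py: the helper names `minor_span` and `meets_policy` are hypothetical, the 5-release minimum is taken from the policy statement above, and the sketch assumes both versions share one major version.

```python
MIN_MINOR_RELEASES = 5  # minimum gap quoted in the review above


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def minor_span(deprecated_in: str, removed_in: str) -> int:
    """Count the minor releases between deprecation and removal."""
    dep = parse_version(deprecated_in)
    rem = parse_version(removed_in)
    if dep[0] != rem[0]:
        # Assumption: cross-major comparisons are out of scope for this sketch.
        raise ValueError("versions must share the same major version")
    return rem[1] - dep[1]


def meets_policy(deprecated_in: str, removed_in: str) -> bool:
    """True when the removal target leaves at least the minimum gap."""
    return minor_span(deprecated_in, removed_in) >= MIN_MINOR_RELEASES


if __name__ == "__main__":
    for dep, rem in [("1.17.0", "1.22.0"), ("1.19.0", "1.22.0"), ("1.19.0", "1.24.0")]:
        print(f"{dep} -> {rem}: span={minor_span(dep, rem)} ok={meets_policy(dep, rem)}")
```

Under these assumptions, 1.17.0 → 1.22.0 and 1.19.0 → 1.24.0 both give a span of 5, while 1.19.0 → 1.22.0 gives 3.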
[RISK ASSESSMENT]
This is a standard release version bump with no code changes. However, the unresolved deprecation deadlines pose a breaking change risk if not addressed before merge. The mechanical version changes themselves are low-risk.
VERDICT:
✅ Worth merging after fixes: Version bumps are correct, but deprecation deadlines must be addressed per the checklist and SDK policy.
KEY INSIGHT:
Release PRs should run the deprecation checker (python .github/scripts/check_deprecations.py) as part of the checklist to catch scheduled removals before publishing.
all-hands-bot
left a comment
❌ QA Report: FAIL
Version bump to 1.22.0 is complete across all packages, but deprecation deadline check is failing — blocking release.
Does this PR achieve its stated goal?
No. The PR's goal is to "prepare the release for version 1.22.0", but the deprecation check CI is failing with 3 deprecated features that have reached their removal deadline in 1.22.0. According to the PR checklist, "Fix any deprecation deadlines if they exist" is a required step, and this has not been completed.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Packages build successfully |
| CI Status | ❌ Deprecation check failing + other checks pending |
| Functional Verification | ✅ All 4 packages versioned to 1.22.0, imports work |
Functional Verification
Test 1: Version Consistency Check
Step 1 — Verify version in source files:
Ran:
grep -E "^version = " openhands-*/pyproject.toml
Output:
openhands-agent-server/pyproject.toml:version = "1.22.0"
openhands-sdk/pyproject.toml:version = "1.22.0"
openhands-tools/pyproject.toml:version = "1.22.0"
openhands-workspace/pyproject.toml:version = "1.22.0"
This confirms all 4 packages declare version 1.22.0 in their pyproject.toml files.
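For a scripted variant of the same check, a rough Python equivalent of the grep above might look like the following. The function names are hypothetical, the regex mirrors the grep pattern and does not parse TOML properly, and the demo builds a throwaway directory tree instead of reading the real repository.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the grep: a line starting with `version = "..."`.
VERSION_LINE = re.compile(r'^version = "([^"]+)"', re.MULTILINE)


def collect_versions(root: Path) -> dict[str, str]:
    """Map each openhands-* package directory to its first declared version."""
    versions = {}
    for pyproject in sorted(root.glob("openhands-*/pyproject.toml")):
        match = VERSION_LINE.search(pyproject.read_text(encoding="utf-8"))
        if match:
            versions[pyproject.parent.name] = match.group(1)
    return versions


def all_match(versions: dict[str, str], expected: str) -> bool:
    """True when at least one package was found and all declare `expected`."""
    return bool(versions) and all(v == expected for v in versions.values())


def demo() -> dict[str, str]:
    """Create two sample packages in a temp directory and scan them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("openhands-sdk", "openhands-tools"):
            pkg = root / name
            pkg.mkdir()
            (pkg / "pyproject.toml").write_text(
                f'[project]\nname = "{name}"\nversion = "1.22.0"\n',
                encoding="utf-8",
            )
        return collect_versions(root)


if __name__ == "__main__":
    found = demo()
    print(found, all_match(found, "1.22.0"))
```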
Step 2 — Verify lockfile consistency:
Ran:
grep -A 2 "^name = \"openhands-" uv.lock | grep -E "(^name|^version)"
Output:
name = "openhands-agent-server"
version = "1.22.0"
name = "openhands-sdk"
version = "1.22.0"
name = "openhands-tools"
version = "1.22.0"
name = "openhands-workspace"
version = "1.22.0"
This confirms uv.lock matches the pyproject.toml versions.
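The lockfile comparison could also be scripted. The sketch below assumes, as the grep output suggests, that each package entry in uv.lock has a `name` line immediately followed by its `version` line. The sample text and the helper names are hypothetical.

```python
import re

# Assumption: `version` directly follows `name` within each package entry.
PACKAGE = re.compile(
    r'^name = "(openhands-[^"]+)"\nversion = "([^"]+)"', re.MULTILINE
)

# Hypothetical excerpt in the shape of the grep output above.
SAMPLE_LOCK = """\
[[package]]
name = "openhands-sdk"
version = "1.22.0"

[[package]]
name = "openhands-tools"
version = "1.22.0"

[[package]]
name = "requests"
version = "2.0.0"
"""


def lock_versions(lock_text: str) -> dict[str, str]:
    """Return {package name: version} for openhands-* entries only."""
    return dict(PACKAGE.findall(lock_text))


def mismatches(versions: dict[str, str], expected: str) -> dict[str, str]:
    """Return the entries whose locked version differs from `expected`."""
    return {name: v for name, v in versions.items() if v != expected}


if __name__ == "__main__":
    locked = lock_versions(SAMPLE_LOCK)
    print(locked)
    print("mismatches:", mismatches(locked, "1.22.0"))
```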
Step 3 — Verify eval workflow default:
Ran:
grep -A 3 "sdk_ref:" .github/workflows/run-eval.yml | grep "default:"
Output:
default: v1.22.0
This confirms the eval workflow default was updated from v1.21.1 to v1.22.0.
Step 4 — Build and install packages:
Ran:
uv sync --frozen
Result: All 4 packages built successfully.
Step 5 — Verify installed versions:
Ran:
import importlib.metadata
for pkg in ['openhands-sdk', 'openhands-tools', 'openhands-workspace', 'openhands-agent-server']:
    print(f'{pkg}: {importlib.metadata.version(pkg)}')
Output:
openhands-sdk: 1.22.0
openhands-tools: 1.22.0
openhands-workspace: 1.22.0
openhands-agent-server: 1.22.0
This confirms all installed packages report version 1.22.0.
Step 6 — Smoke test basic functionality:
Ran:
from openhands.sdk import Agent, LLM, Tool, Conversation
from openhands.tools.terminal import TerminalTool
from openhands.tools.file_editor import FileEditorTool
print("✓ Imports successful")
print(f"✓ TerminalTool available: {TerminalTool.name}")
print(f"✓ FileEditorTool available: {FileEditorTool.name}")
Output:
✓ Imports successful
✓ TerminalTool available: terminal
✓ FileEditorTool available: file_editor
✓ All basic functionality verified
This confirms the packages work correctly after the version bump.
CI Check Failure Detail
Failed Check: Deprecation Verification
The check / Verify deprecation removals CI check is failing with the following deprecations that have reached their removal deadline:
- [openhands-sdk] 'AgentSettings' (warn_call)
deprecated in: 1.17.0
removed in: 1.22.0
defined at: openhands-sdk/openhands/sdk/settings/model.py:1296
- [openhands-sdk] 'VerificationSettings.confirmation_mode' (warn_call)
deprecated in: 1.17.0
removed in: 1.22.0
defined at: openhands-sdk/openhands/sdk/settings/model.py:270
- [openhands-sdk] 'VerificationSettings.security_analyzer' (warn_call)
deprecated in: 1.17.0
removed in: 1.22.0
defined at: openhands-sdk/openhands/sdk/settings/model.py:284
These deprecations must be removed before releasing 1.22.0 per the SDK's deprecation policy.
Workflow URL: https://github.com/OpenHands/software-agent-sdk/actions/runs/25679136103/job/75385452790
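The rule this check enforces can be illustrated with a small sketch: a deprecation is overdue once the version being released has reached its removal version. The `Deprecation` record and the `overdue` helper are hypothetical and are not taken from check_deprecations.py. The three entries are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deprecation:
    name: str
    deprecated_in: str
    removed_in: str


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.22.0' into (1, 22, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def overdue(deprecations: list[Deprecation], current: str) -> list[Deprecation]:
    """Return the deprecations whose removal version is <= the current version."""
    now = version_key(current)
    return [d for d in deprecations if version_key(d.removed_in) <= now]


# Entries copied from the failing check output above.
REPORTED = [
    Deprecation("AgentSettings", "1.17.0", "1.22.0"),
    Deprecation("VerificationSettings.confirmation_mode", "1.17.0", "1.22.0"),
    Deprecation("VerificationSettings.security_analyzer", "1.17.0", "1.22.0"),
]

if __name__ == "__main__":
    for dep in overdue(REPORTED, "1.22.0"):
        print(f"overdue: {dep.name} (removed in {dep.removed_in})")
```

With these entries, nothing is overdue at 1.21.1 and all three are overdue at 1.22.0, which is consistent with the check starting to fail on this release PR.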
Issues Found
- 🔴 Blocker: Three deprecated features have reached their removal deadline in 1.22.0 but have not been removed, causing the deprecation check to fail. These must be fixed before release:
  - AgentSettings at openhands-sdk/openhands/sdk/settings/model.py:1296
  - VerificationSettings.confirmation_mode at openhands-sdk/openhands/sdk/settings/model.py:270
  - VerificationSettings.security_analyzer at openhands-sdk/openhands/sdk/settings/model.py:284
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 24.9s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 19.9s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 12.3s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 31.6s | $0.03 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 13.0s | $0.02 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 37.3s | $0.03 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 31.7s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 8.7s | $0.00 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 38.8s | $0.03 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 6s | $0.15 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 22.3s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 21.6s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 15.2s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 18.3s | $0.03 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 10.2s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 14.6s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 48s | $0.02 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 4m 44s | $0.35 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 53.4s | $0.06 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 16.9s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 32.1s | $0.02 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 37.5s | $0.02 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 8.3s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 2m 20s | $0.16 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 19.5s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 2m 47s | $0.23 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 9.8s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 5.2s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 34.0s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 9.6s | $0.00 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 29.6s | $0.32 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 24.7s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 53.2s | $0.06 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 6.4s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 7.8s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 3m 2s | $0.44 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 10.6s | $0.01 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 2.8s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 14.2s | $0.00 |
| 01_standalone_sdk/49_switch_llm_tool.py | ✅ PASS | 11.2s | $0.03 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 37.3s | $0.02 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 2m 0s | $0.04 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 5s | $0.07 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 58s | $0.03 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 25.5s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 41s | $0.03 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 1m 23s | $0.13 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 48.1s | $0.12 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 35.9s | $0.00 |
| 02_remote_agent_server/12_settings_and_secrets_api.py | ✅ PASS | 2m 9s | $0.02 |
| 02_remote_agent_server/13_workspace_get_llm.py | ✅ PASS | 11.0s | $0.01 |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 25.1s | $0.02 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 43.2s | $0.07 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 16.4s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 20.5s | $0.02 |
✅ All tests passed!
Total: 55 | Passed: 55 | Failed: 0 | Total Cost: $2.95
🧪 Integration Tests Results
Overall Success Rate: 95.0%
📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results
litellm_proxy_moonshot_kimi_k2.6
litellm_proxy_gemini_3.1_pro_preview
litellm_proxy_anthropic_claude_sonnet_4_6
Failed Tests:
However, the agent violated the explicit evaluation criteria by creating an unauthorized file: The AGENTS.md file:
The agent even explicitly acknowledged creating this file in the final summary: "An
While the content of AGENTS.md is potentially useful, its creation directly contradicts the stated evaluation criteria. The violation is clear and unambiguous, though the severity is moderate given that the primary deliverable (the training script) is high quality and properly implements the requested functionality. (confidence=0.75) (Cost: $1.70)
litellm_proxy_deepseek_deepseek_v4_flash
🧪 Integration Tests Results
Overall Success Rate: 97.1%
📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results
litellm_proxy_moonshot_kimi_k2.6
litellm_proxy_gemini_3.1_pro_preview
litellm_proxy_anthropic_claude_sonnet_4_6
Failed Tests:
litellm_proxy_deepseek_deepseek_v4_flash
Skipped Tests:
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.
I just brought this release up to date with the latest main, so we are rerunning the integration tests and example tests. Once they are all passing, we can get it merged.
🧪 Integration Tests Results
Overall Success Rate: 97.1%
📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results
litellm_proxy_moonshot_kimi_k2.6
litellm_proxy_gemini_3.1_pro_preview
litellm_proxy_anthropic_claude_sonnet_4_6
Failed Tests:
litellm_proxy_deepseek_deepseek_v4_flash
Skipped Tests:
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 23.1s | $0.02 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 21.2s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 12.9s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 26.8s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 11.1s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 32.4s | $0.03 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 24.2s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 9.4s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 55.4s | $0.06 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 3m 11s | $0.14 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 20.3s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 20.0s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 18.7s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 14.6s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 9.6s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 12.9s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 42s | $0.02 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 5m 16s | $0.41 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 10s | $0.08 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 19.0s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ❌ FAIL Exit code 1 | 10.2s | -- |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 36.1s | $0.02 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 8.9s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 4m 41s | $0.33 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 24.7s | $0.04 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 1m 21s | $0.10 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 11.0s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 3.8s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 33.4s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 10.7s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 51.3s | $0.32 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 27.7s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 1m 31s | $0.09 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 6.5s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 8.0s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 3m 13s | $0.49 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 8.7s | $0.00 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.3s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 12.9s | $0.00 |
| 01_standalone_sdk/49_switch_llm_tool.py | ✅ PASS | 7.9s | $0.03 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 37.5s | $0.03 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 21s | $0.04 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 15s | $0.06 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 33s | $0.04 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 24.5s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 23s | $0.02 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 53.8s | $0.13 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 37.6s | $0.07 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 36.4s | $0.00 |
| 02_remote_agent_server/12_settings_and_secrets_api.py | ✅ PASS | 2m 11s | $0.02 |
| 02_remote_agent_server/13_workspace_get_llm.py | ✅ PASS | 19.4s | $0.01 |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 46.3s | $0.04 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 43.7s | $0.05 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 16.9s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 22.8s | $0.02 |
❌ Some tests failed
Total: 55 | Passed: 54 | Failed: 1 | Total Cost: $3.10
Failed examples:
- examples/01_standalone_sdk/28_ask_agent_example.py: Exit code 1
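A summary line like the one above could be tallied from the table rows with a short parser. This is a sketch under stated assumptions, not the script that produced the report: it expects four-column data rows in the Example / Status / Duration / Cost layout, treats a cost of `--` as zero, and uses hypothetical function names. The sample rows are copied from this run.

```python
import re

# Durations appear either as "23.1s" or as "3m 11s".
DURATION = re.compile(r"(?:(\d+)m\s*)?(\d+(?:\.\d+)?)s")


def parse_duration(text: str) -> float:
    """Convert a duration cell to seconds."""
    match = DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1) or 0) * 60 + float(match.group(2))


def parse_cost(text: str) -> float:
    """Convert a cost cell such as '$0.02' to a float; '--' counts as 0."""
    text = text.strip()
    return float(text.lstrip("$")) if text.startswith("$") else 0.0


def summarize(rows: list[str]) -> dict[str, float]:
    """Tally pass/fail counts, total cost, and total seconds for data rows."""
    summary = {"total": 0, "passed": 0, "failed": 0, "cost": 0.0, "seconds": 0.0}
    for row in rows:
        cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue  # skip anything that is not a four-column row
        _name, status, duration, cost = cells
        summary["total"] += 1
        summary["passed" if "PASS" in status else "failed"] += 1
        summary["cost"] += parse_cost(cost)
        summary["seconds"] += parse_duration(duration)
    return summary


SAMPLE_ROWS = [
    "| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 23.1s | $0.02 |",
    "| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 3m 11s | $0.14 |",
    "| 01_standalone_sdk/28_ask_agent_example.py | ❌ FAIL Exit code 1 | 10.2s | -- |",
]

if __name__ == "__main__":
    print(summarize(SAMPLE_ROWS))
```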
Release v1.22.0
This PR prepares the release for version 1.22.0.
Release Checklist
integration-test)
behavior-test)
test-examples)
What happens on merge
When this PR is merged, the create-release.yml workflow will automatically:
- v1.22.0 and auto-generated notes
- pypi-release.yml to publish all packages to PyPI
- version-bump-prs.yml to create downstream version bump PRs
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdk
nikolaik/python-nodejs:python3.13-nodejs22-slim
golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:d13ec0a-python
Run
All tags pushed for this build
About Multi-Architecture Support
d13ec0a-python) is a multi-arch manifest supporting both amd64 and arm64
d13ec0a-python-amd64) are also available if needed
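The tag naming implied by these examples can be sketched as follows. The scheme `<short sha>-<variant>` for the manifest and `<short sha>-<variant>-<arch>` for the per-architecture images is inferred from the two tags quoted above. The arm64 tag is an assumption by analogy with the amd64 one, and the helper names are hypothetical.

```python
REGISTRY_IMAGE = "ghcr.io/openhands/agent-server"  # from the pull command above
ARCHES = ("amd64", "arm64")  # architectures named in this comment


def manifest_tag(short_sha: str, variant: str) -> str:
    """Tag of the multi-arch manifest, e.g. 'd13ec0a-python'."""
    return f"{short_sha}-{variant}"


def arch_tags(short_sha: str, variant: str, arches=ARCHES) -> list[str]:
    """Per-architecture tags; the arm64 form is assumed by analogy."""
    base = manifest_tag(short_sha, variant)
    return [f"{base}-{arch}" for arch in arches]


def pull_command(tag: str) -> str:
    """Build the docker pull command for a given tag."""
    return f"docker pull {REGISTRY_IMAGE}:{tag}"


if __name__ == "__main__":
    print(pull_command(manifest_tag("d13ec0a", "python")))
    for tag in arch_tags("d13ec0a", "python"):
        print(pull_command(tag))
```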