Add LLM profile switch tool #3188
Conversation
Python API breakage checks — ✅ PASSED
REST API breakage checks (OpenAPI) — ✅ PASSED
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot left a comment:
Clean implementation of an optional LLM profile switching tool. Follows existing patterns (similar to InvokeSkillTool), has good test coverage, and includes clear error handling.
Risk Assessment (Overall PR): 🟢 LOW
Adds an optional built-in tool without modifying existing behavior. Well-tested with focused unit tests covering profile listing, successful switching, and error cases. No eval-risk concerns since this is an opt-in tool that must be explicitly enabled via `include_default_tools=["SwitchLLMTool"]`.
all-hands-bot left a comment:
✅ QA Report: PASS
SwitchLLMTool successfully enables agents to switch between saved LLM profiles during conversation execution, with proper error handling and state persistence.
Does this PR achieve its stated goal?
Yes. The PR set out to "add an optional built-in SwitchLLMTool that lets an agent switch the conversation to a saved LLM profile." The implementation delivers exactly this:
- Tool creation and registration: `SwitchLLMTool` is correctly registered in `BUILT_IN_TOOL_CLASSES` and can be instantiated via `include_default_tools=["SwitchLLMTool"]`.
- Profile switching: The tool successfully switches the conversation's LLM from one saved profile to another, updating both `conversation.agent.llm.model` and `conversation.state.agent.llm.model`.
- Profile discovery: The tool description dynamically lists all available profiles from `LLMProfileStore`, making them visible to the agent.
- Error handling: Missing profiles are caught and reported without crashing or leaving the conversation in an invalid state.
- Multiple switches: Sequential profile switches work correctly, allowing an agent to change models multiple times during a single conversation.
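The switching and error-handling behavior described above can be sketched with a minimal, self-contained stand-in. The dataclasses, `PROFILE_STORE` dict, and `switch_llm` function below are illustrative assumptions, not the SDK's actual types:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the SDK's agent/observation types.
@dataclass
class Observation:
    text: str
    is_error: bool = False

@dataclass
class Agent:
    model: str

# In the real tool, profiles come from an LLMProfileStore on disk.
PROFILE_STORE = {"fast": "gpt-4o-mini", "powerful": "claude-3-5-sonnet-20241022"}

def switch_llm(agent: Agent, profile_name: str) -> Observation:
    """Switch the agent to a saved profile, or report an error without mutating state."""
    model = PROFILE_STORE.get(profile_name)
    if model is None:
        # Missing profile: return an error observation, leave the agent unchanged.
        return Observation(f"LLM profile '{profile_name}' was not found.", is_error=True)
    agent.model = model  # future agent steps use the new model
    return Observation(
        f"Switched LLM profile to '{profile_name}'. Future agent steps will use this profile."
    )
```

The key design point is that a missing profile produces an error observation rather than an exception, so the conversation keeps running on its previous model.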
Evidence: Created three test LLM profiles (fast, slow, powerful), used SwitchLLMTool to switch from default model (gpt-4o-mini) → powerful (claude-3-5-sonnet-20241022) → fast (gpt-4o-mini). Each switch updated the active model correctly. Attempting to switch to a non-existent profile returned an error observation without changing the model.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Dependencies installed, project builds successfully |
| CI Status | ✅ All core checks pass (sdk-tests, tools-tests, pre-commit, coverage-report) |
| Functional Verification | ✅ Tool switches profiles, lists available profiles, handles errors correctly |
Functional Verification
Test 1: Tool Registration and Discovery
Verification:
Confirmed `SwitchLLMTool` is registered in `BUILT_IN_TOOL_CLASSES` and can be instantiated:

```python
from openhands.sdk.tool.builtins import BUILT_IN_TOOL_CLASSES

print("SwitchLLMTool" in BUILT_IN_TOOL_CLASSES)  # True
print(BUILT_IN_TOOL_CLASSES.get("SwitchLLMTool"))  # <class '...SwitchLLMTool'>
```

Result: ✓ Tool is correctly registered and discoverable via `include_default_tools`.
Test 2: Profile Listing in Tool Description
Setup:
Created three LLM profiles in a temporary profile store:
- `fast.json` (model: gpt-4o-mini)
- `slow.json` (model: gpt-4o)
- `powerful.json` (model: claude-3-5-sonnet-20241022)
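A store like this can be set up on disk along the following lines; the one-JSON-file-per-profile layout and field names are assumptions for illustration, not the SDK's actual `LLMProfileStore` format:

```python
import json
import tempfile
from pathlib import Path

# Assumed layout: one <profile>.json file per profile in the store directory.
profiles = {
    "fast": {"model": "gpt-4o-mini"},
    "slow": {"model": "gpt-4o"},
    "powerful": {"model": "claude-3-5-sonnet-20241022"},
}

store_dir = Path(tempfile.mkdtemp())
for name, config in profiles.items():
    (store_dir / f"{name}.json").write_text(json.dumps(config))

# Profile names are the file stems, listed in sorted order.
print(sorted(p.stem for p in store_dir.glob("*.json")))  # ['fast', 'powerful', 'slow']
```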
Verification:
Called `SwitchLLMTool.create()` and inspected the tool description:

```
Available LLM profiles:
- fast
- powerful
- slow
```

Result: ✓ Tool description correctly lists all available profiles in sorted order.
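Rendering that section of the description is straightforward; a minimal sketch (the helper name `build_description` is hypothetical, not the PR's actual code):

```python
def build_description(profile_names):
    """Render the 'Available LLM profiles' section of the tool description."""
    lines = ["Available LLM profiles:"]
    # Sorting keeps the listing deterministic regardless of filesystem order.
    lines += [f"- {name}" for name in sorted(profile_names)]
    return "\n".join(lines)

print(build_description(["fast", "slow", "powerful"]))
```

Listing the profiles directly in the description is what makes them discoverable to the agent without an extra tool call.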
Test 3: Successful Profile Switch
Baseline (before switch):
Created a conversation with the default model:

```python
default_llm = TestLLM.from_messages([], model="gpt-4o-mini", usage_id="default")
agent = Agent(llm=default_llm, tools=[], include_default_tools=["SwitchLLMTool"])
conversation = LocalConversation(agent=agent, workspace=Path.cwd())
print(conversation.agent.llm.model)  # Output: gpt-4o-mini
```

This confirms the conversation starts with the default model.
Action:
Executed the SwitchLLMTool to switch to the "powerful" profile:

```python
observation = conversation.execute_tool(
    "switch_llm",
    SwitchLLMAction(profile_name="powerful", reason="Need more powerful model")
)
```

Result (after switch):

```
Observation text: Switched LLM profile to 'powerful'. Future agent steps will use this profile.
Is error: False
Profile name: powerful
Active model: claude-3-5-sonnet-20241022
Current conversation model: claude-3-5-sonnet-20241022
```
Verified both the agent's LLM and the conversation state were updated:

```python
assert conversation.agent.llm.model == "claude-3-5-sonnet-20241022"
assert conversation.state.agent.llm.model == "claude-3-5-sonnet-20241022"
```

Interpretation: The switch from gpt-4o-mini to claude-3-5-sonnet-20241022 was successful. Both the runtime agent and the persisted conversation state reflect the new model.
Test 4: Error Handling for Missing Profile
Setup:
Conversation is currently using the "powerful" profile (claude-3-5-sonnet-20241022).
Action:
Attempted to switch to a non-existent profile:

```python
error_observation = conversation.execute_tool(
    "switch_llm",
    SwitchLLMAction(profile_name="nonexistent", reason="Testing error handling")
)
```

Result:

```
Observation text: LLM profile 'nonexistent' was not found.
Is error: True
Current model (should be unchanged): claude-3-5-sonnet-20241022
```
Verified the model remained unchanged:

```python
assert conversation.agent.llm.model == "claude-3-5-sonnet-20241022"
assert conversation.state.agent.llm.model == "claude-3-5-sonnet-20241022"
```

Interpretation: The tool correctly handles missing profiles by returning an error observation without modifying the conversation state. The agent continues using the previous model.
Test 5: Multiple Sequential Switches
Setup:
Conversation is using the "powerful" profile.
Action:
Switched to the "fast" profile:

```python
observation2 = conversation.execute_tool(
    "switch_llm",
    SwitchLLMAction(profile_name="fast", reason="Switching to faster model")
)
```

Result:

```
Observation text: Switched LLM profile to 'fast'. Future agent steps will use this profile.
Current model: gpt-4o-mini
```
Verified the second switch succeeded:

```python
assert conversation.agent.llm.model == "gpt-4o-mini"
```

Interpretation: Multiple profile switches work correctly. The conversation successfully transitioned from default → powerful → fast without issues.
Test 6: Visualization Methods
Verification:
Tested the `visualize` property on both `SwitchLLMAction` and `SwitchLLMObservation`:

- Action visualization: `Switch LLM profile: gpt-4o` / `Reason: Need more powerful model for complex reasoning`
- Success observation visualization: `Switched LLM profile: fast-model (gpt-4o-mini)`
- Error observation visualization: `Failed to switch LLM profile: nonexistent`

Result: ✓ All visualization methods produce correctly formatted Rich Text objects with appropriate styling.
Issues Found
None.
This QA report was created by an AI agent (OpenHands) on behalf of the user.
@OpenHands address review comments and then merge this PR.

I'm on it! neubig can track my progress at all-hands.dev

OpenHands encountered an error: Request timeout after 30 seconds to https://xielshjxxiiokogz.prod-runtime.all-hands.dev/api/conversations/605c645c-2dea-471e-92fc-e41c1996498e/ask_agent See the conversation for more information.
Summary
- Adds an optional built-in `SwitchLLMTool` that lets an agent switch the conversation to a saved LLM profile.
- The example calls `switch_llm` to move to Claude, and confirms the active model changed.

Validation
- `uv run pytest tests/sdk/tool/test_switch_llm.py tests/sdk/conversation/test_switch_model.py -q`
- `uv run pytest tests/sdk/tool/test_builtins.py tests/sdk/agent/test_agent_tool_init.py -q`
- `env -u LMNR_PROJECT_API_KEY -u LMNR_BASE_URL -u LMNR_FORCE_HTTP uv run pytest tests/sdk/tool/test_switch_llm.py -q`
- `env -u LMNR_PROJECT_API_KEY -u LMNR_BASE_URL -u LMNR_FORCE_HTTP uv run pre-commit run --files openhands-sdk/openhands/sdk/tool/builtins/switch_llm.py tests/sdk/tool/test_switch_llm.py examples/01_standalone_sdk/49_switch_llm_tool.py`
- `env -u LMNR_PROJECT_API_KEY -u LMNR_BASE_URL -u LMNR_FORCE_HTTP OPENHANDS_SUPPRESS_BANNER=1 LLM_API_KEY=... LLM_BASE_URL=https://llm-proxy.app.all-hands.dev uv run python examples/01_standalone_sdk/49_switch_llm_tool.py`

The example run started on `openai/gpt-5`, called `switch_llm` with `profile_name='example-claude'`, switched to `openai/prod/claude-sonnet-4-5-20250929`, and reported that active model. `EXAMPLE_COST: 0.034233`.

This pull request was updated by an AI agent (OpenHands) on behalf of the user.
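The `env -u` prefix in the validation commands above unsets the Laminar tracing variables for just that one command, without touching the parent shell. A minimal illustration of the mechanism (`DEMO_VAR` is a placeholder, not one of the real variables):

```shell
# `env -u VAR cmd` runs cmd with VAR removed from its environment.
export DEMO_VAR=set
result=$(env -u DEMO_VAR sh -c 'echo "${DEMO_VAR:-unset}"')
echo "$result"  # prints "unset": the child process never saw DEMO_VAR
echo "$DEMO_VAR"  # still "set" in the parent shell
```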
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm

Pull (multi-arch manifest)

```
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:944e4c9-python
```

All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. `944e4c9-python`) is a multi-arch manifest supporting both amd64 and arm64.
- Architecture-specific tags (e.g. `944e4c9-python-amd64`) are also available if needed.