chore(sdk): drop variable-sized payloads from info logs #3217
Strip full state/agent model dumps and tool / skill / MCP-server name lists out of info logs. The accompanying counts are still logged, so 'how many X were loaded' remains visible without dumping the contents.

- conversation/state: 'Resumed conversation <id>' and 'Created new conversation <id>' no longer append `state.model_dump(...)` and `agent.model_dump_succint()`; those dump the entire ConversationState (including workspace, persistence info, etc.) and the agent config across many wrapped log lines on startup.
- mcp/utils: 'Created N MCP tools' no longer enumerates every tool name.
- agent/base: 'Loaded N tools from spec' / 'Filtered to N tools after applying regex filter' no longer enumerate every tool name.
- skills/skill: 'Loaded N public skills' no longer enumerates every skill name.
- plugin/plugin: 'Loaded MCP config from <path> with N server(s)' no longer enumerates every server name.

Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
all-hands-bot
left a comment
LGTM! 👍 Clean logging improvement that makes SDK logs more readable while preserving essential information (counts and IDs).
…fo logs

Codify the policy uncovered while toning down agent-server / SDK info logs (PRs #3216, #3217). Reviewers should flag `logger.info(...)` calls that interpolate `model_dump(...)`, `.json()`, `to_dict()`, lists, dicts, or other unbounded values; those belong in `logger.debug(...)`.

Adds:
- A "Logging Hygiene" subsection under SDK Architecture Conventions with concrete bad/good examples drawn from real cases in this repo.
- A "What to Check" bullet pointing at the new section.

Co-authored-by: openhands <openhands@all-hands.dev>
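The review rule described in this commit could, in principle, be approximated mechanically. The following is a rough sketch only, not something this PR adds: the function name `find_unbounded_info_logs` and the `UNBOUNDED_METHODS` set are hypothetical, and the heuristic flags any `.info(...)` call whose arguments contain a call to one of the dump-style methods named in the policy. It does not detect plain lists or dicts being interpolated, and it cannot tell a logger from any other object with an `info` method.

```python
import ast

# Method names treated as unbounded-payload producers (taken from the policy text).
UNBOUNDED_METHODS = {"model_dump", "json", "to_dict"}


def _calls_unbounded_method(node: ast.AST) -> bool:
    """True if any call inside `node` looks like `<obj>.model_dump(...)` etc."""
    return any(
        isinstance(inner, ast.Call)
        and isinstance(inner.func, ast.Attribute)
        and inner.func.attr in UNBOUNDED_METHODS
        for inner in ast.walk(node)
    )


def find_unbounded_info_logs(source: str) -> list[int]:
    """Return sorted line numbers of `<obj>.info(...)` calls that embed a dump."""
    flagged = set()
    for node in ast.walk(ast.parse(source)):
        is_info_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "info"
        )
        if not is_info_call:
            continue
        # Inspect positional and keyword arguments, including f-string parts.
        arguments = [*node.args, *(kw.value for kw in node.keywords)]
        if any(_calls_unbounded_method(arg) for arg in arguments):
            flagged.add(node.lineno)
    return sorted(flagged)
```

Because the check works on the syntax tree, it also catches dumps hidden inside f-strings, while leaving `logger.debug(...)` calls alone.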
all-hands-bot
left a comment
✅ QA Report: PASS
Verified that log output has been cleaned up as promised — verbose multi-line dumps replaced with concise single-line logs preserving essential counts and IDs.
Does this PR achieve its stated goal?
Yes. The PR successfully strips variable-length payloads from info logs across all modified files. Running actual SDK code confirms that conversation creation logs dropped from ~15+ lines of state/agent dumps to a single line with just the conversation ID, and tool loading logs now show counts instead of enumerating every tool name. All functionality remains intact — conversations create correctly, agents initialize properly, and state persists as expected.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build succeeded, SDK packages installed |
| CI Status | ✅ All tests passing (sdk-tests, agent-server-tests, pre-commit, windows-tests) |
| Functional Verification | ✅ Logs cleaned up as claimed, SDK functionality preserved |
Functional Verification
Test 1: Conversation Creation Logs (state.py)
Step 1 — Baseline (main branch):
Ran test script that creates a new conversation:
conversation = Conversation(agent=agent, workspace=workspace_dir, persistence_dir=persist_dir)

Log output on main:
INFO Created new conversation da401713-b79d-46f5-8502-0ed4f4fb70ea
State: {'id': UUID('da401713-b79d-46f5-8502-0ed4f4fb70ea'), 'workspace': {'working_dir': '/tmp/tmplli1ju0i/workspace', 'kind': 'LocalWorkspace'}, 'persistence_dir': '/tmp/tmplli1ju0i/persistence1/da401713b79d46f585020ed4f4fb70ea', 'max_iterations': 500, 'stuck_detection': True, 'execution_status': <ConversationExecutionStatus.IDLE: 'idle'>, 'confirmation_policy': {'kind': 'NeverConfirm'}, 'security_analyzer': None, 'activated_knowledge_skills': [], 'invoked_skills': [], 'blocked_actions': {}, 'blocked_messages': {}, 'last_user_message_id': None, 'stats': {'usage_to_metrics': {}}, 'secret_registry': {'secret_sources': {}}, 'tags': {}, 'agent_state': {}, 'hook_config': None}
Agent: {'llm': {'model': 'anthropic/claude-sonnet-4-5-20250929', 'api_key': SecretStr('**********'), 'openrouter_site_url': 'https://docs.all-hands.dev/', 'openrouter_app_name': 'OpenHands', 'num_retries': 5, 'retry_multiplier': 8.0, 'retry_min_wait': 8, 'retry_max_wait': 64, 'timeout': 300, 'max_message_chars': 30000, 'max_input_tokens': 200000, 'max_output_tokens': 64000, 'stream': False, 'drop_params': True, 'modify_params': True, 'disable_stop_word': False, 'caching_prompt': True, 'log_completions': False, 'log_completions_folder': 'logs/completions', 'native_tool_calling': True, 'reasoning_effort': 'high', 'enable_encrypted_reasoning': True, 'prompt_cache_retention': '24h', 'extended_thinking_budget': 200000, 'usage_id': 'default', 'litellm_extra_body': {}}, 'tools': [{'name': 'terminal', 'params': {}}, {'name': 'file_editor', 'params': {}}, {'name': 'task_tracker', 'params': {}}], 'include_default_tools': ['FinishTool', 'ThinkTool'], 'system_prompt_filename': 'system_prompt.j2', 'security_policy_filename': 'security_policy.j2', 'system_prompt_kwargs': {'llm_security_analyzer': True}, 'tool_concurrency_limit': 1, 'kind': 'Agent'}
Interpretation: The log spans multiple lines with complete state and agent dictionaries — hundreds of characters of data that grows with configuration complexity. This confirms the verbosity problem the PR aims to fix.
Step 2 — Apply PR changes:
Checked out PR branch drop-state-and-tools-info-dumps and reinstalled SDK packages.
Step 3 — Re-run with the fix:
Ran the same conversation creation code:
Log output on PR branch:
INFO Created new conversation 8cc96413-fe79-490f-870e-af8688f5b446
Interpretation: The log is now a single concise line showing only the essential information (conversation ID). The state and agent dumps are completely removed, reducing log noise by >90% while preserving the key identifier needed for debugging.
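For readers who still want the full dump while debugging, one common pattern is to keep the identifier at info level and move the payload to debug level. This is a minimal sketch of that split, not the code in `state.py` (the PR simply removes the dump). `FakeState`, `log_created`, and the logger name are hypothetical stand-ins.

```python
import logging
from dataclasses import asdict, dataclass, field

logger = logging.getLogger("example.conversation")  # hypothetical logger name


@dataclass
class FakeState:
    """Hypothetical stand-in for a large, configuration-dependent state object."""

    id: str
    max_iterations: int = 500
    tags: dict = field(default_factory=dict)


def log_created(state: FakeState) -> None:
    # Bounded: the info line carries only the identifier.
    logger.info("Created new conversation %s", state.id)
    # Unbounded: the dump goes to debug and is only built when debug is enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("State: %s", asdict(state))
```

The `isEnabledFor` guard avoids serializing the state at all when the logger runs at info level, which is the default case the PR is concerned with.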
Test 2: Tool Loading Logs (agent/base.py)
Step 1 — Baseline (main branch):
Ran test that initializes an agent with multiple tools and a regex filter:
agent = Agent(llm=llm, tools=[...], filter_tools_regex=r"terminal_.*")
conversation.agent.init_state(conversation.state, lambda x: None)

Log output on main:
openhands.sdk.agent.base INFO Loaded 3 tools from spec: ['terminal', 'file_editor', 'task_tracker']
openhands.sdk.agent.base INFO Filtered to 0 tools after applying regex filter: []
Interpretation: Tool names are enumerated in the log, which becomes verbose when agents have many tools (imagine 20+ tools from MCP servers). This confirms the variable-length payload issue.
Step 2 — Apply PR changes:
Switched to PR branch and reinstalled.
Step 3 — Re-run with the fix:
Log output on PR branch:
openhands.sdk.agent.base INFO Loaded 3 tools from spec
openhands.sdk.agent.base INFO Filtered to 0 tools after applying regex filter
Interpretation: Tool names are no longer listed — only the count is shown. The log remains bounded regardless of how many tools are loaded, exactly as intended.
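The "bounded regardless of how many tools are loaded" property can be illustrated with the standard library alone. This sketch does not use the SDK; `ListHandler`, `capture_tool_log`, and the logger name are hypothetical, and only the message text mirrors the log line shown above.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def capture_tool_log(tool_names: list[str]) -> str:
    """Emit the count-only log line for `tool_names` and return its text."""
    logger = logging.getLogger("example.bounded")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example's output out of the root logger
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        # Only the count is interpolated, never the names themselves.
        logger.info("Loaded %d tools from spec", len(tool_names))
    finally:
        logger.removeHandler(handler)
    return handler.messages[0]


few = capture_tool_log(["terminal", "file_editor", "task_tracker"])
many = capture_tool_log([f"tool_{i}" for i in range(50)])
```

Going from 3 tools to 50 changes the message by a single digit, whereas a line that enumerated the names would grow with every tool added.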
Test 3: Functionality Preservation
Verification approach: Created a test script that exercises core SDK functionality affected by the log changes:
# Create conversation
conversation = Conversation(agent=agent, workspace=workspace_dir, persistence_dir=persist_dir)
assert conversation.id is not None
# Initialize agent (triggers tool loading)
conversation.agent.init_state(conversation.state, lambda x: None)
# Verify state accessibility
assert conversation.state.id == conversation.id

Result:
✓ Creating conversation...
✓ Conversation created with ID: 32d4eebb-1e27-4f11-b20d-178304f3d3c4
✓ Initializing agent...
✓ Agent initialized successfully
✓ State accessible
✅ All functionality tests passed!
Interpretation: All core functionality works correctly. The log changes are purely cosmetic — they removed verbose output without affecting any runtime behavior. Conversations create successfully, agents initialize with tools properly, and state management works as expected.
Issues Found
None.
Co-authored-by: openhands <openhands@all-hands.dev>
Follow-up to #3216. Same idea (strip variable-length payloads out of info logs), but for the SDK side. Some of these (notably `state.py`) produce log entries that wrap across ~15+ lines on startup because they dump the entire `ConversationState` and agent config via `model_dump(...)`. After this PR the lines stay short and bounded; counts and IDs are still logged.

Before (representative)
After
Changes
- `conversation/state.py`: `Resumed conversation <id>` and `Created new conversation <id>` no longer append `state.model_dump(...)` and `agent.model_dump_succint()`.
- `mcp/utils.py`: `Created N MCP tools` no longer enumerates every tool name.
- `agent/base.py`: `Loaded N tools from spec` and `Filtered to N tools after applying regex filter` no longer enumerate every tool name.
- `skills/skill.py`: `Loaded N public skills` no longer enumerates every skill name.
- `plugin/plugin.py`: `Loaded MCP config from <path> with N server(s)` no longer enumerates every server name.

Tests
All 1326 tests in `tests/sdk/{conversation,mcp,agent,skills,plugin}` pass. No tests assert on the stripped substrings.

This PR was opened by an AI agent (OpenHands) on behalf of the requester.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:11f9184-python

Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. `11f9184-python`) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. `11f9184-python-amd64`) are also available if needed