fix(blender): unblock send_command hang against persistent-connection server (#1022)#1026
Open
kovtcharov wants to merge 3 commits intomainfrom
Open
fix(blender): unblock send_command hang against persistent-connection server (#1022)#1026kovtcharov wants to merge 3 commits intomainfrom
kovtcharov wants to merge 3 commits intomainfrom
Conversation
… server (#1022) Client used a chunk-until-EOF recv loop, but the Blender addon's server keeps the connection open to accept further commands on the same socket. Both ends blocked on recv() forever after the first response. Switch to incremental JSON parsing (matching the server's own framing in blender_mcp_server.py:128-166) and add a 30s socket timeout so a misbehaving server can't hang the agent indefinitely and Ctrl-C remains responsive. Adds a socket-level regression test using a fake persistent-connection server, so this can be exercised in CI without a real Blender install.
3 tasks
…t timeouts Follow-up hardening from a deeper review of the #1022 fix: * Tolerate UnicodeDecodeError in the recv loop. A multi-byte UTF-8 character (e.g. a non-ASCII object name in an error message) split across two recv() boundaries was raising UnicodeDecodeError and bubbling out as a hard MCPError; now treated identically to JSONDecodeError — keep buffering. * Treat ConnectionResetError as "connection closed before complete response received" rather than letting it surface as a raw OSError. Some servers send RST instead of FIN when closing mid-response. * Bump the default per-recv socket timeout from 30s to 120s. Long Blender operations such as render/simulate via execute_code can sit silent for tens of seconds at a time; the original 30s was tight enough to misfire in legitimate workflows. Tests added: * test_send_command_assembles_chunked_response — splits a non-ASCII response across two send() calls with a delay, exercising the incremental-parse loop and the UnicodeDecodeError tolerance. * test_send_command_raises_on_premature_close — server closes after sending a partial JSON; client must surface a clear MCPError. Also tightened the connection-refused test to use a kernel-assigned free port instead of port 1, for CI portability.
2 tasks
Addresses items raised in a follow-up review of the original fix:
* Use json.JSONDecoder().raw_decode() instead of json.loads() so a
hypothetically pipelined server response ({"a":1}{"b":2}) yields the
first object instead of looping until timeout. The current server
sends one response per command, but raw_decode is cheap and removes
a foot-gun.
* Split UTF-8 decoding from JSON parsing into separate try blocks. The
prior single try caught UnicodeDecodeError but only because of the
combined except clause; this is clearer and more robust.
* Cap the response buffer at 16 MB. A misbehaving server that trickled
bytes just under each per-recv timeout would otherwise grow the
buffer without bound. 16 MB leaves enormous headroom for legitimate
execute_code stdout payloads.
* Broaden the close-mid-response catch to ConnectionResetError +
ConnectionAbortedError + BrokenPipeError. Previously only
ConnectionResetError was handled specifically; the others fell
through to the generic Exception handler with a less-actionable
message.
* Drop self.log.setLevel(logging.INFO) from MCPClient.__init__ — it
mutated the global module logger every time a client was constructed,
overriding the user's logging config.
* Bump execute_code's default timeout from 120s (inherited) to 600s
with a docstring explaining why — bpy.ops.render.render() and
similar can sit silent for many seconds without emitting data.
Test changes:
* test_send_command_assembles_chunked_response now actually splits
mid-emoji. The previous version split at offset 20, which landed in
pure ASCII; the UnicodeDecodeError path was uncovered. Compute the
split offset from the JSON-encoded payload (with ensure_ascii=False)
so the cut lands inside the 4-byte UTF-8 codepoint.
* New test_send_command_surfaces_enhanced_error_message — covers
_enhance_error_message, which had no prior test coverage.
* _ChunkedServer now serializes with ensure_ascii=False so multi-byte
payloads land on the wire as raw UTF-8 (matching the path the
UnicodeDecodeError tolerance defends against).
5/5 tests in tests/unit/mcp/test_blender_mcp_client.py pass; 123/123
in the broader scope; lint clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why this matters
Before: on v0.17.6, running
gaia blender ...against the Blender MCP addon hangs forever after the first tool call (e.g.clear_scene), and Ctrl-C does not interrupt it. Reported in #1022.After:
send_commandreads the response incrementally and returns as soon as a complete JSON document parses out of the buffer — matching the framing the server already uses. A 30 s socket timeout also keeps Ctrl-C responsive.The bug was a protocol mismatch between client and server: the client looped on
recv()until FIN, but the Blender addon keeps the connection open to accept further commands on the same socket, so FIN never came. Both ends ended up blocked onrecv()waiting for each other.Adds
tests/unit/mcp/test_blender_mcp_client.py— a socket-level regression test using a fake persistent-connection server. Runs in CI without a Blender install.Fixes #1022. Plan for the broader Blender modernization is tracked in #1025.
Test plan