fix: permanent fd redirect for stdout suppression (issue #42)#45

Closed
brianlball wants to merge 1 commit into develop from fix/stdout-redirect

@brianlball
Collaborator

Fixes #42

Problem

Two related bugs corrupt the MCP JSON-RPC stream on stdout:

  1. Concurrent tool timeout (issue #42): Global FastMCP middleware held an os.dup2() redirect on fd 1 for the entire tool call. Concurrent worker threads interleaved their fd redirects, so JSON-RPC responses went to stderr; clients received nothing and reported a -32001 timeout.

  2. Polyhedron stdout leak: OpenStudio's C++ geometry engine writes [utilities.Polyhedron] diagnostics directly to fd 1. These aren't SWIG GC warnings — they fire during read-only queries like get_building_info on complex models. Any per-callsite suppression approach misses them.

Root Cause

Both bugs stem from the same design flaw: toggling fd 1 back and forth between stdout and stderr on a per-call basis. This is inherently racy under concurrency and inherently incomplete against unknown C-level callsites.
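The interleaving described above can be reproduced without threads. The sketch below is a reconstruction under assumptions, not the project's actual middleware: `toggle_redirect` and `interleaved_exits_leave_stdout_on_stderr` are hypothetical names, and two pipes stand in for the real stdout and stderr so the demo never touches fd 1 or fd 2.

```python
import os
from contextlib import contextmanager


@contextmanager
def toggle_redirect(stdout_fd: int, stderr_fd: int):
    """Hypothetical per-call toggle: point stdout_fd at stderr, restore on exit."""
    saved = os.dup(stdout_fd)          # remember whatever stdout_fd points at NOW
    os.dup2(stderr_fd, stdout_fd)      # redirect for the duration of the call
    try:
        yield
    finally:
        os.dup2(saved, stdout_fd)      # "restore" the remembered target
        os.close(saved)


def interleaved_exits_leave_stdout_on_stderr() -> bool:
    """Replay the order enter A, enter B, exit A, exit B on stand-in pipes."""
    r_out, w_out = os.pipe()           # stands in for the MCP client pipe
    r_err, w_err = os.pipe()           # stands in for stderr
    call_a = toggle_redirect(w_out, w_err)
    call_b = toggle_redirect(w_out, w_err)

    call_a.__enter__()                 # A saves the real pipe
    call_b.__enter__()                 # B saves the already-redirected fd
    call_a.__exit__(None, None, None)  # A restores the real pipe
    call_b.__exit__(None, None, None)  # B restores the stderr target

    os.write(w_out, b"response")       # a JSON-RPC reply written after both calls
    os.set_blocking(r_err, False)
    try:
        leaked = os.read(r_err, 100)   # did the reply land on the stderr pipe?
    except BlockingIOError:
        leaked = b""
    for fd in (r_out, w_out, r_err, w_err):
        os.close(fd)
    return leaked == b"response"
```

In this ordering the second call saves a descriptor that already points at stderr, so its exit leaves the stdout descriptor stuck there. A lock around the toggle would serialize the calls, but it would not help with C code that writes outside any wrapped callsite.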

Fix

Redirect fd 1 to stderr once at startup, then give Python sys.stdout a private fd to the real MCP client pipe.

import os
import sys
saved_fd = os.dup(1)                   # copy real stdout pipe
os.dup2(2, 1)                          # fd 1 → stderr permanently
sys.stdout = os.fdopen(saved_fd, "w")  # Python writes to saved fd

After this:

  • C code (printf, SWIG, Polyhedron, anything) → fd 1 → stderr. Harmless.
  • Python (FastMCP JSON-RPC) → saved fd → MCP client. Clean.
  • No toggling, no races, no per-callsite wrappers, no missed callsites.
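A slightly fuller sketch of the one-shot redirect might look like the following. The function name comes from this PR, but the body, the `stdout_fd`/`stderr_fd` parameters, and the choice to return the stream rather than assign `sys.stdout` internally are assumptions made so the example can be exercised on pipes instead of the real fd 1 and fd 2.

```python
import os
import sys
from typing import TextIO


def redirect_c_stdout_to_stderr(stdout_fd: int = 1, stderr_fd: int = 2) -> TextIO:
    """Point stdout_fd at stderr once; return a text stream on the original pipe.

    Sketch only: the parameters and the return value are assumptions,
    not the signature used in stdout_suppression.py.
    """
    sys.stdout.flush()                 # do not strand buffered output on the old fd
    saved_fd = os.dup(stdout_fd)       # private copy of the real stdout pipe
    os.dup2(stderr_fd, stdout_fd)      # C-level writes to stdout_fd now reach stderr
    # Line-buffered so each newline-terminated message is flushed promptly.
    return os.fdopen(saved_fd, "w", buffering=1, encoding="utf-8")


# Assumed usage at server startup, before the MCP server starts reading requests:
#     sys.stdout = redirect_c_stdout_to_stderr()
#     mcp.run()
```

Because the redirect happens once and is never undone, there is no restore step for concurrent calls to interleave with.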

Files Changed (4)

| File | Change |
| --- | --- |
| stdout_suppression.py | Replace toggle context manager with one-shot redirect_c_stdout_to_stderr(). Keep suppress_openstudio_warnings() as no-op for backward compat. |
| server.py | Call redirect_c_stdout_to_stderr() before mcp.run() |
| tests/test_concurrent_tools.py | Regression test: concurrent slow+fast and fast+fast tool calls |
| .github/workflows/ci.yml | Add concurrent test to shard 5 |

Test Results

| Test | Before | After |
| --- | --- | --- |
| Concurrent slow+fast (baseline_osm + status) | FAIL (30s timeout) | PASS |
| Concurrent fast+fast (status + status) | FAIL (connection closed) | PASS |
| Stdout purity (44-zone Polyhedron) | FAIL | PASS |
| Full CI shard 4 (162 tests) | 1 failed | 0 failed |
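For readers unfamiliar with what a stdout purity check involves, here is a minimal sketch. It assumes the server's stdout carries one JSON-RPC 2.0 message per line; `non_jsonrpc_lines` is a hypothetical helper, not the code in the project's test suite, and it deliberately ignores JSON-RPC batch arrays.

```python
import json


def non_jsonrpc_lines(captured: bytes) -> list[str]:
    """Return every non-empty stdout line that is not a JSON-RPC 2.0 object."""
    offenders = []
    for raw in captured.decode("utf-8", errors="replace").splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            offenders.append(line)     # e.g. a C-level diagnostic line
            continue
        # Simplification: batch arrays are treated as offenders too.
        if not (isinstance(message, dict) and message.get("jsonrpc") == "2.0"):
            offenders.append(line)
    return offenders
```

A purity test would capture the server's stdout while running a query on a complex model and assert that this list is empty.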

Supersedes PR #43

PR #43 (from anchapin's fork) used per-callsite targeted suppression with RLock. That approach fixes the race but misses C-level stdout from unexpected callsites (Polyhedron geometry warnings). This PR uses a permanent redirect that catches everything.

🤖 Generated with Claude Code

Redirect C-level stdout (fd 1) to stderr once at startup, give
Python sys.stdout a private fd to the real MCP client pipe.
Catches ALL C-level pollution (SWIG GC, Polyhedron geometry,
future unknowns) with zero races and no per-callsite wrappers.

Fixes concurrent tool timeout (issue #42) and Polyhedron stdout
leak on complex models (test_complex_model_stdout_purity).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@brianlball
Collaborator Author

Closing — will apply fix directly to optimize branch and merge optimize→develop instead.

@brianlball brianlball closed this Apr 10, 2026
@brianlball brianlball deleted the fix/stdout-redirect branch April 11, 2026 21:10

Development

Successfully merging this pull request may close these issues.

MCP error -32001: Request timed out on all tools — stdout suppression race condition
