feat(mcp): Enable code mode for MCP environments #351

Darktex · 2026-01-31T01:00:51Z

Summary

Implements MCP code mode (CodeAct) per RFC 003 and GitHub issue #347. This enables agents to write Python code blocks that directly call environment tools as functions, avoiding the double-marshaling overhead (Python → JSON-RPC → Python) that occurs with local MCP servers when using tool-calling mode.

Add execute_code() method to MCPEnvironment for executing Python code with tools injected as direct callables
Add get_callables() method for extracting Python functions from FastMCP servers
Add HTTP /mcp endpoint for production mode inference (bypasses step() API)
Add use_production_mode option to MCPToolClient for direct MCP endpoint access
Support mixed local (direct Python call) and remote (JSON-RPC) MCP servers
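
To make the code-mode idea concrete, here is a minimal sketch of running an agent-written snippet with tools injected as plain callables. It is not the PR's `execute_code()` implementation: `run_code_block` and the `add` tool are hypothetical names used only for illustration, and `exec()` is not a sandbox.

```python
from typing import Any, Callable, Dict


def add(a: int, b: int) -> int:
    """Hypothetical tool; stands in for a function registered on a local MCP server."""
    return a + b


def run_code_block(code: str, tools: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Execute `code` with `tools` visible as ordinary Python functions.

    Returns the variables the snippet defined (tool names and dunder
    entries such as __builtins__ are filtered out).
    """
    namespace: Dict[str, Any] = dict(tools)
    exec(code, namespace)  # illustration only: no isolation, no timeout
    return {
        name: value
        for name, value in namespace.items()
        if name not in tools and not name.startswith("__")
    }


variables = run_code_block("result = add(1, 2)", {"add": add})
print(variables)  # {'result': 3}
```

The tool call inside the snippet is a direct Python call, which is the path that skips the Python → JSON-RPC → Python round trip described above.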

Test plan

All 94 core tests pass
New tests for code mode selection (16 tests)
New tests for production mode routes (21 tests)
Existing MCP integration tests pass
Lint check passes

Closes #347

Implements CodeAct mode per RFC 003, allowing agents to write Python code blocks that directly call environment tools as functions.

Key changes:
- Add execute_code() method to MCPEnvironment for code execution
- Add get_callables() for direct callable extraction from FastMCP servers
- Add HTTP /mcp endpoint for production mode (bypasses step() API)
- Add use_production_mode option to MCPToolClient
- Support mixed local (direct call) and remote (JSON-RPC) servers

This avoids double-marshaling (Python → JSON-RPC → Python) for local MCP servers when using code mode, improving performance.

greptile-apps · 2026-01-31T01:04:31Z

Greptile Overview

Greptile Summary

This PR implements MCP code mode (CodeAct) per RFC 003, enabling agents to write Python code blocks that directly call environment tools as functions, avoiding double-marshaling overhead for local MCP servers.

Key Changes

Code Mode Support: Added execute_code() method to MCPEnvironment that executes Python code with tools injected as direct callables
Local vs Remote Detection: get_callables() method extracts Python functions from FastMCP servers for direct invocation (local) or wraps them in JSON-RPC calls (remote)
Production Mode Endpoint: New HTTP POST /mcp endpoint provides direct MCP access bypassing step() API overhead (for inference/production use cases)
Client Support: MCPToolClient now supports use_production_mode option to access /mcp endpoint directly
Comprehensive Testing: 16 new tests for code mode selection, 21 new tests for production mode routes

Alignment Assessment

This implementation correctly follows RFC 003's design for code mode and maintains the dual API boundary:

Agent-facing MCP tools remain isolated from infrastructure controls (reset, step, state)
Production mode /mcp endpoint properly creates temporary environment instances with cleanup
Reward computation remains inside environment boundary (only called via step(), not production mode)
No violations of system invariants detected

Confidence Score: 4/5

Safe to merge after fixing the duplicated code in mcp_client.py
Implementation aligns with RFC 003, maintains system invariants, and has comprehensive test coverage. One syntax error (duplicated code) needs fixing before merge. No alignment concerns identified.
src/openenv/core/mcp_client.py requires attention - contains duplicated code at lines 139-148 that must be removed

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/openenv/core/mcp_client.py | added `use_production_mode` option and HTTP `/mcp` endpoint support, but contains duplicated code (lines 139-148) |
| src/openenv/core/env_server/mcp_environment.py | implemented `execute_code()` and `get_callables()` for CodeAct mode with proper local/remote server detection |
| src/openenv/core/env_server/http_server.py | added HTTP `/mcp` endpoint for production mode with proper JSON-RPC handling and environment cleanup |

Sequence Diagram

sequenceDiagram
    participant Agent
    participant InfraClient as Infrastructure<br/>(EnvClient/MCPToolClient)
    participant HTTPServer as HTTP Server<br/>(http_server.py)
    participant MCPEnv as MCP Environment<br/>(mcp_environment.py)
    participant MCPServer as MCP Server<br/>(FastMCP)

    Note over Agent,MCPServer: Production Mode: Direct MCP Access (bypasses step API)
    Agent->>HTTPServer: POST /mcp<br/>{"jsonrpc":"2.0", "method":"tools/list"}
    HTTPServer->>MCPEnv: _env_factory()
    HTTPServer->>MCPEnv: list_tools()
    MCPEnv->>MCPServer: list_tools()
    MCPServer-->>MCPEnv: [tools]
    MCPEnv-->>HTTPServer: [tools]
    HTTPServer->>MCPEnv: close()
    HTTPServer-->>Agent: {"result": {"tools": [...]}}

    Note over Agent,MCPServer: Tool-Calling Mode (traditional MCP)
    InfraClient->>HTTPServer: POST /step<br/>CallToolAction("add", {a:1, b:2})
    HTTPServer->>MCPEnv: step(CallToolAction)
    MCPEnv->>MCPEnv: _handle_call_tool()
    MCPEnv->>MCPServer: call_tool("add", {a:1, b:2})
    MCPServer-->>MCPEnv: result=3
    MCPEnv->>MCPEnv: _compute_reward()
    MCPEnv-->>HTTPServer: CallToolObservation(result=3, reward=0.0)
    HTTPServer-->>InfraClient: StepResponse

    Note over Agent,MCPServer: Code Mode (CodeAct pattern - local optimization)
    InfraClient->>HTTPServer: POST /step<br/>CodeAction("result = add(1, 2)")
    HTTPServer->>MCPEnv: step(CodeAction)
    MCPEnv->>MCPEnv: execute_code()
    MCPEnv->>MCPEnv: get_callables()
    Note over MCPEnv: Local: Direct Python call<br/>Remote: JSON-RPC wrapper
    MCPEnv->>MCPServer: Direct fn call (local)<br/>OR JSON-RPC (remote)
    MCPServer-->>MCPEnv: result=3
    MCPEnv->>MCPEnv: _compute_reward()
    MCPEnv-->>HTTPServer: Observation(metadata={"result": 3}, reward=0.0)
    HTTPServer-->>InfraClient: StepResponse
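
The production-mode flow in the diagram (create a temporary environment, serve one JSON-RPC request, close the environment) could look roughly like the sketch below. `FakeEnv` and `handle_mcp_request` are hypothetical names, not the PR's code, and the `tools/call` result shape is simplified compared with a real MCP server's.

```python
from typing import Any, Callable, Dict, List


class FakeEnv:
    """Hypothetical stand-in for an MCP environment exposing one tool."""

    def __init__(self) -> None:
        self.closed = False

    def list_tools(self) -> List[Dict[str, str]]:
        return [{"name": "add", "description": "Add two integers"}]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any:
        if name != "add":
            raise KeyError(name)
        return arguments["a"] + arguments["b"]

    def close(self) -> None:
        self.closed = True


def handle_mcp_request(
    request: Dict[str, Any], env_factory: Callable[[], FakeEnv]
) -> Dict[str, Any]:
    """Serve one JSON-RPC 2.0 request using a temporary environment."""
    req_id = request.get("id")
    env = env_factory()
    try:
        method = request.get("method")
        if method == "tools/list":
            result: Any = {"tools": env.list_tools()}
        elif method == "tools/call":
            params = request.get("params", {})
            result = env.call_tool(params["name"], params.get("arguments", {}))
        else:
            # -32601 is the JSON-RPC 2.0 "Method not found" error code.
            return {
                "jsonrpc": "2.0",
                "id": req_id,
                "error": {"code": -32601, "message": f"Method not found: {method}"},
            }
        return {"jsonrpc": "2.0", "id": req_id, "result": result}
    finally:
        env.close()  # runs on every path, including the error return


response = handle_mcp_request(
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, FakeEnv
)
print(response["result"]["tools"][0]["name"])  # add
```

Putting `close()` in a `finally` block is one way to get the cleanup guarantee the review mentions. Note that no reward is computed on this path, which matches the diagram.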

greptile-apps

3 files reviewed, 1 comment


greptile-apps · 2026-01-31T01:04:35Z

src/openenv/core/mcp_client.py

+        result = self.step(ListToolsAction())
+        self._tools_cache = result.observation.tools
+        return self._tools_cache
+
+        result = self.step(ListToolsAction())
+        self._tools_cache = result.observation.tools
+        return self._tools_cache
        result = self.step(ListToolsAction())
        self._tools_cache = result.observation.tools
        return self._tools_cache


Duplicated code: the same three-line block appears three times in list_tools(). Keep one copy and remove the other two.

Suggested change

        result = self.step(ListToolsAction())
        self._tools_cache = result.observation.tools
        return self._tools_cache



burtenshaw

Nice work again. Just some small queries. In general, I think we could make this code a bit easier for users to understand. For example, I think users will need a simple overview of how these methods play together. A schema like this could help:

┌─────────────────────────────────────────────────────────┐
│                    HTTPEnvServer                        │
├─────────────────────────────────────────────────────────┤
│  Simulation Mode:                                       │
│    /ws    → OpenEnv protocol (reset/step/state)         │
│    /mcp   → MCP JSON-RPC (tools/list, tools/call)       │
│    /reset, /step, /state → HTTP (enabled)               │
├─────────────────────────────────────────────────────────┤
│  Production Mode:                                       │
│    /ws    → OpenEnv protocol (disabled/blocked)         │
│    /mcp   → MCP JSON-RPC (enabled)                      │
│    /reset, /step, /state → HTTP (disabled)              │
└─────────────────────────────────────────────────────────┘

Client Classes:
  GenericEnvClient  → /ws (simulation only)
  MCPToolClient     → /mcp (production only)

Also, I think we might want to rename production to something like mcp_only to make it more obvious.
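
One way to make the schema explicit in code is a small mode-to-routes table. This is only a sketch of the idea: `ServerMode`, `ENABLED_ROUTES`, and `is_route_enabled` are hypothetical names, and `MCP_ONLY` uses the rename proposed here rather than anything in the PR.

```python
from enum import Enum
from typing import Dict, FrozenSet


class ServerMode(Enum):
    SIMULATION = "simulation"
    MCP_ONLY = "mcp_only"  # proposed rename of "production" (hypothetical)


# Routes exposed in each mode, copied from the schema in this comment.
ENABLED_ROUTES: Dict[ServerMode, FrozenSet[str]] = {
    ServerMode.SIMULATION: frozenset({"/ws", "/mcp", "/reset", "/step", "/state"}),
    ServerMode.MCP_ONLY: frozenset({"/mcp"}),
}


def is_route_enabled(mode: ServerMode, path: str) -> bool:
    """Return True if `path` should be served in `mode`."""
    return path in ENABLED_ROUTES[mode]


print(is_route_enabled(ServerMode.MCP_ONLY, "/step"))  # False
print(is_route_enabled(ServerMode.MCP_ONLY, "/mcp"))   # True
```

Keeping the table in one place would also give the docs a single source to render the overview from.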

burtenshaw · 2026-01-31T13:46:13Z

src/openenv/core/env_server/http_server.py

+                }
+
+            # Create a temporary environment for MCP access
+            _env = self._env_factory()


Again, is there a way to reuse an MCP environment instead of recreating it every time?
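
A possible shape for reuse, sketched under the assumption that the environment is safe to share across `/mcp` requests (which may not hold if it keeps per-episode state). `LazyShared` is a hypothetical helper, not something in the PR.

```python
import threading
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class LazyShared(Generic[T]):
    """Create one instance on first use and return the same one afterwards."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._instance: Optional[T] = None
        self._lock = threading.Lock()

    def get(self) -> T:
        with self._lock:
            if self._instance is None:
                self._instance = self._factory()
            return self._instance

    def close(self) -> None:
        """Drop the cached instance, calling its close() if it has one."""
        with self._lock:
            instance, self._instance = self._instance, None
        if instance is not None and hasattr(instance, "close"):
            instance.close()


created: List[object] = []


def make_env() -> object:
    """Hypothetical factory; records each instance it builds."""
    created.append(object())
    return created[-1]


shared = LazyShared(make_env)
assert shared.get() is shared.get()
print(len(created))  # 1
```

The trade-off is lifecycle: a shared instance needs an explicit shutdown hook, whereas the per-request approach cleans up after every call.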

burtenshaw · 2026-01-31T13:50:18Z

src/openenv/core/env_server/mcp_environment.py

+
+        return callables
+
+    def execute_code(self, code: str) -> Observation:


Should we break up these methods so that one function deals with the details of executing code and returns errors nicely? This should impact learning too. For example, unsloth has some nice examples here.
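
As a rough illustration of that split, a helper could own the `exec` call and hand back stdout, a formatted traceback, and the resulting variables. `CodeResult` and `run_snippet` are hypothetical names; this is a sketch, not the PR's `execute_code()`.

```python
import contextlib
import io
import traceback
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class CodeResult:
    stdout: str
    error: Optional[str]  # formatted traceback, or None on success
    variables: Dict[str, Any]


def run_snippet(code: str, namespace: Optional[Dict[str, Any]] = None) -> CodeResult:
    """Run `code`, capturing printed output and any exception as text."""
    ns: Dict[str, Any] = dict(namespace or {})
    buffer = io.StringIO()
    error: Optional[str] = None
    try:
        with contextlib.redirect_stdout(buffer):
            exec(compile(code, "<agent_code>", "exec"), ns)
    except Exception:
        # Also covers SyntaxError raised by compile().
        error = traceback.format_exc()
    variables = {k: v for k, v in ns.items() if not k.startswith("__")}
    return CodeResult(stdout=buffer.getvalue(), error=error, variables=variables)


ok = run_snippet("print('hi')\nx = 1 + 1")
bad = run_snippet("y = 1 / 0")
print(ok.variables)                # {'x': 2}
print(bad.error.splitlines()[-1])  # ZeroDivisionError: division by zero
```

Returning the traceback as data rather than raising means the caller can put it straight into the observation, so the agent sees what went wrong.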

burtenshaw · 2026-01-31T13:56:20Z

src/openenv/core/env_server/mcp_environment.py

+                            name=action.tool_name, arguments=action.arguments
+                        )
+                        break
+                    except:


This might be hard to debug. We could collect the errors in a string and log it in a finally block if result is None.
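
A sketch of that pattern, with hypothetical names (`call_first_available`, `broken`) and the assumption that `None` is never a legitimate tool result:

```python
import logging
from typing import Any, Callable, List, Optional, Sequence

logger = logging.getLogger(__name__)


def call_first_available(callers: Sequence[Callable[[], Any]]) -> Optional[Any]:
    """Try each callable in order and return the first result.

    If every candidate fails, the collected errors are logged once.
    """
    errors: List[str] = []
    result: Optional[Any] = None
    try:
        for index, caller in enumerate(callers):
            try:
                result = caller()
                break
            except Exception as exc:  # narrower than a bare `except:`
                errors.append(f"candidate {index}: {type(exc).__name__}: {exc}")
    finally:
        if result is None and errors:
            logger.warning("all candidates failed: %s", "; ".join(errors))
    return result


def broken() -> int:
    """Hypothetical server call that always fails."""
    raise ConnectionError("server unreachable")


print(call_first_available([broken, lambda: 3]))  # 3
```

Catching `Exception` instead of using a bare `except:` also lets `KeyboardInterrupt` and `SystemExit` propagate.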

burtenshaw · 2026-01-31T13:58:57Z

src/openenv/core/env_server/mcp_environment.py

+            Tool result.
+        """
+        # For mocks, call directly
+        if (


Does this logic not require opening parentheses?

if ( hasattr(server, "list_tools") and (not hasattr(type(server), "__name__") or "FastMCP" not in type(server).__name__) ):
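
For reference, the grouping matters because `and` binds tighter than `or` in Python. The sketch below compares the grouped condition from this comment with an ungrouped variant. The `FastMCP` and `MockServer` classes are hypothetical stand-ins, not the real FastMCP class.

```python
class FastMCP:  # hypothetical stand-in, only the class name matters here
    def list_tools(self) -> list:
        return []


class MockServer:  # hypothetical mock exposing list_tools
    def list_tools(self) -> list:
        return []


def grouped(server: object) -> bool:
    """A and (B or C): the condition as written in this comment."""
    # type(server) is always a class and classes define __name__, so the
    # first operand inside the group is expected to be False in practice.
    return hasattr(server, "list_tools") and (
        not hasattr(type(server), "__name__")
        or "FastMCP" not in type(server).__name__
    )


def ungrouped(server: object) -> bool:
    """Same operands without inner parentheses: parsed as (A and B) or C."""
    return (
        hasattr(server, "list_tools")
        and not hasattr(type(server), "__name__")
        or "FastMCP" not in type(server).__name__
    )


plain = object()  # has no list_tools attribute
print(grouped(MockServer()), grouped(FastMCP()))  # True False
print(grouped(plain), ungrouped(plain))           # False True
```

Without the inner parentheses, any object whose type name lacks "FastMCP" is treated as a mock, even if it has no `list_tools` at all.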

meta-cla bot added the CLA Signed label Jan 31, 2026

Darktex force-pushed the feature/issue-347-mcp-code-mode branch from ae04170 to eb52a6f Compare January 31, 2026 01:02

greptile-apps bot reviewed Jan 31, 2026


Darktex requested a review from burtenshaw January 31, 2026 01:06

fix(mcp): Remove duplicated code in mcp_client.py

c36a37d

Removes accidentally duplicated lines 143-148 in list_tools() method.

Darktex force-pushed the feature/issue-347-mcp-code-mode branch from db48f93 to c36a37d Compare January 31, 2026 01:13

burtenshaw reviewed Jan 31, 2026

3 participants

