@Darktex Darktex commented Jan 31, 2026

Summary

Implements MCP code mode (CodeAct) per RFC 003 and GitHub issue #347. This enables agents to write Python code blocks that directly call environment tools as functions, avoiding the double-marshaling overhead (Python → JSON-RPC → Python) that occurs with local MCP servers when using tool-calling mode.

  • Add execute_code() method to MCPEnvironment for executing Python code with tools injected as direct callables
  • Add get_callables() method for extracting Python functions from FastMCP servers
  • Add HTTP /mcp endpoint for production mode inference (bypasses step() API)
  • Add use_production_mode option to MCPToolClient for direct MCP endpoint access
  • Support mixed local (direct Python call) and remote (JSON-RPC) MCP servers
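
As a rough illustration of what "tools injected as direct callables" could mean, here is a minimal sketch. It is not the PR's `execute_code()` implementation: `run_code_with_tools` and the `add` tool are hypothetical names, and the sketch does no sandboxing.

```python
from typing import Any, Callable, Dict


def add(a: int, b: int) -> int:
    """Stand-in for a tool registered on a local MCP server (hypothetical)."""
    return a + b


def run_code_with_tools(code: str, tools: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Execute `code` with each tool bound to a plain function name."""
    namespace: Dict[str, Any] = dict(tools)  # copy so the caller's dict is untouched
    exec(code, namespace)  # NOTE: no sandboxing in this sketch
    # Return only the names the code defined, not the tools or builtins.
    return {
        name: value
        for name, value in namespace.items()
        if name not in tools and name != "__builtins__"
    }


variables = run_code_with_tools("result = add(1, 2)", {"add": add})
print(variables["result"])
```

The call to `add` inside the executed code is an ordinary Python call, so no JSON-RPC serialization happens on that path.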

Test plan

  • All 94 core tests pass
  • New tests for code mode selection (16 tests)
  • New tests for production mode routes (21 tests)
  • Existing MCP integration tests pass
  • Lint check passes

Closes #347

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 31, 2026
Implements CodeAct mode per RFC 003, allowing agents to write Python
code blocks that directly call environment tools as functions.

Key changes:
- Add execute_code() method to MCPEnvironment for code execution
- Add get_callables() for direct callable extraction from FastMCP servers
- Add HTTP /mcp endpoint for production mode (bypasses step() API)
- Add use_production_mode option to MCPToolClient
- Support mixed local (direct call) and remote (JSON-RPC) servers

This avoids double-marshaling (Python → JSON-RPC → Python) for local
MCP servers when using code mode, improving performance.
@Darktex Darktex force-pushed the feature/issue-347-mcp-code-mode branch from ae04170 to eb52a6f Compare January 31, 2026 01:02
greptile-apps bot commented Jan 31, 2026

Greptile Overview

Greptile Summary

This PR implements MCP code mode (CodeAct) per RFC 003, enabling agents to write Python code blocks that directly call environment tools as functions, avoiding double-marshaling overhead for local MCP servers.

Key Changes

  • Code Mode Support: Added execute_code() method to MCPEnvironment that executes Python code with tools injected as direct callables
  • Local vs Remote Detection: get_callables() method extracts Python functions from FastMCP servers for direct invocation (local) or wraps them in JSON-RPC calls (remote)
  • Production Mode Endpoint: New HTTP POST /mcp endpoint provides direct MCP access bypassing step() API overhead (for inference/production use cases)
  • Client Support: MCPToolClient now supports use_production_mode option to access /mcp endpoint directly
  • Comprehensive Testing: 16 new tests for code mode selection, 21 new tests for production mode routes
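
The local versus remote split described above might look roughly like the following sketch. The helper names (`make_remote_callable`, `collect_callables`, `fake_transport`) are assumptions for illustration, the transport is faked in-process, and the response shape is simplified compared with a real MCP `tools/call` result.

```python
import json
from typing import Any, Callable, Dict, List


def make_remote_callable(name: str, send: Callable[[str], str]) -> Callable[..., Any]:
    """Wrap a remote tool so it can be called like a normal function."""
    def call(**arguments: Any) -> Any:
        request = {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        response = json.loads(send(json.dumps(request)))
        return response["result"]
    return call


def collect_callables(
    local_tools: Dict[str, Callable[..., Any]],
    remote_tool_names: List[str],
    send: Callable[[str], str],
) -> Dict[str, Callable[..., Any]]:
    """Local tools are used as-is; remote tools get a JSON-RPC wrapper."""
    callables = dict(local_tools)
    for name in remote_tool_names:
        callables.setdefault(name, make_remote_callable(name, send))
    return callables


def fake_transport(payload: str) -> str:
    """Stand-in for an HTTP round trip to a remote server that multiplies."""
    request = json.loads(payload)
    args = request["params"]["arguments"]
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": args["a"] * args["b"]})


tools = collect_callables({"add": lambda a, b: a + b}, ["multiply"], fake_transport)
print(tools["add"](a=1, b=2), tools["multiply"](a=3, b=4))
```

Both entries are called the same way from agent code; only the remote one pays the serialization cost.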

Alignment Assessment

This implementation correctly follows RFC 003's design for code mode and maintains the dual API boundary:

  • Agent-facing MCP tools remain isolated from infrastructure controls (reset, step, state)
  • Production mode /mcp endpoint properly creates temporary environment instances with cleanup
  • Reward computation remains inside environment boundary (only called via step(), not production mode)
  • No violations of system invariants detected
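
To make the "temporary environment instances with cleanup" point concrete, here is a small sketch of a handler that creates an environment per request and always closes it. `FakeEnv` and `handle_mcp_request` are hypothetical stand-ins, not the code in http_server.py; the error code -32601 is the standard JSON-RPC "Method not found" code.

```python
from typing import Any, Callable, Dict, List


class FakeEnv:
    """Hypothetical stand-in for an MCP environment."""
    closed_count = 0

    def list_tools(self) -> List[Dict[str, str]]:
        return [{"name": "add"}]

    def close(self) -> None:
        FakeEnv.closed_count += 1


def handle_mcp_request(request: Dict[str, Any], env_factory: Callable[[], Any]) -> Dict[str, Any]:
    """Serve one JSON-RPC request with a short-lived environment."""
    env = env_factory()
    try:
        if request.get("method") == "tools/list":
            result = {"tools": env.list_tools()}
            return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
        error = {"code": -32601, "message": "Method not found"}
        return {"jsonrpc": "2.0", "id": request.get("id"), "error": error}
    finally:
        env.close()  # runs on both the success and the error path


ok = handle_mcp_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, FakeEnv)
bad = handle_mcp_request({"jsonrpc": "2.0", "id": 2, "method": "reset"}, FakeEnv)
print(ok["result"], bad["error"]["code"], FakeEnv.closed_count)
```

Note that no reward is computed anywhere on this path, which matches the claim that rewards stay behind `step()`.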

Confidence Score: 4/5

  • Safe to merge after fixing the duplicated code in mcp_client.py
  • Implementation aligns with RFC 003, maintains system invariants, and has comprehensive test coverage. One syntax error (duplicated code) needs fixing before merge. No alignment concerns identified.
  • src/openenv/core/mcp_client.py requires attention - contains duplicated code at lines 139-148 that must be removed

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/openenv/core/mcp_client.py | Added use_production_mode option and HTTP /mcp endpoint support, but contains duplicated code (lines 139-148) |
| src/openenv/core/env_server/mcp_environment.py | Implemented execute_code() and get_callables() for CodeAct mode with proper local/remote server detection |
| src/openenv/core/env_server/http_server.py | Added HTTP /mcp endpoint for production mode with proper JSON-RPC handling and environment cleanup |

Sequence Diagram

sequenceDiagram
    participant Agent
    participant InfraClient as Infrastructure<br/>(EnvClient/MCPToolClient)
    participant HTTPServer as HTTP Server<br/>(http_server.py)
    participant MCPEnv as MCP Environment<br/>(mcp_environment.py)
    participant MCPServer as MCP Server<br/>(FastMCP)

    Note over Agent,MCPServer: Production Mode: Direct MCP Access (bypasses step API)
    Agent->>HTTPServer: POST /mcp<br/>{"jsonrpc":"2.0", "method":"tools/list"}
    HTTPServer->>MCPEnv: _env_factory()
    HTTPServer->>MCPEnv: list_tools()
    MCPEnv->>MCPServer: list_tools()
    MCPServer-->>MCPEnv: [tools]
    MCPEnv-->>HTTPServer: [tools]
    HTTPServer->>MCPEnv: close()
    HTTPServer-->>Agent: {"result": {"tools": [...]}}

    Note over Agent,MCPServer: Tool-Calling Mode (traditional MCP)
    InfraClient->>HTTPServer: POST /step<br/>CallToolAction("add", {a:1, b:2})
    HTTPServer->>MCPEnv: step(CallToolAction)
    MCPEnv->>MCPEnv: _handle_call_tool()
    MCPEnv->>MCPServer: call_tool("add", {a:1, b:2})
    MCPServer-->>MCPEnv: result=3
    MCPEnv->>MCPEnv: _compute_reward()
    MCPEnv-->>HTTPServer: CallToolObservation(result=3, reward=0.0)
    HTTPServer-->>InfraClient: StepResponse

    Note over Agent,MCPServer: Code Mode (CodeAct pattern - local optimization)
    InfraClient->>HTTPServer: POST /step<br/>CodeAction("result = add(1, 2)")
    HTTPServer->>MCPEnv: step(CodeAction)
    MCPEnv->>MCPEnv: execute_code()
    MCPEnv->>MCPEnv: get_callables()
    Note over MCPEnv: Local: Direct Python call<br/>Remote: JSON-RPC wrapper
    MCPEnv->>MCPServer: Direct fn call (local)<br/>OR JSON-RPC (remote)
    MCPServer-->>MCPEnv: result=3
    MCPEnv->>MCPEnv: _compute_reward()
    MCPEnv-->>HTTPServer: Observation(metadata={"result": 3}, reward=0.0)
    HTTPServer-->>InfraClient: StepResponse
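
The two `/step` paths in the diagram can be sketched as a single dispatch on the action type. This is an illustrative approximation only: the dataclasses below reuse the names `CallToolAction` and `CodeAction` from the diagram, but their fields (in particular `CodeAction.code`) and the `step` function are assumptions, and reward computation is omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Union


@dataclass
class CallToolAction:
    tool_name: str
    arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class CodeAction:
    code: str  # assumed field name


def step(action: Union[CallToolAction, CodeAction], tools: Dict[str, Callable[..., Any]]) -> Any:
    """Dispatch one action: a single tool call, or a code block using the tools."""
    if isinstance(action, CallToolAction):
        return tools[action.tool_name](**action.arguments)
    if isinstance(action, CodeAction):
        namespace: Dict[str, Any] = dict(tools)
        exec(action.code, namespace)  # NOTE: unsandboxed, for illustration only
        return namespace.get("result")
    raise TypeError(f"Unsupported action type: {type(action).__name__}")


tools = {"add": lambda a, b: a + b}
print(step(CallToolAction("add", {"a": 1, "b": 2}), tools))
print(step(CodeAction("result = add(1, 2)"), tools))
```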

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 1 comment

Comment on lines 139 to 148
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache

result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache

Duplicated code: the same three-line block appears three times. Keep one copy and remove the other two.

Suggested change
- result = self.step(ListToolsAction())
- self._tools_cache = result.observation.tools
- return self._tools_cache
- result = self.step(ListToolsAction())
- self._tools_cache = result.observation.tools
- return self._tools_cache
- result = self.step(ListToolsAction())
- self._tools_cache = result.observation.tools
- return self._tools_cache
+ result = self.step(ListToolsAction())
+ self._tools_cache = result.observation.tools
+ return self._tools_cache
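
For context, a deduplicated `list_tools()` with a cache could look something like the sketch below. Only the `_tools_cache` attribute comes from the snippet under review; the class name, the `fetch_tools` callable, and the `use_cache` parameter are assumptions made for illustration.

```python
from typing import Any, Callable, List, Optional


class ToolClientSketch:
    """Hypothetical client that fetches the tool list once and reuses it."""

    def __init__(self, fetch_tools: Callable[[], List[Any]]) -> None:
        self._fetch_tools = fetch_tools
        self._tools_cache: Optional[List[Any]] = None

    def list_tools(self, use_cache: bool = True) -> List[Any]:
        if use_cache and self._tools_cache is not None:
            return self._tools_cache
        # Single fetch-and-store block; nothing after the return is reachable.
        self._tools_cache = self._fetch_tools()
        return self._tools_cache


calls = []


def fetch() -> List[str]:
    calls.append(1)
    return ["add", "multiply"]


client = ToolClientSketch(fetch)
first = client.list_tools()
second = client.list_tools()
print(first, len(calls))
```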
@Darktex Darktex requested a review from burtenshaw January 31, 2026 01:06
Removes accidentally duplicated lines 143-148 in list_tools() method.
@Darktex Darktex force-pushed the feature/issue-347-mcp-code-mode branch from db48f93 to c36a37d Compare January 31, 2026 01:13

@burtenshaw burtenshaw left a comment

Nice work again. Just some small queries. In general, I think we could make this code a bit easier to understand for users. For example, I think users will want a simple overview of how these methods play together. A schema like this could help:

┌─────────────────────────────────────────────────────────┐
│                    HTTPEnvServer                        │
├─────────────────────────────────────────────────────────┤
│  Simulation Mode:                                       │
│    /ws    → OpenEnv protocol (reset/step/state)         │
│    /mcp   → MCP JSON-RPC (tools/list, tools/call)       │
│    /reset, /step, /state → HTTP (enabled)               │
├─────────────────────────────────────────────────────────┤
│  Production Mode:                                       │
│    /ws    → OpenEnv protocol (disabled/blocked)         │
│    /mcp   → MCP JSON-RPC (enabled)                      │
│    /reset, /step, /state → HTTP (disabled)              │
└─────────────────────────────────────────────────────────┘

Client Classes:
  GenericEnvClient  → /ws (simulation only)
  MCPToolClient     → /mcp (production only)

Also, I think we might want to rename production to something like mcp_only to make it more obvious.
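
The schema above can also be read as a lookup table. The sketch below encodes it that way; the mode names and the `is_route_enabled` helper are hypothetical, and the table reflects the proposed overview rather than confirmed server behavior.

```python
from typing import Dict

# Route availability per mode, transcribed from the proposed schema.
ROUTES: Dict[str, Dict[str, bool]] = {
    "simulation": {"/ws": True, "/mcp": True, "/reset": True, "/step": True, "/state": True},
    "production": {"/ws": False, "/mcp": True, "/reset": False, "/step": False, "/state": False},
}


def is_route_enabled(mode: str, path: str) -> bool:
    """Return True if `path` should be served in `mode`; unknown paths are off."""
    if mode not in ROUTES:
        raise ValueError(f"Unknown mode: {mode!r}")
    return ROUTES[mode].get(path, False)


print(is_route_enabled("simulation", "/step"))
print(is_route_enabled("production", "/step"))
print(is_route_enabled("production", "/mcp"))
```

Seen this way, production mode exposes exactly one route, which is part of the argument for a name like mcp_only.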

}

# Create a temporary environment for MCP access
_env = self._env_factory()

Again, is there a way to reuse an MCP environment instead of recreating it on every request?
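
One possible shape for that reuse is a lazily created, shared instance, sketched below. `SharedEnvHolder` and `CountingEnv` are hypothetical; whether sharing is safe depends on whether the environment holds per-session state, which this sketch does not address.

```python
import threading
from typing import Any, Callable, Optional


class SharedEnvHolder:
    """Create the environment on first use and hand out the same instance after."""

    def __init__(self, env_factory: Callable[[], Any]) -> None:
        self._env_factory = env_factory
        self._env: Optional[Any] = None
        self._lock = threading.Lock()  # guards creation under concurrent requests

    def get(self) -> Any:
        with self._lock:
            if self._env is None:
                self._env = self._env_factory()
            return self._env

    def close(self) -> None:
        with self._lock:
            if self._env is not None:
                self._env.close()
                self._env = None


class CountingEnv:
    """Stand-in environment that counts how often it is constructed."""
    created = 0

    def __init__(self) -> None:
        CountingEnv.created += 1
        self.closed = False

    def close(self) -> None:
        self.closed = True


holder = SharedEnvHolder(CountingEnv)
first = holder.get()
second = holder.get()
print(first is second, CountingEnv.created)
```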


return callables

def execute_code(self, code: str) -> Observation:

Should we break up these methods so that one function deals with the details of executing code and returns errors nicely? This should impact learning too. For example, unsloth have some nice examples here.
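
A split along those lines might look like the sketch below, where one helper owns execution and turns any failure into a short, readable message. `ExecutionResult` and `safe_exec` are hypothetical names, and this is not taken from the unsloth examples mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ExecutionResult:
    ok: bool
    variables: Dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


def safe_exec(code: str, tools: Dict[str, Callable[..., Any]]) -> ExecutionResult:
    """Run agent code and report failures as data instead of raising."""
    namespace: Dict[str, Any] = dict(tools)
    try:
        # compile() is inside the try so syntax errors are reported too.
        exec(compile(code, "<agent_code>", "exec"), namespace)
    except Exception as exc:
        return ExecutionResult(ok=False, error=f"{type(exc).__name__}: {exc}")
    variables = {
        name: value
        for name, value in namespace.items()
        if name not in tools and name != "__builtins__"
    }
    return ExecutionResult(ok=True, variables=variables)


good = safe_exec("result = add(1, 2)", {"add": lambda a, b: a + b})
bad = safe_exec("result = 1 / 0", {})
print(good.variables, bad.error)
```

Returning the error text in the observation gives the agent something to correct against, which is the learning benefit the comment points at.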

name=action.tool_name, arguments=action.arguments
)
break
except:

A bare except like this might be hard to debug. We could accumulate the errors in a string and log them in a finally block if result is None.
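
The suggestion could be sketched as follows. The function name, the shape of `servers` (a mapping from server name to a call function), and the logger name are all assumptions; the surrounding loop in the PR is not shown in full, so this only mirrors the visible try/break/except pattern.

```python
import logging
from typing import Any, Callable, Dict, List, Tuple

logger = logging.getLogger("mcp_sketch")


def call_first_available(
    servers: Dict[str, Callable[[str, Dict[str, Any]], Any]],
    tool_name: str,
    arguments: Dict[str, Any],
) -> Tuple[Any, List[str]]:
    """Try each server in order, remembering why the earlier ones failed."""
    errors: List[str] = []
    result = None
    try:
        for server_name, call in servers.items():
            try:
                result = call(tool_name, arguments)
                break
            except Exception as exc:  # narrower than a bare except
                errors.append(f"{server_name}: {type(exc).__name__}: {exc}")
    finally:
        # A tool that legitimately returns None would also hit this branch.
        if result is None and errors:
            logger.warning("Tool %r failed on all servers: %s", tool_name, "; ".join(errors))
    return result, errors


def broken(tool_name: str, arguments: Dict[str, Any]) -> Any:
    raise ConnectionError("server unreachable")


def working(tool_name: str, arguments: Dict[str, Any]) -> Any:
    return arguments["a"] + arguments["b"]


result, errors = call_first_available({"remote": broken, "local": working}, "add", {"a": 1, "b": 2})
print(result, errors)
```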

Tool result.
"""
# For mocks, call directly
if (

Doesn't this condition need explicit parentheses to group the `or`?

if (
    hasattr(server, "list_tools")
    and (not hasattr(type(server), "__name__") or "FastMCP" not in type(server).__name__)
):
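
The grouping matters because `and` binds more tightly than `or` in Python. The sketch below uses placeholder booleans rather than the real `hasattr` checks to show a case where the two readings disagree; the function names are illustrative only.

```python
def ungrouped(has_list_tools: bool, has_name: bool, not_fastmcp: bool) -> bool:
    # Parsed as: (has_list_tools and not has_name) or not_fastmcp
    return has_list_tools and not has_name or not_fastmcp


def grouped(has_list_tools: bool, has_name: bool, not_fastmcp: bool) -> bool:
    # Explicit grouping, matching the parenthesized form suggested above.
    return has_list_tools and (not has_name or not_fastmcp)


# An object without list_tools whose type name does not contain "FastMCP":
print(ungrouped(False, True, True))
print(grouped(False, True, True))
```

Without the parentheses, an object lacking `list_tools` could still take the mock branch, which is probably not what is intended.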

Development

Successfully merging this pull request may close these issues.

[MCP] Enable "code mode"

3 participants