feat(mcp): Enable code mode for MCP environments #351
base: main
Implements CodeAct mode per RFC 003, allowing agents to write Python code blocks that directly call environment tools as functions.
Key changes:
- Add execute_code() method to MCPEnvironment for code execution
- Add get_callables() for direct callable extraction from FastMCP servers
- Add HTTP /mcp endpoint for production mode (bypasses step() API)
- Add use_production_mode option to MCPToolClient
- Support mixed local (direct call) and remote (JSON-RPC) servers
This avoids double-marshaling (Python → JSON-RPC → Python) for local MCP servers when using code mode, improving performance.
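To make the local/remote split concrete, here is a minimal sketch of how tools might be exposed to agent code as plain functions. It is not the PR's implementation: `build_callables`, `make_remote_callable`, and `fake_send` are hypothetical names, the transport is a stand-in, and the response shape is simplified relative to real MCP tool results.

```python
import json
from typing import Any, Callable, Dict, Iterable


def make_remote_callable(tool_name: str, send: Callable[[str], str]) -> Callable[..., Any]:
    """Wrap a remote tool so it looks like a plain function (JSON-RPC underneath)."""

    def call(**arguments: Any) -> Any:
        request = {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
        response = json.loads(send(json.dumps(request)))
        return response["result"]

    return call


def build_callables(
    local_tools: Dict[str, Callable[..., Any]],
    remote_tool_names: Iterable[str],
    send: Callable[[str], str],
) -> Dict[str, Callable[..., Any]]:
    """Local tools are used as-is (no marshaling); remote tools get a wrapper."""
    callables = dict(local_tools)
    for name in remote_tool_names:
        callables[name] = make_remote_callable(name, send)
    return callables


def add(a: int, b: int) -> int:
    """A local tool: called directly as a Python function."""
    return a + b


def fake_send(payload: str) -> str:
    """Stand-in transport that pretends a remote server multiplies a by b."""
    request = json.loads(payload)
    args = request["params"]["arguments"]
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": args["a"] * args["b"]})


tools = build_callables({"add": add}, ["multiply"], fake_send)
namespace = dict(tools)
# Agent-written code sees both tools as ordinary functions.
exec("total = add(1, 2)\nproduct = multiply(a=3, b=4)", namespace)
```

Only the `multiply` call pays the serialize/deserialize cost here; `add` never leaves the Python process, which is the overhead the description says code mode avoids for local servers.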
ae04170 to
eb52a6f
Compare
Greptile Overview
Greptile Summary
This PR implements MCP code mode (CodeAct) per RFC 003, enabling agents to write Python code blocks that directly call environment tools as functions, avoiding double-marshaling overhead for local MCP servers.
Key Changes
Alignment Assessment
This implementation correctly follows RFC 003's design for code mode and maintains the dual API boundary:
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Agent
participant InfraClient as Infrastructure<br/>(EnvClient/MCPToolClient)
participant HTTPServer as HTTP Server<br/>(http_server.py)
participant MCPEnv as MCP Environment<br/>(mcp_environment.py)
participant MCPServer as MCP Server<br/>(FastMCP)
Note over Agent,MCPServer: Production Mode: Direct MCP Access (bypasses step API)
Agent->>HTTPServer: POST /mcp<br/>{"jsonrpc":"2.0", "method":"tools/list"}
HTTPServer->>MCPEnv: _env_factory()
HTTPServer->>MCPEnv: list_tools()
MCPEnv->>MCPServer: list_tools()
MCPServer-->>MCPEnv: [tools]
MCPEnv-->>HTTPServer: [tools]
HTTPServer->>MCPEnv: close()
HTTPServer-->>Agent: {"result": {"tools": [...]}}
Note over Agent,MCPServer: Tool-Calling Mode (traditional MCP)
InfraClient->>HTTPServer: POST /step<br/>CallToolAction("add", {a:1, b:2})
HTTPServer->>MCPEnv: step(CallToolAction)
MCPEnv->>MCPEnv: _handle_call_tool()
MCPEnv->>MCPServer: call_tool("add", {a:1, b:2})
MCPServer-->>MCPEnv: result=3
MCPEnv->>MCPEnv: _compute_reward()
MCPEnv-->>HTTPServer: CallToolObservation(result=3, reward=0.0)
HTTPServer-->>InfraClient: StepResponse
Note over Agent,MCPServer: Code Mode (CodeAct pattern - local optimization)
InfraClient->>HTTPServer: POST /step<br/>CodeAction("result = add(1, 2)")
HTTPServer->>MCPEnv: step(CodeAction)
MCPEnv->>MCPEnv: execute_code()
MCPEnv->>MCPEnv: get_callables()
Note over MCPEnv: Local: Direct Python call<br/>Remote: JSON-RPC wrapper
MCPEnv->>MCPServer: Direct fn call (local)<br/>OR JSON-RPC (remote)
MCPServer-->>MCPEnv: result=3
MCPEnv->>MCPEnv: _compute_reward()
MCPEnv-->>HTTPServer: Observation(metadata={"result": 3}, reward=0.0)
HTTPServer-->>InfraClient: StepResponse
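The production-mode exchange in the diagram (a JSON-RPC `tools/list` request answered with a `tools` array) can be sketched as a small dispatcher. This is an illustrative sketch only: `handle_mcp_request` is a hypothetical function, not the handler in http_server.py, and the tool descriptor is made up. The error codes are the standard JSON-RPC 2.0 ones.

```python
from typing import Any, Dict, List


def handle_mcp_request(request: Dict[str, Any], tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Answer a JSON-RPC 2.0 request; only tools/list is handled in this sketch."""
    request_id = request.get("id")
    if request.get("jsonrpc") != "2.0":
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {"code": -32600, "message": "Invalid Request"},
        }
    if request.get("method") == "tools/list":
        return {"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": -32601, "message": "Method not found"},
    }


# Hypothetical tool descriptor, for illustration only.
example_tools = [{"name": "add", "description": "Add two integers"}]

listing = handle_mcp_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, example_tools)
unknown = handle_mcp_request({"jsonrpc": "2.0", "id": 2, "method": "tools/unknown"}, example_tools)
```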
3 files reviewed, 1 comment
src/openenv/core/mcp_client.py
Outdated
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache

result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache
duplicated code - these 3 identical blocks should be removed
Suggested change (keep a single copy of the block):
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/openenv/core/mcp_client.py
Line: 139:148
Comment:
duplicated code - these 3 identical blocks should be removed
```suggestion
result = self.step(ListToolsAction())
self._tools_cache = result.observation.tools
return self._tools_cache
```
How can I resolve this? If you propose a fix, please make it concise.
Removes accidentally duplicated lines 143-148 in list_tools() method.
db48f93 to
c36a37d
Compare
burtenshaw
left a comment
Nice work again. Just some small queries. In general, I think we could make this code a bit easier to understand for users. For example, I think the user will want a simple overview of how these methods play together. A schema like this could help:
┌─────────────────────────────────────────────────────────┐
│ HTTPEnvServer │
├─────────────────────────────────────────────────────────┤
│ Simulation Mode: │
│ /ws → OpenEnv protocol (reset/step/state) │
│ /mcp → MCP JSON-RPC (tools/list, tools/call) │
│ /reset, /step, /state → HTTP (enabled) │
├─────────────────────────────────────────────────────────┤
│ Production Mode: │
│ /ws → OpenEnv protocol (disabled/blocked) │
│ /mcp → MCP JSON-RPC (enabled) │
│ /reset, /step, /state → HTTP (disabled) │
└─────────────────────────────────────────────────────────┘
Client Classes:
GenericEnvClient → /ws (simulation only)
MCPToolClient → /mcp (production only)
Also, I think we might want to rename production to something like mcp_only to make it more obvious.
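The schema above amounts to a lookup from server mode to enabled endpoints. A minimal sketch of that table follows; `ServerMode`, `ENABLED_ENDPOINTS`, and `is_enabled` are hypothetical names chosen for illustration, not identifiers from the PR.

```python
from enum import Enum


class ServerMode(Enum):
    SIMULATION = "simulation"
    PRODUCTION = "production"  # the review suggests a name like "mcp_only"


# Which routes each mode serves, copied from the schema above.
ENABLED_ENDPOINTS = {
    ServerMode.SIMULATION: {"/ws", "/mcp", "/reset", "/step", "/state"},
    ServerMode.PRODUCTION: {"/mcp"},
}


def is_enabled(mode: ServerMode, path: str) -> bool:
    """Return True if the given route should be served in the given mode."""
    return path in ENABLED_ENDPOINTS[mode]
```

Keeping the table in one place would also make a rename such as `mcp_only` a one-line change.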
}

# Create a temporary environment for MCP access
_env = self._env_factory()
Again, is there a way to reuse an mcp environment instead of recreating every time?
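One way to reuse the environment, sketched under the assumption that a single shared instance is safe for the server's MCP requests: create it lazily on first use and close it once at shutdown. `ReusableEnv` and `FakeEnv` are hypothetical names for illustration; whether sharing is acceptable depends on how much per-request state the real environment holds.

```python
import threading
from typing import Any, Callable, Optional


class ReusableEnv:
    """Lazily create one environment and hand out the same instance afterwards."""

    def __init__(self, env_factory: Callable[[], Any]) -> None:
        self._env_factory = env_factory
        self._env: Optional[Any] = None
        self._lock = threading.Lock()  # guards creation across request threads

    def get(self) -> Any:
        with self._lock:
            if self._env is None:
                self._env = self._env_factory()
            return self._env

    def close(self) -> None:
        with self._lock:
            if self._env is not None:
                self._env.close()
                self._env = None


class FakeEnv:
    """Stand-in environment that counts how many times it was constructed."""

    created = 0

    def __init__(self) -> None:
        FakeEnv.created += 1
        self.closed = False

    def close(self) -> None:
        self.closed = True


holder = ReusableEnv(FakeEnv)
first = holder.get()
second = holder.get()  # same instance, no second construction
```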
return callables

def execute_code(self, code: str) -> Observation:
Should we break up these methods so that one function deals with the details of executing code and returns errors nicely? This should impact learning too. For example, unsloth have some nice examples here.
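A possible shape for such a helper, as a sketch only: a function that runs the code and reports either a result or a short, structured error the agent can learn from. `CodeResult` and `run_code` are hypothetical names, the convention of reading a `result` variable is an assumption, and plain `exec` is not a sandbox.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class CodeResult:
    ok: bool
    result: Any = None
    error_type: Optional[str] = None
    error_message: Optional[str] = None


def run_code(code: str, callables: Dict[str, Callable[..., Any]]) -> CodeResult:
    """Execute agent code with tools injected as names; never raise to the caller."""
    namespace: Dict[str, Any] = dict(callables)
    try:
        exec(compile(code, "<agent_code>", "exec"), namespace)
    except SyntaxError as exc:
        # Syntax errors carry a line number that is useful feedback for the agent.
        return CodeResult(
            ok=False,
            error_type="SyntaxError",
            error_message=f"line {exc.lineno}: {exc.msg}",
        )
    except Exception as exc:
        return CodeResult(ok=False, error_type=type(exc).__name__, error_message=str(exc))
    # Assumed convention: the code stores its answer in a variable named `result`.
    return CodeResult(ok=True, result=namespace.get("result"))


tools = {"add": lambda a, b: a + b}
good = run_code("result = add(1, 2)", tools)
bad_syntax = run_code("result = add(1", tools)
bad_runtime = run_code("result = 1 / 0", tools)
```

Keeping error formatting in one function means execute_code() itself only has to map a `CodeResult` onto an Observation.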
        name=action.tool_name, arguments=action.arguments
    )
    break
except:
This might be hard to debug. We could update a string with errors and log in a finally if result is None.
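The suggestion could look roughly like the sketch below: catch `Exception` rather than using a bare `except`, collect one message per failed attempt, and log them in a `finally` block if nothing succeeded. `call_with_fallbacks` and the two transports are hypothetical. The sketch tracks success with a flag instead of testing `result is None`, so that a tool legitimately returning `None` is not reported as a failure.

```python
import logging
from typing import Any, Callable, Iterable, List

logger = logging.getLogger(__name__)


def call_with_fallbacks(attempts: Iterable[Callable[..., Any]], **arguments: Any) -> Any:
    """Try each callable in order; log every collected error if none succeeds."""
    result = None
    succeeded = False
    errors: List[str] = []
    try:
        for attempt in attempts:
            try:
                result = attempt(**arguments)
                succeeded = True
                break
            except Exception as exc:  # not a bare except: KeyboardInterrupt still propagates
                name = getattr(attempt, "__name__", repr(attempt))
                errors.append(f"{name}: {type(exc).__name__}: {exc}")
    finally:
        if not succeeded:
            logger.warning("No attempt succeeded: %s", "; ".join(errors) or "no attempts made")
    return result


def broken_transport(**arguments: Any) -> Any:
    raise ConnectionError("server unreachable")


def working_transport(**arguments: Any) -> Any:
    return arguments["a"] + arguments["b"]


value = call_with_fallbacks([broken_transport, working_transport], a=1, b=2)
missing = call_with_fallbacks([broken_transport], a=1, b=2)
```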
Tool result.
"""
# For mocks, call directly
if (
Does this logic not require parentheses to open?
if (
hasattr(server, "list_tools")
and (not hasattr(type(server), "__name__") or "FastMCP" not in type(server).__name__)
):
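Why the inner parentheses matter: in Python `not` binds tighter than `and`, which binds tighter than `or`, so without grouping the condition parses as `(A and not B) or C`. The sketch below uses plain booleans in place of the `hasattr` and name checks to show a case where the two readings disagree; the function names are illustrative only.

```python
def ungrouped(a: bool, b: bool, c: bool) -> bool:
    # Parsed as (a and (not b)) or c
    return a and not b or c


def grouped(a: bool, b: bool, c: bool) -> bool:
    # The form in the snippet above: a and ((not b) or c)
    return a and (not b or c)


# a=False stands for "server has no list_tools"; c=True for "not a FastMCP type".
without_parens = ungrouped(False, True, True)
with_parens = grouped(False, True, True)
```

With these inputs the ungrouped form is true while the grouped form is false, so an object lacking `list_tools` would wrongly be treated as a mock without the parentheses.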
Summary
Implements MCP code mode (CodeAct) per RFC 003 and GitHub issue #347. This enables agents to write Python code blocks that directly call environment tools as functions, avoiding the double-marshaling overhead (Python → JSON-RPC → Python) that occurs with local MCP servers when using tool-calling mode.
- Add execute_code() method to MCPEnvironment for executing Python code with tools injected as direct callables
- Add get_callables() method for extracting Python functions from FastMCP servers
- Add HTTP /mcp endpoint for production mode inference (bypasses step() API)
- Add use_production_mode option to MCPToolClient for direct MCP endpoint access
Test plan
Closes #347