
Commit 616e24f

MANSURA HABIBA authored and committed
fix(mcp): make run_tool wait_for timeout configurable, raise floor to 180s
The hardcoded `timeout=30.0` on `asyncio.wait_for(session.call_tool(...))` in both `MCPSseClient.run_tool` and `MCPStreamableHttpClient.run_tool` was truncating in-flight tool calls whose servers needed longer than 30s to respond, most visibly large MCP composers whose tool responses include sizeable catalogs.

Failure mode:

  LLM → AgentExecutor → run_tool (1 call from the LLM)
    ├── attempt 1: get_session (probe ..._1 fails → swap ..._2)
    │     → call_tool → 30s wait_for fires, TimeoutError raised
    │       (server is still working; response ignored on arrival)
    └── attempt 2: get_session (probe ..._2 fails → swap ..._3)
          → call_tool → 30s wait_for fires again
          → util.py dedup ("Repeated TimeoutError, not retrying")
          → ValueError("Maximum retries exceeded ...")
          ← AgentExecutor surfaces error → flow fails

The retry loop's TimeoutError dedup correctly stops looping once it sees the same error twice, but every call against a slow server hit the same ceiling and never had a chance to succeed. The session swaps between attempts also produced excess HTTP churn against the server.

Both `run_tool` methods now resolve the timeout from `get_settings_service().settings.mcp_server_timeout` (already used for connection setup at lines 1617 / 1887, env: `LANGFLOW_MCP_SERVER_TIMEOUT`) with a `max(setting, 180.0)` floor so default deployments don't end up *shorter* than the previous hardcoded 30s (the Pydantic default for `mcp_server_timeout` is 20).

Touched: src/lfx/src/lfx/base/mcp/util.py
  - L1677, L1689 (stdio run_tool)
  - L1960, L1972 (streamable HTTP run_tool)

Unchanged on purpose:
  - L1217, L1381 session-creation `timeout=30.0` (separate concern).
  - L1060 `_validate_session_connectivity` 3.0s probe (separate concern; addressed in follow-up if probe thrash persists).

Tuning:
  - LANGFLOW_MCP_SERVER_TIMEOUT now governs both connection setup and tool-call wait_for. The 180s floor clamps low values; raise the env var above 180 to take effect.

Repro / verification:
  - Trace run that previously failed at 67.08s with "Maximum retries exceeded with repeated TimeoutError errors" against `mcp-guardium_get_service_info` (47KB response payload) now completes on attempt 1.
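The clamping rule described under "Tuning" can be illustrated in isolation. This is a minimal sketch, not the code in `util.py`: `resolve_tool_timeout` and `FLOOR_SECONDS` are hypothetical names introduced here, and the configured value is passed in directly rather than read from the settings service.

```python
# Hypothetical helper illustrating the max(setting, 180.0) floor from the commit.
FLOOR_SECONDS = 180.0  # floor value stated in the commit message


def resolve_tool_timeout(configured: float, floor: float = FLOOR_SECONDS) -> float:
    """Return the effective tool-call timeout: the configured value, never below the floor."""
    return max(float(configured), floor)


if __name__ == "__main__":
    # 20 is the Pydantic default mentioned in the commit; 300 is an arbitrary override.
    for configured in (20, 30, 180, 300):
        print(f"configured={configured}s -> effective={resolve_tool_timeout(configured)}s")
```

Under this rule, any configured value at or below 180 yields the same 180s effective timeout, so only values above 180 change the tool-call behavior.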
1 parent 4b46b57 commit 616e24f

1 file changed

Lines changed: 8 additions & 2 deletions


src/lfx/src/lfx/base/mcp/util.py

@@ -1672,6 +1672,9 @@ async def run_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
             param_hash = uuid.uuid4().hex[:8]
             self._session_context = f"default_{param_hash}"
 
+        # Tool-call timeout: env LANGFLOW_MCP_SERVER_TIMEOUT (via settings), with a 180s
+        # floor so default deployments aren't shorter than the previous hardcoded 30s.
+        timeout = max(get_settings_service().settings.mcp_server_timeout, 180.0)
         max_retries = 2
         last_error_type = None
 
@@ -1683,7 +1686,7 @@ async def run_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
 
                 result = await asyncio.wait_for(
                     session.call_tool(tool_name, arguments=arguments),
-                    timeout=30.0,  # 30 second timeout
+                    timeout=timeout,
                 )
             except Exception as e:
                 current_error_type = type(e).__name__
@@ -1952,6 +1955,9 @@ async def run_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
             param_hash = uuid.uuid4().hex[:8]
             self._session_context = f"default_http_{param_hash}"
 
+        # Tool-call timeout: env LANGFLOW_MCP_SERVER_TIMEOUT (via settings), with a 180s
+        # floor so default deployments aren't shorter than the previous hardcoded 30s.
+        timeout = max(get_settings_service().settings.mcp_server_timeout, 180.0)
         max_retries = 2
         last_error_type = None
 
@@ -1963,7 +1969,7 @@ async def run_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
 
                 result = await asyncio.wait_for(
                     session.call_tool(tool_name, arguments=arguments),
-                    timeout=30.0,  # 30 second timeout
+                    timeout=timeout,
                 )
             except Exception as e:
                 current_error_type = type(e).__name__
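
The interaction between `asyncio.wait_for` and the repeated-error dedup can be reproduced at small scale. The sketch below is a simplified approximation of the retry loop visible in the hunks above, not the real method: `slow_tool` is a hypothetical stand-in for `session.call_tool`, session handling is omitted, and the timings are scaled down to fractions of a second.

```python
import asyncio


async def slow_tool(delay: float) -> str:
    """Hypothetical stand-in for session.call_tool on a slow MCP server."""
    await asyncio.sleep(delay)
    return "ok"


async def run_tool_sketch(delay: float, timeout: float, max_retries: int = 2) -> str:
    """Simplified retry loop: stop as soon as the same error type occurs twice in a row."""
    last_error_type = None
    for _attempt in range(max_retries):
        try:
            # wait_for cancels the in-flight call when the timeout elapses,
            # so a late response from the server is never seen.
            return await asyncio.wait_for(slow_tool(delay), timeout=timeout)
        except Exception as e:  # broad catch mirrors the diff context
            current_error_type = type(e).__name__
            if current_error_type == last_error_type:
                break  # repeated error: retrying would hit the same ceiling
            last_error_type = current_error_type
    msg = f"Maximum retries exceeded with repeated {last_error_type} errors"
    raise ValueError(msg)


if __name__ == "__main__":
    # Timeout longer than the server's response time: succeeds on the first attempt.
    print(asyncio.run(run_tool_sketch(delay=0.05, timeout=0.5)))
    # Timeout shorter than the response time: both attempts time out.
    try:
        asyncio.run(run_tool_sketch(delay=0.2, timeout=0.02))
    except ValueError as err:
        print(err)
```

With a timeout below the server's response time, every attempt fails identically, which is why raising the ceiling (rather than adding retries) is the fix the commit applies.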
