Member

vblagoje commented Jan 9, 2026

Why

Agents need tools that share data through state. MCPTool already supports inputs_from_state and outputs_to_state for automatic parameter injection and output handling, but MCPToolset did not expose these options, which broke Agent workflows that rely on state propagation across multiple MCP tools.

This PR completes MCP's Agent integration story. MCPToolset now dynamically configures state mappings for each discovered tool, enabling the same seamless state-based coordination that MCPTool provides.

What

Added per-tool state configuration to MCPToolset:

  • Added inputs_from_state, outputs_to_state, outputs_to_string parameters (dict mapping tool names to configs)
  • _serialize_state_config() / _deserialize_state_config() helpers handle both nested (outputs_to_state) and flat (outputs_to_string) formats
  • _validate_state_configs() warns on unknown tool names
  • ValueError from Tool parameter validation bubbles up (Haystack >=2.22.0)
  • Updated dependency to haystack-ai>=2.19.0 (supports state params)
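
For illustration, the unknown-tool warning could look roughly like the sketch below. This is not the PR's `_validate_state_configs()` code; the function name `find_unknown_tools` and its signature are hypothetical and use only the standard library.

```python
import logging
from typing import Any, Dict, Iterable, List, Optional

logger = logging.getLogger(__name__)


def find_unknown_tools(
    available_tools: Iterable[str],
    **state_configs: Optional[Dict[str, Any]],
) -> List[str]:
    """Warn about, and return, configured tool names that were not discovered."""
    known = set(available_tools)
    unknown: List[str] = []
    for config_name, config in state_configs.items():
        # A config that is None or empty simply contributes nothing.
        for tool_name in config or {}:
            if tool_name not in known:
                logger.warning(
                    "%s references unknown tool '%s'; known tools: %s",
                    config_name,
                    tool_name,
                    sorted(known),
                )
                unknown.append(tool_name)
    return unknown


# Hypothetical usage: "git_blame" was never discovered on the server.
unknown = find_unknown_tools(
    ["git_status", "git_diff"],
    inputs_from_state={
        "git_status": {"repository": "repo_path"},
        "git_blame": {"repository": "repo_path"},
    },
    outputs_to_state=None,
)
```

Warning instead of raising keeps a toolset usable when a server stops advertising a tool that an older config still mentions.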

How can it be used

from haystack_integrations.tools.mcp import MCPToolset, StdioServerInfo


def format_diff(diff):
    # Placeholder handler for this example (not part of the PR): return the diff as plain text
    return str(diff)


# Create toolset with per-tool state configuration
toolset = MCPToolset(
    server_info=StdioServerInfo(command="uvx", args=["mcp-server-git"]),
    tool_names=["git_status", "git_diff", "git_log"],

    # Map state keys to tool parameters
    inputs_from_state={
        "git_status": {"repository": "repo_path"},
        "git_diff": {"repository": "repo_path"},
        "git_log": {"repository": "repo_path"},
    },

    # Map tool outputs to state keys
    outputs_to_state={
        "git_status": {"status_result": {"source": "status"}},
        "git_diff": {"diff_result": {}},
    },

    # Convert outputs to strings for LLM
    outputs_to_string={
        "git_diff": {"source": "diff", "handler": format_diff},
    },
)
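
To make the two mappings concrete, here is a simplified, standalone model of how they are meant to behave. It is a sketch under assumptions, not Haystack's implementation: `inject_inputs` and `store_outputs` are hypothetical names, arguments supplied by the LLM are assumed to take precedence over state, and an entry without `source` is assumed to store the whole tool result.

```python
from typing import Any, Dict


def inject_inputs(
    state: Dict[str, Any],
    mapping: Dict[str, str],
    llm_args: Dict[str, Any],
) -> Dict[str, Any]:
    """Fill tool parameters from state without overriding LLM-supplied arguments."""
    args = dict(llm_args)
    for state_key, param_name in mapping.items():
        if param_name not in args and state_key in state:
            args[param_name] = state[state_key]
    return args


def store_outputs(
    state: Dict[str, Any],
    mapping: Dict[str, Dict[str, Any]],
    result: Dict[str, Any],
) -> None:
    """Copy a tool result into state; without "source" the whole result is stored."""
    for state_key, config in mapping.items():
        source = config.get("source")
        state[state_key] = result[source] if source is not None else result


state = {"repository": "/tmp/project"}

# State key "repository" becomes the tool parameter "repo_path".
args = inject_inputs(state, {"repository": "repo_path"}, llm_args={})

# With "source", only that key of the result is stored.
store_outputs(state, {"status_result": {"source": "status"}}, {"status": "clean"})

# With an empty config, the whole result is stored.
store_outputs(state, {"diff_result": {}}, {"diff": "+1 -1"})
```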

How did you test it

  • 3 unit tests: state config creation, serialization roundtrip, unknown tool warning
  • 1 validation test: ValueError on invalid parameter (skipped on Haystack <2.22.0)
  • 6 helper tests: serialize/deserialize for both outputs_to_state and outputs_to_string formats
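
As a rough picture of what the helper tests exercise, the sketch below round-trips both formats. It is not the PR's code: the helpers here are stand-ins written with the standard library only, the callable-to-path conversion is an assumed approach, and `json.dumps` serves as a sample handler purely because it is importable by path.

```python
import importlib
import json
from typing import Any, Callable, Dict, Optional


def callable_to_path(fn: Callable) -> str:
    """Turn an importable callable into a dotted path string."""
    return f"{fn.__module__}.{fn.__qualname__}"


def path_to_callable(path: str) -> Callable:
    """Resolve a dotted path string back into the callable."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def _convert_entry(entry: Any, convert: Callable) -> Any:
    """Convert the "handler" value of a single {source, handler} entry, if present."""
    if not isinstance(entry, dict):
        return entry
    out = dict(entry)
    if "handler" in out:
        out["handler"] = convert(out["handler"])
    return out


def _convert_config(config: Optional[Dict[str, Any]], convert: Callable) -> Optional[Dict[str, Any]]:
    if not config:
        return None
    result: Dict[str, Any] = {}
    for tool_name, tool_config in config.items():
        if "source" in tool_config or "handler" in tool_config:
            # Flat format (outputs_to_string): the entry sits directly under the tool name.
            result[tool_name] = _convert_entry(tool_config, convert)
        else:
            # Nested format (outputs_to_state): one entry per state key.
            result[tool_name] = {k: _convert_entry(v, convert) for k, v in tool_config.items()}
    return result


def serialize_state_config(config: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    return _convert_config(config, callable_to_path)


def deserialize_state_config(config: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    return _convert_config(config, path_to_callable)


config = {"git_diff": {"source": "diff", "handler": json.dumps}}
serialized = serialize_state_config(config)
restored = deserialize_state_config(serialized)
```

One limitation of this key-based detection: a nested config whose state key is literally named `source` or `handler` would be read as flat, so any handler inside it would not be converted.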

Notes for the reviewer

Parameter validation behavior:

  • Haystack >=2.22.0: ValueError raised at tool creation time for invalid parameter names
  • Haystack 2.19.0-2.21.x: Invalid parameters fail at runtime during tool invocation
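
A test that depends on the creation-time ValueError needs some way to tell the two version ranges apart. The helper below is one possible check, shown only as a sketch: `validates_at_creation` is a hypothetical name, and it compares only the major and minor components of a version string.

```python
def validates_at_creation(haystack_version: str) -> bool:
    """Return True for versions in the >=2.22 range described above."""
    major, minor = (int(part) for part in haystack_version.split(".")[:2])
    return (major, minor) >= (2, 22)


# Sample version strings on either side of the boundary.
checks = {v: validates_at_creation(v) for v in ("2.19.0", "2.21.3", "2.22.0", "3.0.0")}
```

A check like this could back a `pytest.mark.skipif` condition for the validation test.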

github-actions bot added the integration:mcp and type:documentation labels Jan 9, 2026
vblagoje marked this pull request as ready for review January 9, 2026 13:56
vblagoje requested a review from a team as a code owner January 9, 2026 13:56
vblagoje requested review from mpangrazzi and sjrl and removed request for a team January 9, 2026 13:56
    # Map tool outputs to state keys
    outputs_to_state={
        "git_status": {"status_result": {"source": "status"}},
        "git_diff": {"diff_result": {}},
Contributor

Could we add a dev comment here explaining why one might provide an empty dict here?

Contributor

sjrl commented Jan 12, 2026

@vblagoje looking good! I wanted to ask if you could add an integration test using these configs to see if everything works as expected?

Contributor

mpangrazzi left a comment

Already looks good! I've added a few comments. +1 on adding an integration test as @sjrl mentioned.

:param config: The state configuration dictionary to serialize
:returns: The serialized configuration dictionary, or None if empty
"""
if not config:
Contributor

More of a type-checker thing: the function accepts and handles None correctly, but the type hint says it only accepts dict.


# Check if this is outputs_to_string format (flat with optional source/handler)
# or outputs_to_state format (nested with state keys)
if "source" in tool_config or "handler" in tool_config:
Contributor

Edge case: what if someone uses source or handler as keys in state? Maybe worth documenting?

assert isinstance(toolset_dict["data"]["outputs_to_string"]["add"]["handler"], str)

# Test deserialization
with patch("haystack_integrations.tools.mcp.mcp_toolset.MCPToolset.__init__", return_value=None) as mock_init:
Contributor

I may be missing something, but why not do a full roundtrip of to_dict() / from_dict() here (so without mocking)?

Development

Successfully merging this pull request may close these issues.

Add inputs_from_state and outputs_to_state support to MCPToolset
