Add PowershellToolsMixin for direct PowerShell execution in agents #333

@itomek

Summary

Add a PowershellToolsMixin class that provides direct PowerShell command execution tools to GAIA agents, bypassing the MCP server subprocess layer. Currently, PowerShell execution goes through the Windows MCP server (mcp_windows_Shell tool via MCPClientMixin), which adds subprocess + IPC overhead for every command invocation. A native mixin would eliminate this overhead and provide tighter integration with the agent tool system.

Motivation

The current approach for PowerShell execution routes commands through an MCP server:

Agent -> MCPClientMixin -> stdio transport -> MCP server subprocess -> PowerShell

This adds latency from JSON-RPC serialization, subprocess IPC, and MCP protocol handshakes on every tool call. A PowershellToolsMixin would provide a direct execution path:

Agent -> PowershellToolsMixin -> PowerShell (subprocess.run / subprocess.Popen)

This follows the same pattern established by ShellToolsMixin in src/gaia/agents/chat/tools/shell_tools.py and CLIToolsMixin in src/gaia/agents/code/tools/cli_tools.py, which execute shell commands directly via subprocess rather than going through an intermediary server.

Files to Reference

  • src/gaia/agents/base/tools.py - @tool decorator and _TOOL_REGISTRY for tool registration
  • src/gaia/agents/chat/tools/shell_tools.py - ShellToolsMixin pattern (rate limiting, whitelist, subprocess.run)
  • src/gaia/agents/code/tools/cli_tools.py - CLIToolsMixin pattern (background processes, error detection, timeout management)
  • src/gaia/mcp/mixin.py - MCPClientMixin (current MCP-based approach this aims to complement)
  • src/gaia/agents/base/agent.py - Base Agent class to integrate with
  • docs/plans/cua.mdx - Computer Use Agent roadmap (CUA will benefit from direct PowerShell tools)

Proposed Location

src/gaia/agents/base/powershell_tools.py (or src/gaia/agents/chat/tools/powershell_tools.py if scoped to ChatAgent initially)

Acceptance Criteria

Core Tools (registered via @tool decorator)

  • run_powershell_command - Execute a single PowerShell command string, returning stdout/stderr/return_code. Should support timeout parameter and working directory override.
  • run_powershell_script - Execute a multi-line PowerShell script (passed as string), with support for script parameters via -ArgumentList. Useful for complex operations that span multiple commands.
  • run_powershell_json - Execute a PowerShell command and automatically parse JSON output. Internally appends | ConvertTo-Json -Depth 10 (or similar) to the command, parses the result, and returns structured Python data. This is critical for system queries like Get-Process, Get-Service, Get-ComputerInfo, etc.

Integration Requirements

  • Follows the mixin pattern: class with __init__ calling super().__init__(*args, **kwargs) and a register_powershell_tools() method (matching ShellToolsMixin.register_shell_tools() and CLIToolsMixin.register_cli_tools() conventions)
  • Tools registered via the @tool decorator from src/gaia/agents/base/tools.py into _TOOL_REGISTRY
  • Compatible with the base Agent class via multiple inheritance (e.g., class MyAgent(Agent, PowershellToolsMixin))
  • Works alongside MCPClientMixin without conflicts (agents may use both)
  • CUA integration: The Computer Use Agent (docs/plans/cua.mdx) should be able to include this mixin for direct PowerShell execution alongside MCP-based desktop control tools

Error Handling & Safety

  • Timeout management with configurable default (e.g., 30 seconds)
  • Graceful handling of PowerShell not being available (non-Windows platforms should return a clear error, not crash)
  • Execution policy awareness: use -ExecutionPolicy Bypass for script execution or document the requirement
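  • Proper encoding handling for PowerShell output (UTF-8, e.g. by setting [Console]::OutputEncoding inside the command; note that $OutputEncoding is a preference variable, not a powershell.exe command-line switch)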
  • Structured error output: capture PowerShell $Error / $LASTEXITCODE and map to GAIA's {"status": "error", "error": ..., "has_errors": True} pattern used by existing tools
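One way the error-handling requirements above might fit together is sketched below. The function name `run_powershell`, the `"success"` status value, and the extra `stdout`/`return_code` keys are assumptions for illustration; only the `{"status": "error", "error": ..., "has_errors": True}` shape comes from this issue.

```python
import shutil
import subprocess
from typing import Optional

# Forces UTF-8 console output so Python can decode stdout reliably.
_UTF8_PREFIX = "[Console]::OutputEncoding = [System.Text.Encoding]::UTF8; "


def _error(message: str, **extra) -> dict:
    """Build an error result in the shape used by existing GAIA tools."""
    return {"status": "error", "error": message, "has_errors": True, **extra}


def run_powershell(command: str, timeout: int = 30, cwd: Optional[str] = None) -> dict:
    """Run one PowerShell command and return a structured result dict."""
    exe = shutil.which("powershell.exe") or shutil.which("pwsh")
    if exe is None:
        # Typical on Linux/macOS without PowerShell Core: report, do not crash.
        return _error("PowerShell not found on PATH")
    try:
        proc = subprocess.run(
            [exe, "-NoProfile", "-NonInteractive", "-Command", _UTF8_PREFIX + command],
            capture_output=True,
            encoding="utf-8",
            errors="replace",
            timeout=timeout,
            cwd=cwd,
        )
    except subprocess.TimeoutExpired:
        return _error(f"Timed out after {timeout}s")
    except OSError as exc:
        return _error(f"Failed to start PowerShell: {exc}")
    if proc.returncode != 0:
        message = proc.stderr.strip() or f"Exit code {proc.returncode}"
        return _error(message, stdout=proc.stdout, return_code=proc.returncode)
    return {
        "status": "success",  # assumption: GAIA's success shape may differ
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "return_code": proc.returncode,
        "has_errors": False,
    }
```

A non-zero exit code is treated as the error signal here; mapping individual `$Error` records would need extra PowerShell-side code that this sketch leaves out.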

Structured Output Parsing

  • ConvertTo-Json integration for automatic JSON parsing of PowerShell object output
  • Handle ConvertTo-Json depth limits and truncation gracefully
  • Support for ConvertTo-Csv and Format-List as alternative output formats, each with a matching parser on the Python side
  • Return raw string output when structured parsing is not requested or fails
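The parse-or-fall-back behavior could look something like the sketch below. The function name and result keys are hypothetical. Wrapping a lone object in a list is a design choice made here because ConvertTo-Json emits a bare object, not a one-element array, when the pipeline yields a single item.

```python
import json
from typing import Any


def parse_powershell_json(stdout: str) -> dict:
    """Parse ConvertTo-Json output, keeping the raw text as a fallback."""
    text = stdout.strip().lstrip("\ufeff")  # drop whitespace and a stray BOM
    if not text:
        # An empty pipeline produces no output at all.
        return {"data": [], "raw": stdout, "parsed": True}
    try:
        data: Any = json.loads(text)
    except json.JSONDecodeError as exc:
        # Parsing failed: hand back the raw string instead of raising.
        return {"data": None, "raw": stdout, "parsed": False, "parse_error": str(exc)}
    if isinstance(data, dict):
        data = [data]  # normalize a single object to a one-element list
    return {"data": data, "raw": stdout, "parsed": True}
```

For example, `parse_powershell_json('{"Name": "svc"}')["data"]` yields `[{"Name": "svc"}]`, while non-JSON text comes back with `parsed` set to `False` and the original string under `raw`.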

Testing

  • Unit tests with mocked subprocess calls (runs on all platforms in CI)
  • Tests for each tool: run_powershell_command, run_powershell_script, run_powershell_json
  • Tests for error conditions: timeout, PowerShell not found, invalid script syntax, JSON parse failure
  • Tests for mixin integration: verify tools appear in _TOOL_REGISTRY after register_powershell_tools()
  • Platform guard tests: verify graceful behavior on non-Windows (Linux/macOS CI)
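A minimal sketch of what the mocked-subprocess tests might look like, using only the standard library. The `run_powershell_command` function here is a compact hypothetical stand-in for the real tool so the example is self-contained; the actual tests would import the tool from the mixin module instead.

```python
import subprocess
import unittest
from unittest import mock


def run_powershell_command(command: str, timeout: int = 30) -> dict:
    """Hypothetical stand-in for the tool under test."""
    try:
        proc = subprocess.run(
            ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", command],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError:
        return {"status": "error", "error": "PowerShell not found", "has_errors": True}
    except subprocess.TimeoutExpired:
        return {"status": "error", "error": "timeout", "has_errors": True}
    return {"status": "success", "stdout": proc.stdout, "return_code": proc.returncode}


class RunPowershellCommandTests(unittest.TestCase):
    @mock.patch("subprocess.run")
    def test_success_returns_stdout(self, fake_run):
        fake_run.return_value = subprocess.CompletedProcess(
            args=[], returncode=0, stdout="ok\n", stderr=""
        )
        result = run_powershell_command("Write-Output ok")
        self.assertEqual(result["status"], "success")
        self.assertEqual(result["stdout"], "ok\n")
        # The command string should be the last argv element.
        argv = fake_run.call_args[0][0]
        self.assertEqual(argv[-1], "Write-Output ok")

    @mock.patch("subprocess.run", side_effect=subprocess.TimeoutExpired("powershell.exe", 1))
    def test_timeout_maps_to_error(self, fake_run):
        result = run_powershell_command("Start-Sleep 5", timeout=1)
        self.assertTrue(result["has_errors"])
        self.assertEqual(result["error"], "timeout")

    @mock.patch("subprocess.run", side_effect=FileNotFoundError())
    def test_missing_binary_maps_to_error(self, fake_run):
        # Simulates a Linux/macOS CI runner without PowerShell installed.
        result = run_powershell_command("Get-Date")
        self.assertEqual(result["status"], "error")
```

Because `subprocess.run` is patched, none of these tests launch PowerShell, so they should pass on any CI platform when run with `python -m unittest`.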

Technical Notes

PowerShell Invocation Pattern

The mixin should invoke PowerShell via subprocess.run or subprocess.Popen with:

["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", command_string]

Or for scripts:

["powershell.exe", "-NoProfile", "-NonInteractive", "-ExecutionPolicy", "Bypass", "-File", script_path]

For cross-platform compatibility (PowerShell Core), also support pwsh as an alternative binary.
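The two invocation patterns and the pwsh fallback could be assembled along these lines. The helper names are hypothetical, and the lookup order (Windows PowerShell first, then PowerShell Core) is an assumption, not something this issue specifies.

```python
import shutil
from typing import List, Optional, Sequence


def resolve_powershell(candidates: Sequence[str] = ("powershell.exe", "pwsh")) -> Optional[str]:
    """Return the first PowerShell binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command_argv(exe: str, command: str) -> List[str]:
    """Argument vector for a single command string."""
    return [exe, "-NoProfile", "-NonInteractive", "-Command", command]


def build_script_argv(exe: str, script_path: str, script_args: Sequence[str] = ()) -> List[str]:
    """Argument vector for a script file; trailing items are passed to the script."""
    return [
        exe,
        "-NoProfile",
        "-NonInteractive",
        "-ExecutionPolicy",
        "Bypass",
        "-File",
        script_path,
        *script_args,
    ]
```

Passing the argument vector as a list (rather than one shell string) to `subprocess.run` avoids a second layer of shell quoting around the PowerShell command.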

Structured Output Example

# Agent calls run_powershell_json("Get-Process | Select-Object Name, CPU, WorkingSet")
# Internally executes:
#   powershell.exe -NoProfile -NonInteractive -Command "Get-Process | Select-Object Name, CPU, WorkingSet | ConvertTo-Json -Depth 5"
# Returns parsed Python list of dicts
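The "internally executes" step amounts to appending a ConvertTo-Json stage to the caller's pipeline. A small sketch, with a hypothetical helper name and an assumed depth cap of 100:

```python
def to_json_command(command: str, depth: int = 10) -> str:
    """Append a ConvertTo-Json stage to a PowerShell pipeline string."""
    # Validate depth so only a plain integer is interpolated into the command.
    if isinstance(depth, bool) or not isinstance(depth, int) or not 1 <= depth <= 100:
        raise ValueError("depth must be an integer between 1 and 100")
    # Drop trailing whitespace and semicolons so the pipe attaches to the
    # final statement instead of producing a syntax error.
    base = command.rstrip().rstrip(";").rstrip()
    # -Compress asks for single-line JSON without indentation.
    return f"{base} | ConvertTo-Json -Depth {depth} -Compress"
```

If the command string contains several statements, the appended stage only applies to the last one, so callers with multi-statement input would need a different wrapping strategy.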

Comparison with MCP Approach

| Aspect | MCPClientMixin (current) | PowershellToolsMixin (proposed) |
|---|---|---|
| Latency per call | ~100-200ms (IPC + JSON-RPC) | ~10-50ms (direct subprocess) |
| Setup overhead | MCP server process must be running | None (direct invocation) |
| Structured output | Manual parsing | Built-in ConvertTo-Json support |
| Cross-platform | Depends on MCP server | Falls back gracefully |
| Tool discovery | Dynamic from MCP server | Static, registered at init |
