feat(mute-agent): port production Mute Agent infrastructure from internal#69
…rnal repo

New modules (4 packages, 21 files, ~4,800 lines):

core/ (3 new files):
- execution_agent.py (164L): Execution agent for governed tool invocations
- reasoning_agent.py (236L): Reasoning agent with chain-of-thought governance
- handshake_protocol.py (199L): Negotiation protocol between reasoning/execution

knowledge_graph/ (4 files):
- graph_elements.py (63L): Node/Edge/Fact primitives
- multidimensional_graph.py (168L): Multi-dimensional knowledge graph
- subgraph.py (222L): Subgraph extraction and traversal

listener/ (10 files):
- listener.py (608L): Event-driven governance listener
- state_observer.py (434L): Real-time agent state observation
- threshold_config.py (311L): Configurable governance thresholds
- adapters/: 5 layer adapters (base, caas, control_plane, iatp, scak)

super_system/ (2 files):
- router.py (202L): Super-system routing for multi-agent governance

visualization/ (2 files):
- graph_debugger.py (495L): Knowledge graph visualization and debugging

All 112 existing tests pass, 0 regressions.

Closes #62

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Pull request overview
Ports the production “Mute Agent” governance infrastructure into the public agent-os module implementation, adding a Listener-based observation/intervention loop, a multidimensional knowledge graph, routing, adapters, and visualization tooling.
Changes:
- Adds Layer 5 Listener Agent, StateObserver metrics collection, and configurable threshold-based interventions.
- Introduces multidimensional knowledge graph primitives + subgraph pruning and a SuperSystemRouter for action-space routing.
- Adds GraphDebugger utilities for generating HTML/PNG execution-trace visualizations.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 22 comments.
| File | Description |
|---|---|
| packages/agent-os/modules/mute-agent/src/visualization/graph_debugger.py | New graph execution-trace visualization (HTML/PNG) and comparison views. |
| packages/agent-os/modules/mute-agent/src/visualization/__init__.py | Exposes visualization public API symbols. |
| packages/agent-os/modules/mute-agent/src/super_system/router.py | Context normalization + dimension routing and action-space intersection. |
| packages/agent-os/modules/mute-agent/src/super_system/__init__.py | Super-system package marker. |
| packages/agent-os/modules/mute-agent/src/listener/threshold_config.py | Threshold types, intervention levels, rules, and default configs. |
| packages/agent-os/modules/mute-agent/src/listener/state_observer.py | Metrics sampling, derived metrics, anomaly detection, graph snapshotting. |
| packages/agent-os/modules/mute-agent/src/listener/listener.py | Background observer loop + threshold evaluation + intervention wiring. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/base_adapter.py | Base adapter abstraction + health checks for lower-layer integration. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/scak_adapter.py | Mocked intelligence-layer adapter interface. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/iatp_adapter.py | Mocked security/trust-layer adapter interface. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/caas_adapter.py | Mocked context-layer adapter interface. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/control_plane_adapter.py | Mocked control-plane adapter for orchestration/queues/lifecycle. |
| packages/agent-os/modules/mute-agent/src/listener/adapters/__init__.py | Exposes adapter symbols. |
| packages/agent-os/modules/mute-agent/src/listener/__init__.py | Exposes Listener public API symbols. |
| packages/agent-os/modules/mute-agent/src/knowledge_graph/graph_elements.py | Node/Edge primitives and enums. |
| packages/agent-os/modules/mute-agent/src/knowledge_graph/subgraph.py | Dimension/subgraph implementation + dependency traversal + pruning. |
| packages/agent-os/modules/mute-agent/src/knowledge_graph/multidimensional_graph.py | Multi-dimension graph container + pruning/validation helpers. |
| packages/agent-os/modules/mute-agent/src/knowledge_graph/__init__.py | Knowledge-graph package marker. |
| packages/agent-os/modules/mute-agent/src/core/handshake_protocol.py | Handshake state machine and session tracking for proposals/execution. |
| packages/agent-os/modules/mute-agent/src/core/reasoning_agent.py | Routing + proposal validation against graph constraints/dependencies. |
| packages/agent-os/modules/mute-agent/src/core/execution_agent.py | Executes accepted proposals via registered action handlers. |

    if not VISUALIZATION_AVAILABLE:
        print("Cannot generate visualization: libraries not installed")
        return ""

    if format == "html":
        return self._generate_interactive_html(trace, output_path)
    elif format == "png":
        return self._generate_static_png(trace, output_path)
    else:
        raise ValueError(f"Unsupported format: {format}")

visualize_trace() only checks VISUALIZATION_AVAILABLE, but HTML output requires pyvis and PNG output requires matplotlib. In environments where only one of these optional deps is installed, calling the other format will raise at runtime (e.g., format='png' with no matplotlib). Consider validating per-format availability (or raising a clear error) before calling _generate_interactive_html / _generate_static_png.
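
One way to validate availability per format, as the comment suggests, is sketched below. It is a standalone illustration, not the code in graph_debugger.py: the helper name `require_format_dependency` and the `_FORMAT_REQUIREMENTS` mapping are hypothetical, and the format-to-package pairing (pyvis for HTML, matplotlib for PNG) is taken from the comment above.

```python
import importlib.util
from typing import Mapping

# Hypothetical mapping from output format to the optional package it needs,
# following the review comment (HTML -> pyvis, PNG -> matplotlib).
_FORMAT_REQUIREMENTS = {"html": "pyvis", "png": "matplotlib"}


def require_format_dependency(
    fmt: str, requirements: Mapping[str, str] = _FORMAT_REQUIREMENTS
) -> None:
    """Raise a clear error if `fmt` is unknown or its optional package is missing."""
    if fmt not in requirements:
        raise ValueError(f"Unsupported format: {fmt}")
    package = requirements[fmt]
    # find_spec returns None for a top-level package that cannot be imported.
    if importlib.util.find_spec(package) is None:
        raise RuntimeError(
            f"Cannot generate {fmt!r} visualization: optional dependency "
            f"{package!r} is not installed"
        )
```

Calling such a check at the top of visualize_trace() would turn a late ImportError or NameError into an error that names the missing package, while unsupported formats still raise ValueError as they do today.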

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional, Set, Any
    from enum import Enum
    from datetime import datetime
    import os

Unused imports here (e.g., os, Set, matplotlib.patches as mpatches) will fail ruff (F401) under the repo’s lint config. Please remove them or use them.

    # Perform intervention
    if self.config.auto_intervention:
        self._perform_intervention(
            triggered_rules,
            intervention_level,
            observation,
        )

    # Return to observation (or recovery)
    if self._state == ListenerState.INTERVENING:
        self._set_state(ListenerState.RECOVERING)
        # In recovery mode, continue observation with heightened awareness
        self._recovery_check()

With auto_intervention=False, the listener transitions to EVALUATING when thresholds trigger but never transitions back to OBSERVING/RECOVERING, leaving it stuck in EVALUATING indefinitely. Add an explicit state transition for the manual-intervention path (e.g., emit callbacks/events and then return to observing).
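
The missing transition can be illustrated with a toy state machine. This is a sketch under assumptions, not the Listener implementation: `MiniListener`, `handle_triggered_rules`, and the `on_manual_review` callback are hypothetical names, and only the four states mentioned in this review are modeled.

```python
from enum import Enum, auto
from typing import Callable, List, Optional


class ListenerState(Enum):
    OBSERVING = auto()
    EVALUATING = auto()
    INTERVENING = auto()
    RECOVERING = auto()


class MiniListener:
    """Toy stand-in for the Listener; only the state transitions are modeled."""

    def __init__(
        self,
        auto_intervention: bool,
        on_manual_review: Optional[Callable[[List[str]], None]] = None,
    ) -> None:
        self.auto_intervention = auto_intervention
        self.on_manual_review = on_manual_review  # hypothetical callback hook
        self.state = ListenerState.OBSERVING

    def handle_triggered_rules(self, triggered_rules: List[str]) -> None:
        self.state = ListenerState.EVALUATING
        if self.auto_intervention:
            self.state = ListenerState.INTERVENING
            # ... the real intervention would be performed here ...
            self.state = ListenerState.RECOVERING
        else:
            # Manual path: hand the triggered rules to a reviewer callback,
            # then leave EVALUATING explicitly instead of staying stuck there.
            if self.on_manual_review is not None:
                self.on_manual_review(triggered_rules)
            self.state = ListenerState.OBSERVING
```

The point of the else branch is that every path out of EVALUATING ends in a well-defined state, whether or not automatic intervention is enabled.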

    if not context:
        return True

    # Check if any context keys match dimension metadata
    for key in context.keys():
        if key in dimension.metadata:
            return True

    # If no specific metadata match, dimension is potentially relevant
    return True

_dimension_matches_context() currently returns True even when no metadata keys match (the final unconditional return True). That makes find_relevant_dimensions() effectively select all dimensions whenever any dimensions exist, so routing/pruning won’t actually be context-sensitive. Consider implementing a real match predicate (or returning False when nothing matches) and relying on the caller’s fallback-to-all-dimensions behavior when appropriate.

    - if not context:
    -     return True
    - # Check if any context keys match dimension metadata
    - for key in context.keys():
    -     if key in dimension.metadata:
    -         return True
    - # If no specific metadata match, dimension is potentially relevant
    - return True
    + # If there is no context, all dimensions are considered relevant.
    + if not context:
    +     return True
    + # If the dimension has no metadata, treat it as a generic dimension that is always relevant.
    + if not getattr(dimension, "metadata", None):
    +     return True
    + # Check if any context keys match dimension metadata
    + for key in context.keys():
    +     if key in dimension.metadata:
    +         return True
    + # If no specific metadata matches, the dimension is not considered relevant for this context.
    + return False


    def update_agent_state(
        self,
        agent_id: str,
        state: AgentLifecycleState
    ) -> AgentInfo:
        """Mock state update."""
        if agent_id in self._agents:
            self._agents[agent_id].state = state
            return self._agents[agent_id]
        return None

MockControlPlaneClient.enqueue_action() is annotated to return str but can return None when the agent queue doesn’t exist, and update_agent_state() is annotated to return AgentInfo but can return None. Consider returning Optional[...] (and updating adapter method annotations accordingly) or raising a clear error to keep type/contracts consistent.
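
The two options named in the comment can be sketched side by side. This is a minimal illustration under assumptions: `MockClient` and `find_and_update` are hypothetical names, and `state` is a plain string here where the real code uses AgentLifecycleState.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentInfo:
    agent_id: str
    state: str  # assumption: the real code uses AgentLifecycleState here


class MockClient:
    """Hypothetical stand-in for the mock control-plane client."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentInfo] = {}

    def find_and_update(self, agent_id: str, state: str) -> Optional[AgentInfo]:
        """Option 1: annotate honestly; callers must handle None."""
        agent = self._agents.get(agent_id)
        if agent is not None:
            agent.state = state
        return agent

    def update_agent_state(self, agent_id: str, state: str) -> AgentInfo:
        """Option 2: keep the non-optional return type and raise on unknown agents."""
        agent = self._agents.get(agent_id)
        if agent is None:
            raise KeyError(f"Unknown agent: {agent_id}")
        agent.state = state
        return agent
```

Either choice keeps the annotation and the behavior in agreement; raising is the stricter contract, while Optional pushes the missing-agent case onto each caller.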

    from typing import Dict, Any, Optional, List
    from dataclasses import dataclass
    from datetime import datetime, timedelta

Unused import timedelta here will fail ruff F401. Remove it or use it.

    - from datetime import datetime, timedelta
    + from datetime import datetime


    for dim_name, subgraph in self.knowledge_graph.subgraphs.items():
        snapshot["node_counts"][dim_name] = len(subgraph._nodes)
        snapshot["edge_counts"][dim_name] = sum(
            len(edges) for edges in subgraph._adjacency_list.values()
        )

_create_graph_snapshot() references subgraph._nodes, but Subgraph defines nodes (and no _nodes). This will raise AttributeError when context is provided to observe() and a snapshot is requested. Use len(subgraph.nodes) (or add a supported accessor) consistently.
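
A snapshot helper that reads only the attribute the comment says exists might look like the sketch below. The `Subgraph` class here is a minimal stand-in, not the one in subgraph.py; `edge_count()` is a hypothetical accessor added to avoid reaching into `_adjacency_list` from outside, and `create_graph_snapshot` is a free-function simplification of the method under review.

```python
from typing import Any, Dict, List


class Subgraph:
    """Minimal stand-in exposing a public `nodes` mapping, per the review comment."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Any] = {}
        self._adjacency_list: Dict[str, List[str]] = {}

    def edge_count(self) -> int:
        """Hypothetical accessor so callers need not touch _adjacency_list."""
        return sum(len(edges) for edges in self._adjacency_list.values())


def create_graph_snapshot(subgraphs: Dict[str, Subgraph]) -> Dict[str, Dict[str, int]]:
    """Count nodes and edges per dimension using public accessors only."""
    snapshot: Dict[str, Dict[str, int]] = {"node_counts": {}, "edge_counts": {}}
    for dim_name, subgraph in subgraphs.items():
        snapshot["node_counts"][dim_name] = len(subgraph.nodes)
        snapshot["edge_counts"][dim_name] = subgraph.edge_count()
    return snapshot
```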

    try:
        self._run_observation_cycle()
    except Exception as e:
        # Log error but continue observation
        # In production, this would integrate with logging framework
        pass

_observation_loop() catches Exception as e but never uses e, which will fail ruff F841. Either drop the binding (except Exception:) or log/use the exception.
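
A loop that logs the failure rather than discarding it could be sketched as follows, using the standard library logging module. The function `run_cycles` and the logger name are hypothetical; the real loop runs in a background thread and is not bounded by a cycle count.

```python
import logging
from typing import Callable

logger = logging.getLogger("mute_agent.listener")  # hypothetical logger name


def run_cycles(cycle: Callable[[], None], max_cycles: int) -> int:
    """Run `cycle` up to `max_cycles` times, logging failures instead of hiding them."""
    failures = 0
    for _ in range(max_cycles):
        try:
            cycle()
        except Exception:
            # logger.exception records the message plus the active traceback,
            # so no `as e` binding is needed and F841 does not fire.
            logger.exception("Observation cycle failed; continuing")
            failures += 1
    return failures
```

This keeps the "continue observing" behavior of the original block while leaving a traceback for whoever has to debug a failing cycle.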

    Dimensional subgraph implementation.
    """

    from typing import Dict, List, Set, Optional, Any

Unused import Set here will fail ruff F401. Remove it or use it.

    - from typing import Dict, List, Set, Optional, Any
    + from typing import Dict, List, Optional, Any


    The Hands - The Execution Agent
    """

    from typing import Dict, List, Optional, Any, Callable

Unused import Optional here will fail ruff F401. Remove it or use it.

    - from typing import Dict, List, Optional, Any, Callable
    + from typing import Dict, List, Any, Callable

Summary
Ports the production Mute Agent governance infrastructure from the internal repo. The public version had demo agents and benchmarks; this adds the full governance listener, knowledge graph, and multi-layer adapter system (~4,800 lines).
New Packages
core/ (3 new files)
knowledge_graph/ (4 files)
listener/ (10 files)
super_system/ (2 files)
visualization/ (2 files)
Tests
Closes #62