
docs: add Google Colab notebooks for zero-friction trial#591

Merged
imran-siddique merged 4 commits into microsoft:main from kanish5:docs/add-colab-notebooks
Mar 30, 2026

Conversation

@kanish5
Contributor

@kanish5 kanish5 commented Mar 30, 2026

Closes #255

Summary

Added 3 Google Colab notebooks for zero-friction trial experience:

  1. notebooks/01_policy_enforcement_101.ipynb — define capabilities, evaluate actions, see violations blocked with audit trail
  2. notebooks/02_mcp_security_proxy.ipynb — detect MCP tool poisoning patterns, trust gate simulation
  3. notebooks/03_multi_agent_governance.ipynb — SLOs, circuit breakers, chaos testing across agent fleet

Details

  • Each notebook includes an Open in Colab badge
  • All notebooks run fully offline — no API key required
  • Built using real examples from the existing codebase (langchain_governed.py, mcp-security.yaml)

@github-actions

Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
Please ensure tests pass, code follows style (ruff check), and you have signed the CLA.
See our Contributing Guide.

@github-actions github-actions bot added the `ci/cd` (CI/CD and workflows) and `size/XL` (Extra large PR, 500+ lines) labels Mar 30, 2026
@kanish5
Contributor Author

kanish5 commented Mar 30, 2026

@microsoft-github-policy-service agree


@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces three Google Colab notebooks to provide an interactive, zero-friction trial experience for the microsoft/agent-governance-toolkit. The notebooks cover policy enforcement, MCP security proxy, and multi-agent governance. While the notebooks are well-structured and provide clear instructions, there are several areas that require attention to ensure security, correctness, and maintainability.


🔴 CRITICAL

  1. Insufficient Input Validation in MCP Security Scanner

    • The scan_tool_definition function uses regular expressions to detect malicious patterns in tool descriptions. However, the regex patterns are not comprehensive and may lead to false negatives, allowing malicious tool definitions to bypass the trust gate.
    • Recommendation:
      • Expand the detection patterns to cover additional attack vectors, such as obfuscated commands or encoded payloads.
      • Consider integrating a more robust static analysis tool or library for detecting malicious patterns.
  2. Potential for Sandbox Escape in MCP Tool Definitions

    • The scan_tool_definition function does not account for encoded or obfuscated payloads (e.g., Base64-encoded commands). This could allow attackers to bypass the trust gate by encoding malicious commands.
    • Recommendation: Add decoding and normalization steps to the scanner to handle obfuscated payloads. For example, decode Base64 strings and scan the decoded content.
  3. Improper Handling of Circuit Breaker State

    • The check_circuit_breaker function opens the circuit if the error rate or latency exceeds the SLO thresholds. However, there is no mechanism to reset the circuit after the issue is resolved, potentially leading to a denial of service.
    • Recommendation: Implement a circuit breaker reset mechanism, such as a cooldown period or a sliding window for error rate and latency calculations.
  4. Thread Safety Concerns

    • The AgentMetrics class is not thread-safe, and concurrent updates to metrics (e.g., total_calls, errors, total_latency_ms) could lead to race conditions in multi-threaded environments.
    • Recommendation: Use thread-safe data structures (e.g., threading.Lock) or atomic operations to ensure thread safety.
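The reset mechanism recommended in item 3 could look roughly like this (a minimal sketch; `CircuitBreaker`, `error_threshold`, and `cooldown_s` are illustrative names, not the toolkit's actual SLO API):

```python
import time
from typing import Optional

class CircuitBreaker:
    """Cooldown-based circuit breaker sketch; parameters are illustrative."""

    def __init__(self, error_threshold: float = 0.5, cooldown_s: float = 30.0):
        self.error_threshold = error_threshold
        self.cooldown_s = cooldown_s
        self._opened_at: Optional[float] = None

    def record(self, error_rate: float) -> None:
        # Open the circuit when the observed error rate breaches the SLO.
        if error_rate > self.error_threshold:
            self._opened_at = time.monotonic()

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        if time.monotonic() - self._opened_at >= self.cooldown_s:
            self._opened_at = None  # cooldown elapsed: reset to closed
            return True
        return False  # open: reject calls until the cooldown expires
```

A sliding-window variant would replace the single `record` call with a windowed error-rate calculation, but the cooldown alone already prevents the circuit from staying open forever.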

🟡 WARNING

  1. Backward Compatibility Risk

    • The notebooks introduce new functionality and examples but do not explicitly test backward compatibility with existing APIs or configurations.
    • Recommendation: Add automated tests to ensure that the new notebooks do not introduce breaking changes to the existing API or functionality.
  2. Hardcoded Detection Patterns

    • The detection patterns in the scan_tool_definition function are hardcoded, making it difficult to update or extend them without modifying the code.
    • Recommendation: Externalize the detection patterns into a configuration file (e.g., YAML or JSON) and load them dynamically. This will make it easier to update patterns without code changes.
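The externalization suggested in item 2 might be sketched as follows, using JSON to keep the example self-contained (the real mcp-security.yaml layout and category names may differ):

```python
import json
import re
from pathlib import Path
from typing import Dict, List

# Hypothetical config payload; the actual pattern file schema may differ.
PATTERN_FILE_CONTENT = json.dumps({
    "prompt_injection": [r"ignore\s+(all\s+)?previous"],
    "data_exfiltration": [r"send\s+.*\s+to\s+http"],
})

def load_patterns(path: Path) -> Dict[str, List[re.Pattern]]:
    """Read detection patterns from a config file and compile them once,
    so pattern updates need no code change."""
    raw = json.loads(path.read_text())
    return {
        category: [re.compile(p, re.IGNORECASE) for p in patterns]
        for category, patterns in raw.items()
    }
```

With PyYAML available, `json.loads` would simply become `yaml.safe_load` against the existing mcp-security.yaml.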

💡 SUGGESTIONS

  1. Improve Documentation

    • While the notebooks are well-documented, consider adding more context about the scenarios being simulated. For example, explain the significance of MCP tool poisoning and how it relates to real-world threats.
    • Include a section on how users can customize the detection patterns and SLOs to suit their specific use cases.
  2. Add Unit Tests for Notebooks

    • The functionality demonstrated in the notebooks (e.g., scan_tool_definition, check_circuit_breaker) should be covered by unit tests to ensure correctness and prevent regressions.
    • Recommendation: Extract reusable functions from the notebooks into a separate module and write unit tests for them using pytest.
  3. Enhance Visualization

    • The audit trail and health dashboard outputs are text-based, which may not be user-friendly for all users.
    • Recommendation: Use visualization libraries like matplotlib or plotly to create graphical representations of the audit trail and agent health metrics.
  4. Optimize Performance

    • The scan_tool_definition function iterates over all patterns for each category, which could become a bottleneck with a large number of patterns or tool definitions.
    • Recommendation: Optimize the pattern matching logic, possibly by compiling regex patterns in advance or using a more efficient matching algorithm.
  5. Add Security Tests

    • The notebooks introduce security-critical functionality (e.g., MCP security scanner, trust gate, circuit breaker). However, there are no tests to validate their effectiveness against known attack vectors.
    • Recommendation: Create a suite of security tests to validate the detection patterns and circuit breaker logic against a variety of malicious inputs.
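The precompilation suggested in item 4 can be sketched by joining each category's patterns into one alternation compiled at import time (pattern strings here are illustrative, not the notebook's actual DETECTION_PATTERNS):

```python
import re
from typing import Dict, List

# Illustrative patterns; the notebook's real DETECTION_PATTERNS may differ.
DETECTION_PATTERNS: Dict[str, List[str]] = {
    "prompt_injection": [r"ignore\s+(?:all\s+)?previous", r"disregard\s+instructions"],
    "exfiltration": [r"curl\s+http", r"wget\s+http"],
}

# Join each category into a single alternation, compiled once, so scanning
# makes one regex pass per category instead of one per pattern string.
COMPILED: Dict[str, re.Pattern] = {
    category: re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    for category, patterns in DETECTION_PATTERNS.items()
}

def scan(description: str) -> List[str]:
    """Return the categories whose combined pattern matches the description."""
    return [cat for cat, rx in COMPILED.items() if rx.search(description)]
```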

Additional Notes

  • The addition of the Open in Colab badges is a great touch for improving accessibility and user experience.
  • The notebooks are well-structured and provide clear step-by-step instructions, making them suitable for users who are new to the toolkit.
  • The use of dataclass for AgentSLO and AgentMetrics is a good practice for maintaining type safety and readability.

Action Items

  1. Address the CRITICAL issues related to input validation, sandbox escape, circuit breaker state, and thread safety.
  2. Mitigate the WARNING about backward compatibility and hardcoded detection patterns.
  3. Implement the SUGGESTIONS to improve documentation, testing, visualization, and performance.

Let me know if you need further clarification or assistance!

@github-actions

github-actions bot commented Mar 30, 2026

🤖 AI Agent: security-scanner — Security Review of Pull Request: Google Colab Notebooks for Zero-Friction Trial


Findings:


1. Prompt Injection Defense Bypass

Severity: 🔴 CRITICAL
Issue:
The GovernancePolicy defined in 01_policy_enforcement_101.ipynb uses simple string matching for blocked patterns (blocked_patterns). This approach is vulnerable to prompt injection attacks where malicious input circumvents the policy by using obfuscation techniques such as encoding, spacing, or synonyms. For example:

  • "DROP TABLE" could be bypassed with "D R O P T A B L E" or "DROP\tTABLE".
  • "rm -rf" could be bypassed with "rm -rf" encoded in Base64 or other formats.

Attack Vector:
An attacker could craft input that bypasses the blocked patterns, allowing dangerous commands or sensitive data leaks to pass through the governance layer.

Recommendation:

  • Replace simple string matching with robust pattern detection using regular expressions that account for obfuscation techniques (e.g., spacing, encoding).
  • Integrate a semantic analysis layer to detect intent rather than relying solely on string patterns.
  • Add unit tests for various obfuscation techniques to ensure the policy cannot be bypassed.
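One possible normalization pass for the obfuscation cases above (the BLOCKED list is illustrative; the notebook's blocked_patterns may differ):

```python
import base64
import re
from typing import List

BLOCKED = ["drop table", "rm -rf"]  # illustrative, not the notebook's exact list

def candidate_forms(text: str) -> List[str]:
    """Expand the input into normalized variants so spacing or Base64
    tricks cannot slip past a plain substring check."""
    forms = [text.lower(), re.sub(r"\s+", "", text).lower()]
    # Try to decode plausible Base64 tokens and scan the plaintext too.
    for token in re.findall(r"[A-Za-z0-9+/=]{8,}", text):
        try:
            decoded = base64.b64decode(token, validate=True)
            forms.append(decoded.decode("utf-8", "ignore").lower())
        except ValueError:
            pass  # not valid Base64; ignore
    return forms

def is_blocked(text: str) -> bool:
    # Check each pattern both with and without internal spaces.
    return any(
        pat in form or pat.replace(" ", "") in form
        for form in candidate_forms(text)
        for pat in BLOCKED
    )
```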

2. Policy Engine Circumvention

Severity: 🟠 HIGH
Issue:
The require_human_approval flag in the GovernancePolicy is set to False in the notebook. This configuration allows actions to proceed without human oversight, even if flagged as potentially dangerous. Additionally, the max_tool_calls enforcement is implemented in the notebook (ctx.call_count) but lacks thread-safety, making it vulnerable to race conditions.

Attack Vector:
An attacker could exploit concurrent calls to exceed the max_tool_calls limit or bypass human approval checks by manipulating the context state.

Recommendation:

  • Enforce require_human_approval=True for critical actions.
  • Implement thread-safe mechanisms for ctx.call_count using locks or atomic operations.
  • Add logging and monitoring for policy circumventions to detect anomalies.
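A thread-safe call budget along the lines recommended above might look like this (`CallBudget` and its field names are hypothetical, not the real `GovernanceContext` API):

```python
import threading

class CallBudget:
    """Thread-safe replacement sketch for a bare ctx.call_count counter."""

    def __init__(self, max_calls: int):
        self._lock = threading.Lock()
        self._count = 0
        self._max = max_calls

    def try_acquire(self) -> bool:
        """Atomically check-and-increment, so two threads cannot both pass
        the limit check before either one increments the counter."""
        with self._lock:
            if self._count >= self._max:
                return False
            self._count += 1
            return True
```

The key point is that the check and the increment happen under one lock; a separate `if ctx.call_count < max: ctx.call_count += 1` is exactly the race the reviewer describes.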

3. Trust Chain Weaknesses

Severity: 🟡 MEDIUM
Issue:
The 02_mcp_security_proxy.ipynb notebook defines detection patterns for MCP tool poisoning but does not validate the trust chain of MCP tools. For example, tools could be registered from untrusted sources without verifying their origin or authenticity.

Attack Vector:
An attacker could inject malicious tools into the agent's environment by exploiting the lack of trust chain validation.

Recommendation:

  • Implement SPIFFE/SVID-based identity validation for MCP tools to ensure they originate from trusted sources.
  • Add certificate pinning or cryptographic signatures to verify the integrity of tool definitions.
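As a lightweight stand-in for full SPIFFE/SVID validation, signing tool definitions can be sketched with stdlib HMAC (the key handling below is illustrative only; production code would use a managed secret or asymmetric signatures):

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # placeholder; never hardcode real keys

def sign_tool(tool: dict) -> str:
    """Sign a canonical JSON form of the tool definition, so any change
    to the definition invalidates the signature."""
    payload = json.dumps(tool, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def verify_tool(tool: dict, signature: str) -> bool:
    # compare_digest avoids leaking information via timing differences.
    return hmac.compare_digest(sign_tool(tool), signature)
```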

4. Credential Exposure

Severity: 🔵 LOW
Issue:
No explicit credential exposure was found in the notebooks. However, the audit trail printed in 01_policy_enforcement_101.ipynb could potentially leak sensitive data if the input contains confidential information (e.g., SSNs).

Attack Vector:
If the audit trail is logged or shared, sensitive data like SSNs could be exposed.

Recommendation:

  • Mask sensitive data in the audit trail (e.g., redact SSNs or other PII).
  • Implement a secure logging mechanism that encrypts sensitive information.
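The masking step could be as simple as a redaction pass applied before any audit entry is printed or stored (the rules shown are illustrative and would need extending for other PII formats):

```python
import re

# Illustrative redaction rules: US-style SSNs and bare 16-digit card numbers.
REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN REDACTED]"),
    (re.compile(r"\b\d{16}\b"), "[CARD REDACTED]"),
]

def redact(entry: str) -> str:
    """Mask known PII shapes in an audit-trail entry before output."""
    for pattern, replacement in REDACTIONS:
        entry = pattern.sub(replacement, entry)
    return entry
```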

5. Sandbox Escape

Severity: 🔴 CRITICAL
Issue:
The 02_mcp_security_proxy.ipynb notebook demonstrates the detection of malicious tool definitions, but the trust_gate function only blocks tools based on predefined patterns. It does not enforce sandboxing or isolation for tools that are allowed. This creates a risk of sandbox escape if a malicious tool is inadvertently allowed.

Attack Vector:
An attacker could exploit the lack of sandboxing to execute arbitrary code or access sensitive resources on the host system.

Recommendation:

  • Enforce containerization or process isolation for all MCP tools.
  • Use technologies like Docker or Firecracker to ensure tools run in isolated environments.
  • Implement strict resource access controls for tools.

6. Deserialization Attacks

Severity: 🟠 HIGH
Issue:
The notebooks use JSON-like structures for tool definitions and audit trails but do not explicitly validate or sanitize these inputs. If these structures are loaded from external sources, they could be exploited for deserialization attacks.

Attack Vector:
An attacker could craft malicious JSON payloads to execute arbitrary code during deserialization.

Recommendation:

  • Use safe deserialization libraries that enforce strict schema validation.
  • Reject any input that does not conform to the expected schema.
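A strict schema check using only the standard library might look like this (the field names are assumed for illustration, not the notebooks' actual tool-definition schema):

```python
# Hypothetical required fields for a tool definition.
REQUIRED_FIELDS = {"name": str, "description": str, "parameters": dict}

def validate_tool_definition(data: object) -> dict:
    """Reject any payload that is not a flat dict with exactly the
    expected fields and types, before it reaches the trust gate."""
    if not isinstance(data, dict):
        raise ValueError("tool definition must be a JSON object")
    unexpected = set(data) - set(REQUIRED_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} missing or not {expected_type.__name__}")
    return data
```

Since the payloads are plain JSON, `json.loads` itself cannot execute code; the real risk is acting on unvalidated fields downstream, which the whitelist above closes off.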

7. Race Conditions

Severity: 🟠 HIGH
Issue:
The ctx.call_count mechanism in 01_policy_enforcement_101.ipynb is vulnerable to race conditions in multi-threaded environments. Concurrent calls could manipulate the count, bypassing the max_tool_calls limit.

Attack Vector:
An attacker could exploit this vulnerability to exceed the call budget, potentially overwhelming the system or bypassing governance checks.

Recommendation:

  • Replace ctx.call_count with an atomic counter or use a thread-safe locking mechanism.
  • Test the implementation under concurrent load to ensure robustness.

8. Supply Chain Risks

Severity: 🟡 MEDIUM
Issue:
The notebooks rely on the agent-governance-toolkit package but do not verify its integrity. This creates a risk of supply chain attacks, such as dependency confusion or malicious package injection.

Attack Vector:
An attacker could publish a malicious package with a similar name to agent-governance-toolkit and compromise the system.

Recommendation:

  • Use dependency pinning to ensure the correct version of agent-governance-toolkit is installed.
  • Verify the package's integrity using checksums or signatures.
  • Consider using tools like pip-audit to detect vulnerable dependencies.

Summary of Findings:

| Finding | Severity | Recommendation |
| --- | --- | --- |
| Prompt Injection Defense Bypass | 🔴 CRITICAL | Use robust pattern detection and semantic analysis to prevent obfuscation bypasses. |
| Policy Engine Circumvention | 🟠 HIGH | Enforce thread-safe mechanisms and require human approval for critical actions. |
| Trust Chain Weaknesses | 🟡 MEDIUM | Implement SPIFFE/SVID validation and cryptographic signatures for MCP tools. |
| Credential Exposure | 🔵 LOW | Mask sensitive data in audit trails and use secure logging mechanisms. |
| Sandbox Escape | 🔴 CRITICAL | Enforce containerization and resource isolation for MCP tools. |
| Deserialization Attacks | 🟠 HIGH | Use safe deserialization libraries with schema validation. |
| Race Conditions | 🟠 HIGH | Implement thread-safe counters and test under concurrent load. |
| Supply Chain Risks | 🟡 MEDIUM | Pin dependencies and verify package integrity using checksums or signatures. |

Final Recommendation:

The notebooks provide valuable educational resources but introduce critical security risks that must be addressed before deployment. Prioritize fixes for prompt injection defenses, sandboxing, and race conditions to ensure the governance layer remains secure.

@Ricky-G
Contributor

Ricky-G commented Mar 30, 2026

@kanish5 could you please look at the critical issues pointed out in the code review above? For any that don't apply (since this is a playground notebook), please leave a comment explaining why.

Member

@imran-siddique imran-siddique left a comment


LGTM - all mandatory checks pass.

@imran-siddique imran-siddique enabled auto-merge (squash) March 30, 2026 16:53
@github-actions github-actions bot removed the `ci/cd` (CI/CD and workflows) label Mar 30, 2026

@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces three Google Colab notebooks designed to provide an interactive, zero-friction trial experience for the Agent Governance Toolkit. The notebooks cover policy enforcement, MCP security proxy, and multi-agent governance. While the notebooks are well-structured and provide valuable demonstrations, there are several areas that require attention to ensure security, correctness, and compliance with best practices.


🔴 CRITICAL

  1. Potential Regular Expression Denial of Service (ReDoS) in MCP Security Proxy:

    • The DETECTION_PATTERNS in 02_mcp_security_proxy.ipynb include regular expressions that could be exploited for ReDoS attacks. For example:
      r"ignore\s+(all\s+)?previous"
      r"disregard\s+(all\s+)?(above|prior|previous)"
      These patterns contain nested quantifiers (\s+ and (all\s+)?), which can lead to catastrophic backtracking if an attacker crafts malicious input. This could result in a denial-of-service vulnerability.
    • Action: Refactor the regex patterns to avoid nested quantifiers or use regex libraries that are resistant to ReDoS attacks, such as regex (an alternative to Python's re module).
  2. Insufficient Validation of GovernancePolicy:

    • In 01_policy_enforcement_101.ipynb, the GovernancePolicy object is instantiated directly without validating the blocked_patterns or other fields. This could lead to runtime errors or unexpected behavior if invalid inputs are provided.
    • Action: Use Pydantic models to validate the GovernancePolicy object. For example:
      import re
      from typing import List
      from pydantic import BaseModel, Field, validator
      
      class GovernancePolicyModel(BaseModel):
          name: str
          blocked_patterns: List[str]
          require_human_approval: bool
          max_tool_calls: int = Field(ge=1)
      
          @validator('blocked_patterns', each_item=True)
          def validate_patterns(cls, pattern):
              try:
                  re.compile(pattern)
              except re.error:
                  raise ValueError(f"Invalid regex pattern: {pattern}")
              return pattern
  3. Hardcoded Detection Patterns:

    • The detection patterns in 02_mcp_security_proxy.ipynb are hardcoded in the notebook. This makes it difficult to update or extend the patterns without modifying the notebook itself.
    • Action: Externalize the detection patterns into a configuration file (e.g., mcp-security.yaml) and load them dynamically in the notebook. This will also make it easier to manage and update the patterns in the future.
  4. Lack of Cryptographic Validation in MCP Security Proxy:

    • The 02_mcp_security_proxy.ipynb notebook does not include any cryptographic validation for the integrity or authenticity of the MCP tool definitions. This could allow an attacker to tamper with the tool definitions.
    • Action: Implement cryptographic signing and verification for MCP tool definitions using a library like cryptography. Ensure that only signed and verified tool definitions are accepted.

🟡 WARNING

  1. Backward Compatibility Concerns:

    • The notebooks introduce new functionality and examples, but they rely on existing APIs (e.g., GovernancePolicy, LangChainKernel, pre_execute, etc.). If these APIs are modified in the future, the notebooks may break.
    • Action: Add tests to ensure that the APIs used in the notebooks remain backward-compatible. Consider adding integration tests that run the notebooks as part of the CI/CD pipeline.
  2. Thread Safety in Multi-Agent Governance:

    • The 03_multi_agent_governance.ipynb notebook simulates multiple agents but does not address potential thread safety issues in concurrent execution. For example, the AgentMetrics class is not thread-safe.
    • Action: If the toolkit is intended to support concurrent execution, ensure that shared state (e.g., AgentMetrics) is thread-safe. Use thread-safe data structures or synchronization mechanisms like threading.Lock.

💡 SUGGESTIONS

  1. Add Type Annotations:

    • While the code in the notebooks is functional, it lacks type annotations in many places. Adding type annotations will improve code readability and help catch type-related issues during development.
    • Example:
      def scan_tool_definition(tool_name: str, description: str) -> dict:
  2. Improve Documentation:

    • The notebooks provide a good overview of the toolkit's capabilities, but they could benefit from additional context and explanations. For example:
      • Explain the purpose of each step in more detail.
      • Provide links to relevant sections of the toolkit's documentation.
      • Include a glossary for terms like MCP, SLO, and circuit breaker.
  3. Add Unit Tests for Notebook Code:

    • While the notebooks are designed for interactive use, the code within them should be tested to ensure correctness. Extract reusable functions into separate modules and write unit tests for them.
  4. Use Logging Instead of Print Statements:

    • The notebooks use print statements for output, which is fine for interactive use but not ideal for production code. Consider using the logging module for better control over log levels and formatting.
  5. Add Security Best Practices Section:

    • Since this toolkit is security-focused, consider adding a section in the notebooks that highlights best practices for using the toolkit securely. For example:
      • How to define secure governance policies.
      • How to handle sensitive data in agent interactions.
  6. Notebook Execution Validation:

    • Add a CI/CD step to validate that the notebooks execute without errors. This will ensure that the examples remain functional as the toolkit evolves.

Final Recommendation

  • Address the 🔴 CRITICAL issues immediately, as they pose security risks.
  • Resolve the 🟡 WARNING issues to ensure backward compatibility and thread safety.
  • Consider implementing the 💡 SUGGESTIONS to improve code quality, usability, and maintainability.

Once these issues are resolved, the notebooks will provide a robust and secure demonstration of the Agent Governance Toolkit's capabilities.
