release: v2.2.0 — ESRP compliance, configurable security policies, hardening #300

Merged
imran-siddique merged 6 commits into microsoft:main from imran-siddique:release/v2.2.0
Mar 18, 2026

@imran-siddique
Member

Release v2.2.0

Merges the release/v2.2.0 branch into main. Tagged as v2.2.0.

Security Fixes

  • Expand SQL policy deny-list to block GRANT, REVOKE, CREATE USER, EXEC xp_cmdshell, UPDATE without WHERE, MERGE
  • Replace XOR placeholder encryption with AES-256-GCM in DMZ module
  • Security advisories added to SECURITY.md (CostGuard kill bypass, thread safety)

Configurable Security Policies

  • All 8 security modules externalized to YAML configuration (10 sample configs in examples/policies/)
  • New create_policies_from_config() API — load rules from YAML instead of hardcoded defaults
  • create_default_policies() deprecated with runtime warning
  • Sample configs: sql-safety, sql-strict, sql-readonly, sandbox-safety, prompt-injection-safety, mcp-security, semantic-policy, pii-detection, conversation-guardian, cli-security-rules

ESRP Publishing Infrastructure

  • PyPI ADO pipeline (pipelines/pypi-publish.yml) using EsrpRelease@11
  • npm ADO pipeline (pipelines/npm-publish.yml) using EsrpRelease@11
  • GitHub Actions publish.yml updated to build-only (no direct publishing)

Package Metadata & Disclaimers

  • Python: author=Microsoft Corporation, team DL, MIT classifier fixes
  • npm: all packages renamed to @microsoft scope
  • Community Preview disclaimers on all READMEs, release notes, package descriptions
  • Security Model & Limitations section added to root README
  • Version bumped to 2.2.0 across all Python packages

imran-siddique and others added 4 commits March 17, 2026 13:20
Fix incomplete deny-list in no_destructive_sql policy that allowed
destructive SQL operations to bypass the policy engine (MSRC report).

Previously only DROP, TRUNCATE, DELETE (no WHERE), and ALTER TABLE
were blocked. Now blocks in both AST (sqlglot) and fallback (regex)
paths:
- GRANT / REVOKE (privilege escalation)
- CREATE/ALTER/DROP USER/ROLE/LOGIN (account manipulation)
- UPDATE without WHERE (mass data modification)
- EXEC/EXECUTE xp_cmdshell, sp_configure (OS command execution)
- MERGE INTO (combined insert/update/delete)
- LOAD DATA, INTO OUTFILE/DUMPFILE (file operations)
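
The fallback (regex) path described above could look roughly like the sketch below. It is an illustration only, not the toolkit's code: the function name `is_denied_sql` and the exact patterns are assumptions, and a keyword regex is weaker than the AST path (it can misfire on identifiers or string literals that contain these words).

```python
import re

# Hypothetical deny patterns mirroring the categories listed in this commit message.
_DENY_PATTERNS = [
    r"\b(GRANT|REVOKE)\b",                               # privilege escalation
    r"\b(CREATE|ALTER|DROP)\s+(USER|ROLE|LOGIN)\b",      # account manipulation
    r"\b(EXEC|EXECUTE)\s+(xp_cmdshell|sp_configure)\b",  # OS command execution
    r"\bMERGE\s+INTO\b",                                 # combined insert/update/delete
    r"\bLOAD\s+DATA\b",                                  # file operations
    r"\bINTO\s+(OUTFILE|DUMPFILE)\b",
]
_DENY_RE = re.compile("|".join(_DENY_PATTERNS), re.IGNORECASE)
_UPDATE_RE = re.compile(r"^\s*UPDATE\b", re.IGNORECASE)
_WHERE_RE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def is_denied_sql(statement: str) -> bool:
    """Return True if the statement matches a deny pattern (assumed name, not the real API)."""
    if _DENY_RE.search(statement):
        return True
    # UPDATE with no WHERE clause anywhere in the statement counts as mass modification.
    return bool(_UPDATE_RE.search(statement)) and not _WHERE_RE.search(statement)
```

For example, `is_denied_sql("UPDATE users SET active = 0")` is true, while the same statement with `WHERE id = 1` appended passes.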

Also fixes pre-existing test issues (ExecutionRequest constructor)
and adds 21 new test cases covering all reported bypass vectors.

35 tests passing, 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Refactor the policy engine so SQL deny-rules are driven by external
YAML configuration rather than hardcoded defaults. This shifts
responsibility to users to review and customize policies for their
specific environment.

Changes:
- Add SQLPolicyConfig dataclass for structured policy configuration
- Add load_sql_policy_config() to load from YAML files
- Add create_policies_from_config() as the recommended entry point
- Deprecate create_default_policies() with runtime warning directing
  users to explicit config files
- _fallback_sql_check() now accepts optional SQLPolicyConfig and
  builds regex patterns dynamically from config
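
A minimal sketch of the config-driven idea follows. The PR does not show the real `SQLPolicyConfig` fields, so the field names, `config_from_mapping`, and `build_deny_pattern` are hypothetical. It assumes the YAML file has already been parsed into a plain mapping.

```python
import re
from dataclasses import dataclass, field
from typing import Any, List, Mapping, Pattern


@dataclass(frozen=True)
class SQLPolicyConfig:
    """Hypothetical shape; the real dataclass's fields are not shown in this PR."""
    blocked_statements: List[str] = field(default_factory=list)
    require_where_for: List[str] = field(default_factory=list)


def config_from_mapping(data: Mapping[str, Any]) -> SQLPolicyConfig:
    """Build a config from an already-parsed mapping (e.g. the result of a safe YAML load)."""
    return SQLPolicyConfig(
        blocked_statements=[str(s) for s in data.get("blocked_statements", [])],
        require_where_for=[str(s) for s in data.get("require_where_for", [])],
    )


def build_deny_pattern(config: SQLPolicyConfig) -> Pattern[str]:
    """Compile one case-insensitive regex from the configured statements."""
    # Escape each word and allow any run of whitespace between words ("MERGE   INTO").
    parts = [
        r"\s+".join(re.escape(word) for word in stmt.split())
        for stmt in config.blocked_statements
    ]
    if not parts:
        return re.compile(r"(?!)")  # empty negative lookahead: never matches
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)
```

Escaping each configured entry keeps a stray regex metacharacter in a config file from silently changing what the pattern matches.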

Sample policy configs (examples/policies/):
- sql-safety.yaml — balanced default (blocks DROP/GRANT/etc.)
- sql-strict.yaml — high-security (SELECT-only)
- sql-readonly.yaml — read-only agents

All configs include prominent disclaimers that they are SAMPLES and
must be reviewed before production use.

40 tests passing (5 new config tests), 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make all security detection patterns, deny-lists, and thresholds
configurable via external YAML files. Built-in defaults are retained
but emit deprecation warnings directing users to explicit config files.
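
The "retained but deprecated" behavior can be sketched with the standard `warnings` module. The function body and return value below are placeholders, not the toolkit's actual defaults.

```python
import warnings
from typing import Any, Dict


def create_default_policies() -> Dict[str, Any]:
    """Stand-in for the deprecated entry point; the returned rules are illustrative only."""
    warnings.warn(
        "create_default_policies() is deprecated; load an explicit, reviewed "
        "YAML config with create_policies_from_config() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    return {"sql": {"blocked_statements": ["DROP", "TRUNCATE"]}}
```

Callers who want to catch remaining uses in CI can promote the warning to an error with `warnings.simplefilter("error", DeprecationWarning)`.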

New sample configs in examples/policies/:
- sandbox-safety.yaml
- prompt-injection-safety.yaml
- mcp-security.yaml
- semantic-policy.yaml
- pii-detection.yaml
- conversation-guardian.yaml
- cli-security-rules.yaml

All configs include prominent disclaimers that they are SAMPLES and
must be reviewed before production use.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot added the tests and size/XL (Extra large PR, 500+ lines) labels on Mar 18, 2026
@github-actions

🤖 AI Agent: docs-sync-checker

📝 Documentation Sync Report

Issues Found

  1. create_policies_from_config() in packages/{name}/src/ — missing docstring
    • This new public API lacks a docstring explaining its purpose, parameters, return values, and exceptions.
  2. ⚠️ packages/{name}/README.md — "Configuration" section may need an update to include details about the new create_policies_from_config() API and the deprecation of create_default_policies().
  3. ⚠️ CHANGELOG.md — No entry found for the new create_policies_from_config() API or the deprecation of create_default_policies().
  4. ⚠️ examples/policies/ — New sample configurations have been added, but there is no mention of these in the README or other documentation. These should be referenced in the appropriate sections of the README and/or docs.

Suggestions

  • 💡 Add a detailed docstring for create_policies_from_config(config_path: str) -> Dict[str, Any]:
    • Purpose: Explain that this function loads security policies from a YAML configuration file.
    • Parameters: Describe the config_path parameter (e.g., the path to the YAML configuration file).
    • Return Values: Describe the structure of the returned dictionary (e.g., keys and values).
    • Exceptions: Mention any exceptions that might be raised (e.g., FileNotFoundError, yaml.YAMLError).
  • 💡 Update the "Configuration" section in packages/{name}/README.md to:
    • Introduce the new create_policies_from_config() API.
    • Mention the deprecation of create_default_policies() and provide guidance on transitioning to the new API.
    • Highlight the availability of the new sample configurations in examples/policies/.
  • 💡 Add an entry to CHANGELOG.md for version v2.2.0:
    • Include details about the new create_policies_from_config() API.
    • Mention the deprecation of create_default_policies() with a runtime warning.
    • Highlight the addition of the new sample configurations in examples/policies/.
  • 💡 Ensure that the example code in examples/ reflects the new create_policies_from_config() API and demonstrates its usage.

Additional Notes

  • The new YAML configuration files in examples/policies/ are well-documented with comments, which is excellent. However, their existence and purpose should be explicitly mentioned in the relevant documentation (e.g., README or project-level docs).
  • Ensure that all public APIs, including create_policies_from_config(), have complete type annotations. While the provided diff suggests that the function has type hints, this should be verified in the actual code.

Please address the above issues to ensure documentation and examples are in sync with the new changes. Let me know if you need further assistance!

@github-actions

🤖 AI Agent: breaking-change-detector

🔍 API Compatibility Report

Summary

The release introduces several new features and configurations, including externalized security policies and ESRP compliance. The changes are largely additive. No outright breaking changes were identified, but the deprecation of create_default_policies() could affect downstream users depending on their usage patterns.

Findings

| Severity | Package | Change | Impact |
| --- | --- | --- | --- |
| 🟡 | agent-governance-toolkit | create_default_policies() deprecated with runtime warning | Users relying on this function will need to migrate to create_policies_from_config() |
| 🔵 | agent-governance-toolkit | Added create_policies_from_config() | New API for loading policies from YAML configuration files |
| 🔵 | agent-governance-toolkit | Added 10 sample YAML policy configurations | Provides new functionality for defining security policies |

Migration Guide

  1. For users of create_default_policies():

    • Replace calls to create_default_policies() with create_policies_from_config().
    • Use the provided sample YAML configurations as a starting point to define your policies.
  2. For users of older versions:

    • Review the new security policies and configurations provided in the examples/policies/ directory.
    • Update your code to use the new create_policies_from_config() API for loading policies from YAML files.

Conclusion

No breaking changes detected. However, the deprecation of create_default_policies() and the introduction of create_policies_from_config() may require updates to existing codebases. Downstream users should review and adapt their implementations accordingly.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Pull Request Review: release: v2.2.0 — ESRP compliance, configurable security policies, hardening

Summary

This pull request introduces significant updates to the microsoft/agent-governance-toolkit repository, including configurable security policies, ESRP compliance, and cryptographic hardening. While the changes are generally positive, there are several areas that require attention to ensure correctness, security, and backward compatibility.


🔴 CRITICAL: Security Issues

  1. Regex Patterns for Sensitive Data Detection

    • Issue: Some regex patterns for detecting sensitive data (e.g., API keys, passwords, private keys) in the sample configurations (cli-security-rules.yaml, pii-detection.yaml) are overly simplistic and may result in false negatives. For example:
      • The regex for detecting hardcoded passwords ('(password|passwd|pwd)\s*[=:]\s*["\u0027][^"\u0027]+["\u0027]') does not account for cases where passwords are stored in variables without quotes or are concatenated dynamically.
      • The regex for detecting AWS Access Key IDs ('AKIA[0-9A-Z]{16}') does not account for other formats of access keys or potential obfuscation techniques.
    • Recommendation: Use more robust regex patterns or integrate specialized libraries like truffleHog for secret scanning.
  2. Sandbox Escape Vectors

    • Issue: The sandbox-safety.yaml configuration blocks dangerous modules and builtins, but it does not account for indirect imports or alternative methods of code execution. For example:
      • importlib is blocked, but __import__ can still be used to dynamically load modules.
      • Blocking exec, eval, and compile is good, but other paths such as pickle.loads() (or execfile(), which exists only on Python 2) can also execute arbitrary code.
    • Recommendation: Consider using a more comprehensive sandboxing solution, such as PyPy sandboxing or RestrictedPython, to enforce stricter controls.
  3. AES-256-GCM Implementation

    • Issue: The pull request mentions replacing XOR encryption with AES-256-GCM in the DMZ module. However, there is no code snippet provided for review. Improper implementation of AES encryption (e.g., reusing nonces or keys) can lead to vulnerabilities.
    • Recommendation: Ensure that:
      • Nonces are unique for every encryption operation.
      • Keys are securely generated and stored using a key management system (e.g., Azure Key Vault).
      • Cryptographic operations are audited for compliance with NIST standards.
  4. Thread Safety in Concurrent Agent Execution

    • Issue: The pull request mentions thread safety improvements but does not provide details. If agents share mutable state or resources, race conditions could lead to security vulnerabilities.
    • Recommendation: Verify that all shared resources (e.g., policy configurations, cryptographic keys) are properly synchronized using thread-safe mechanisms like locks or thread-local storage.

🟡 WARNING: Potential Breaking Changes

  1. Deprecation of create_default_policies()

    • Issue: Deprecating the create_default_policies() API introduces a runtime warning, which may disrupt existing workflows that rely on this function.
    • Recommendation: Provide a clear migration path in the documentation, including examples of how to transition to create_policies_from_config().
  2. Policy Configuration Externalization

    • Issue: Externalizing security policies to YAML files changes the default behavior of the library. Existing users may experience unexpected behavior if they do not update their configurations.
    • Recommendation: Add backward compatibility by allowing users to opt-in to the new configuration system while maintaining support for hardcoded defaults.

💡 Suggestions for Improvement

  1. OWASP Agentic Top 10 Compliance

    • Observation: The new policies address several OWASP Agentic Top 10 risks (e.g., ASI-8, ASI-10, LLM01). However, additional risks like ASI-2 (Data Poisoning) and ASI-6 (Over-reliance on AI) could be addressed.
    • Recommendation: Expand the policy configurations to include rules for detecting data poisoning attempts (e.g., adversarial examples) and scenarios where agents may over-rely on AI-generated outputs.
  2. Policy Validation

    • Observation: The YAML configurations are critical for security, but there is no mention of validation or schema enforcement.
    • Recommendation: Use Pydantic models to validate the structure and content of the YAML files before loading them into the policy engine.
  3. Documentation

    • Observation: While disclaimers are included in the sample configurations, the documentation does not provide sufficient guidance on customizing policies for specific use cases.
    • Recommendation: Expand the documentation to include:
      • Best practices for writing custom policies.
      • Examples of real-world scenarios and how to address them using the toolkit.
      • Guidance on testing and validating policies.
  4. Testing Coverage

    • Observation: The commit messages report 21 new SQL deny-list tests and 5 new config tests, but the pull request does not describe tests for the other new features.
    • Recommendation: Add unit tests for:
      • The create_policies_from_config() API.
      • Validation of YAML configurations.
      • Cryptographic operations in the DMZ module.
      • Thread safety in concurrent agent execution.
  5. Backward Compatibility

    • Observation: Deprecating create_default_policies() without a clear migration path may alienate existing users.
    • Recommendation: Maintain backward compatibility by providing a transitional period where both APIs are supported.

Conclusion

This pull request introduces valuable features and improvements, but there are critical security issues and potential breaking changes that need to be addressed before merging. The recommendations provided above should be implemented to ensure the robustness, security, and usability of the library.

Actions Required:

  1. Address the 🔴 CRITICAL security issues.
  2. Mitigate 🟡 WARNING potential breaking changes.
  3. Implement 💡 Suggestions for improvement.

Let me know if you need further assistance or clarification!

@github-actions

🤖 AI Agent: test-generator

🧪 Test Coverage Analysis

packages/agent-os/src/agent_os/cli/__init__.py

  • ✅ Existing coverage: Basic CLI initialization and argument parsing are likely covered if tests for CLI functionality exist.
  • ❌ Missing coverage: Specific edge cases for new CLI security rules and their integration with the create_policies_from_config API.
  • 💡 Suggested test cases:
    1. test_cli_security_rules_loading — Verify that the CLI correctly loads and applies security rules from the new YAML configuration file.
    2. test_cli_invalid_config_handling — Test behavior when the YAML configuration file is missing, malformed, or contains invalid rules.
    3. test_cli_security_rule_violation_detection — Simulate CLI inputs that violate the security rules and ensure they are flagged appropriately.

packages/agent-os/src/agent_os/integrations/conversation_guardian.py

  • ✅ Existing coverage: Likely covers basic functionality of the conversation guardian, such as detecting offensive language or escalation patterns.
  • ❌ Missing coverage: Edge cases for new thresholds and patterns introduced in the conversation-guardian.yaml configuration.
  • 💡 Suggested test cases:
    1. test_escalation_thresholds — Test conversation scenarios that hover around the escalation_score_threshold and escalation_critical_threshold to ensure proper classification.
    2. test_offensive_patterns_detection — Validate detection of offensive patterns with varying weights and thresholds.
    3. test_max_retry_cycles — Simulate a conversation with more than max_retry_cycles to ensure the guardian correctly halts further retries.
    4. test_transcript_limit — Verify that the guardian stops recording conversation transcripts after reaching max_transcript_entries.

packages/agent-os/src/agent_os/mcp_security.py

  • ✅ Existing coverage: Likely covers basic MCP security checks.
  • ❌ Missing coverage: New detection patterns for invisible Unicode characters, hidden instructions, and privilege escalation in the mcp-security.yaml configuration.
  • 💡 Suggested test cases:
    1. test_invisible_unicode_detection — Validate detection of invisible Unicode characters in MCP tool definitions.
    2. test_hidden_instruction_detection — Test detection of hidden instructions, such as "ignore all previous" or "override the above."
    3. test_encoded_payload_detection — Simulate encoded payloads (e.g., Base64, hex) and verify detection.
    4. test_privilege_escalation_detection — Test detection of privilege escalation attempts in MCP tool definitions.
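
As one possible shape for test_invisible_unicode_detection, the sketch below flags Unicode format-category (Cf) characters. `find_invisible_chars` is a hypothetical helper, not the toolkit's detector, and Cf also covers characters with legitimate uses (for example zero-width joiners in emoji sequences), so a real rule would likely need an allowlist.

```python
import unicodedata
from typing import List, Tuple


def find_invisible_chars(text: str) -> List[Tuple[int, str]]:
    """Return (index, Unicode name) for each format-category (Cf) character in text."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def test_invisible_unicode_detection() -> None:
    clean = "Reads a file and returns its contents."
    # Zero-width space after "file", right-to-left override after "returns".
    tainted = "Reads a file\u200b and returns\u202e its contents."
    assert find_invisible_chars(clean) == []
    assert [i for i, _ in find_invisible_chars(tainted)] == [12, 25]
```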

packages/agent-os/src/agent_os/mute_agent.py

  • ✅ Existing coverage: Likely covers basic muting functionality for agents.
  • ❌ Missing coverage: Edge cases for concurrency, such as multiple agents being muted/unmuted simultaneously.
  • 💡 Suggested test cases:
    1. test_concurrent_muting — Simulate multiple agents being muted/unmuted simultaneously to test for race conditions.
    2. test_mute_timeout_handling — Verify that the system handles timeouts gracefully when muting an agent.
    3. test_mute_state_persistence — Ensure that the mute state is correctly persisted and restored across system restarts.

packages/agent-os/src/agent_os/prompt_injection.py

  • ✅ Existing coverage: Likely covers basic prompt injection detection.
  • ❌ Missing coverage: New detection patterns and thresholds introduced in the prompt-injection-safety.yaml configuration.
  • 💡 Suggested test cases:
    1. test_direct_override_detection — Validate detection of direct override patterns, such as "ignore all previous instructions."
    2. test_role_play_detection — Test detection of role-playing patterns, such as "pretend you are" or "act as if you have no restrictions."
    3. test_context_manipulation_detection — Simulate context manipulation attempts, such as "the above instructions are wrong."
    4. test_sensitivity_thresholds — Verify that the system correctly applies different sensitivity thresholds (strict, balanced, permissive) for detection.

packages/agent-os/src/agent_os/sandbox.py

  • ✅ Existing coverage: Likely covers basic sandbox functionality, such as blocking specific modules and built-ins.
  • ❌ Missing coverage: New blocked modules and built-ins introduced in the sandbox-safety.yaml configuration.
  • 💡 Suggested test cases:
    1. test_blocked_modules — Verify that the sandbox correctly blocks the execution of newly added modules (e.g., subprocess, ctypes).
    2. test_blocked_builtins — Test that the sandbox prevents the use of newly added blocked built-ins (e.g., compile, __import__).
    3. test_sandbox_escape_attempts — Simulate sandbox escape attempts using combinations of blocked modules and built-ins.

packages/agent-os/src/agent_os/semantic_policy.py

  • ✅ Existing coverage: Likely covers basic semantic policy evaluation.
  • ❌ Missing coverage: New signal patterns and weights introduced in the semantic-policy.yaml configuration.
  • 💡 Suggested test cases:
    1. test_destructive_data_signals — Validate detection of destructive data patterns, such as "DROP TABLE" or "DELETE FROM" without WHERE.
    2. test_data_exfiltration_signals — Test detection of data exfiltration patterns, such as "SELECT INTO OUTFILE" or "curl |".
    3. test_privilege_escalation_signals — Verify detection of privilege escalation patterns, such as "GRANT ALL" or "sudo."
    4. test_signal_weight_thresholds — Simulate scenarios where signal weights are close to the defined thresholds to ensure proper classification.

Summary

The changes in this release introduce significant new functionality, particularly around configurable security policies and detection patterns. While there is likely existing test coverage for basic functionality, the new YAML configurations and their integration into the system introduce potential gaps in test coverage. The suggested test cases focus on domain-specific edge cases, such as policy evaluation, input validation, and concurrency, to ensure robust coverage for the new features.

@github-actions

🤖 AI Agent: security-scanner

Security Review for microsoft/agent-governance-toolkit Pull Request: release: v2.2.0

Summary

This pull request introduces several new features, including configurable security policies via YAML, enhanced SQL policy deny-lists, and improved encryption for the DMZ module. While these changes aim to enhance security, they also introduce potential risks. Below is a detailed security review based on the specified criteria.


Findings

1. Prompt Injection Defense Bypass

Severity: 🔴 CRITICAL

Issue:
The prompt-injection-safety.yaml configuration file introduces detection patterns for prompt injection attacks. However, the patterns are limited to specific keywords and phrases, which can be bypassed using obfuscation techniques such as character encoding, whitespace injection, or synonyms. For example:

  • Using Unicode homoglyphs (e.g., replacing "ignore" with "ignоre" using a Cyrillic 'о').
  • Splitting keywords with whitespace or special characters (e.g., "ig nore all instructions").
  • Using synonyms or paraphrased instructions (e.g., "disregard" could be replaced with "omit" or "neglect").

Attack Vector:
An attacker could craft a prompt that bypasses these patterns by using obfuscation techniques, leading to the AI agent executing malicious or unintended instructions.

Recommendation:

  • Implement a semantic analysis engine to detect intent rather than relying solely on regex patterns. Use natural language processing (NLP) techniques to identify paraphrased or obfuscated instructions.
  • Regularly update the detection patterns to include new bypass techniques.
  • Introduce a mechanism to monitor and log prompt injection attempts for continuous improvement of detection rules.
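
Short of full semantic analysis, a cheaper partial mitigation for the homoglyph and whitespace-splitting bypasses listed above is to normalize text before pattern matching. The sketch below is an assumption-laden illustration: the confusables map is deliberately tiny, the pattern is hypothetical, and paraphrases still slip through.

```python
import re
import unicodedata

# Tiny illustrative Cyrillic-to-Latin map; a real deployment would need a much larger table.
_CONFUSABLES = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0445": "x",
})

# Hypothetical override pattern, written against whitespace-free, lowercased text.
_OVERRIDE_RE = re.compile(r"(ignore|disregard)(all)?(previous|prior)?instructions")


def normalize(text: str) -> str:
    """Fold compatibility forms and known homoglyphs, then keep only ASCII letters."""
    folded = unicodedata.normalize("NFKC", text).lower().translate(_CONFUSABLES)
    # Dropping non-letters collapses "ig nore" and "i.g.n.o.r.e" to "ignore".
    return re.sub(r"[^a-z]", "", folded)


def looks_like_override(prompt: str) -> bool:
    return bool(_OVERRIDE_RE.search(normalize(prompt)))
```

Stripping all separators trades precision for recall: a benign sentence such as "do not ignore instructions" would also match, so a check like this is better suited to flagging for review than to hard blocking.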

2. Policy Engine Circumvention

Severity: 🟠 HIGH

Issue:
The new create_policies_from_config() API allows users to load security policies from YAML files. While this provides flexibility, it also introduces the risk of misconfiguration or malicious tampering with the YAML files. For example:

  • A user could unintentionally deploy a policy with overly permissive rules.
  • An attacker with access to the YAML files could modify them to weaken security policies.

Attack Vector:
If an attacker gains access to the YAML configuration files, they could modify the policies to bypass security checks, potentially leading to unauthorized actions or data exfiltration.

Recommendation:

  • Implement strong validation and schema enforcement for the YAML files. Use a library like jsonschema to validate the structure and content of the configuration files.
  • Enforce strict access controls on the YAML files to prevent unauthorized modifications.
  • Log and monitor changes to the configuration files and alert administrators of any unauthorized changes.
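
To make the validation recommendation concrete, here is a stdlib-only stand-in for a jsonschema-style check. The key names are hypothetical (the PR does not show the real schema), and the input is assumed to be a mapping that was already parsed from YAML.

```python
from typing import Any, List, Mapping

# Hypothetical top-level keys; anything else is rejected rather than silently ignored.
_ALLOWED_KEYS = {"version", "blocked_statements", "require_where_for"}


def validate_policy(data: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the mapping passed."""
    errors: List[str] = []
    unknown = set(data) - _ALLOWED_KEYS
    if unknown:
        errors.append(f"unknown keys: {sorted(unknown)}")
    blocked = data.get("blocked_statements")
    if not isinstance(blocked, list) or not blocked:
        # Fail closed: an empty or missing deny-list is treated as a misconfiguration.
        errors.append("blocked_statements must be a non-empty list")
    elif not all(isinstance(s, str) and s.strip() for s in blocked):
        errors.append("blocked_statements entries must be non-empty strings")
    return errors
```

Rejecting unknown keys and empty deny-lists means a typo or a tampered file that removes every rule is reported instead of producing a permissive policy.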

3. Trust Chain Weaknesses

Severity: 🔵 LOW

Issue:
The pull request does not explicitly address trust chain validation for the ESRP publishing infrastructure. While the use of EsrpRelease@11 is mentioned, there is no evidence of additional measures like certificate pinning or SPIFFE/SVID validation.

Attack Vector:
If the ESRP infrastructure or its dependencies are compromised, malicious actors could inject unauthorized code into the published packages.

Recommendation:

  • Ensure that the ESRP infrastructure uses certificate pinning and validates the integrity of the build artifacts before publishing.
  • Implement a secure signing process for the published packages and verify the signatures during deployment.

4. Credential Exposure

Severity: 🔵 LOW

Issue:
The cli-security-rules.yaml file includes patterns to detect hardcoded credentials (e.g., API keys, passwords). However, there is no mention of automated testing or validation to ensure that these patterns are effective.

Attack Vector:
If the patterns fail to detect hardcoded credentials, sensitive information could be exposed in logs, error messages, or source code.

Recommendation:

  • Integrate automated tests to validate the effectiveness of the credential detection patterns.
  • Provide guidance on securely managing credentials, such as using environment variables or secrets management tools.
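
Automated tests for the credential patterns could start as small as the sketch below. The two regexes are the ones quoted in the code-reviewer comment earlier in this thread (with \u0027 written as a literal single quote); the test names are made up, and whether the shipped YAML uses exactly these patterns or flags is an assumption.

```python
import re

# Patterns as quoted in the code-reviewer comment; \u0027 is the single-quote character.
PASSWORD_RE = re.compile(r"""(password|passwd|pwd)\s*[=:]\s*["'][^"']+["']""")
AWS_KEY_RE = re.compile(r"AKIA[0-9A-Z]{16}")


def test_detects_quoted_password() -> None:
    assert PASSWORD_RE.search('password = "hunter2"')
    assert PASSWORD_RE.search("pwd: 'hunter2'")


def test_misses_unquoted_password() -> None:
    # Documents a known gap rather than asserting desired behavior.
    assert PASSWORD_RE.search("password = hunter2") is None


def test_detects_aws_access_key_id() -> None:
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    assert AWS_KEY_RE.search("key=AKIAIOSFODNN7EXAMPLE")
    assert AWS_KEY_RE.search("key=AKIA1234") is None
```

Pinning the known misses in a test makes the limits of each pattern visible to anyone who later edits the YAML.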

5. Sandbox Escape

Severity: 🟠 HIGH

Issue:
The sandbox-safety.yaml file blocks certain Python modules and built-ins to prevent sandbox escapes. However, the list of blocked modules and functions is not exhaustive. For example:

  • The os module is blocked, but specific functions like os.system or os.popen are not explicitly mentioned.
  • The subprocess module is blocked, but alternative methods like multiprocessing.Process or ctypes could still be used for sandbox escapes.

Attack Vector:
An attacker could use alternative methods or modules not explicitly blocked to execute arbitrary code or escape the sandbox.

Recommendation:

  • Use a deny-by-default approach, where only explicitly allowed modules and functions are permitted.
  • Consider using a secure sandboxing library or framework, such as PyPy's sandboxing or Docker containers with strict resource limits.
  • Regularly review and update the list of blocked modules and functions.
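
The deny-by-default idea can be illustrated with a static check over the parsed source. This is a sketch under assumptions: the allowlist contents and the `find_violations` name are hypothetical, and a static AST scan is not a sandbox by itself (attribute tricks and already-imported objects can still bypass it).

```python
import ast
from typing import List, Set

ALLOWED_MODULES: Set[str] = {"math", "json", "datetime"}  # hypothetical allowlist
BLOCKED_BUILTINS: Set[str] = {"exec", "eval", "compile", "__import__"}


def find_violations(source: str) -> List[str]:
    """List imports outside the allowlist and references to blocked builtins."""
    violations: List[str] = []
    for node in ast.walk(ast.parse(source)):
        names: List[str] = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have module=None; they are denied like any unknown module.
            names = [node.module or ""]
        elif isinstance(node, ast.Name) and node.id in BLOCKED_BUILTINS:
            violations.append(f"builtin:{node.id}")
        for name in names:
            # Only the top-level package is compared, so "os.path" is judged as "os".
            if name.split(".")[0] not in ALLOWED_MODULES:
                violations.append(f"import:{name}")
    return violations
```

Because anything not listed is rejected, modules such as multiprocessing or ctypes are refused without having to be named in a deny-list.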

6. Deserialization Attacks

Severity: 🟠 HIGH

Issue:
The create_policies_from_config() API relies on YAML files for configuration. If the YAML parser used is not secure, it could be vulnerable to deserialization attacks, allowing attackers to execute arbitrary code.

Attack Vector:
An attacker could craft a malicious YAML file that exploits deserialization vulnerabilities in the YAML parser, leading to remote code execution.

Recommendation:

  • Use a safe YAML loading mode, such as PyYAML's yaml.safe_load() (rather than yaml.load() with an unsafe Loader) or ruamel.yaml with YAML(typ="safe").
  • Validate the content of the YAML files against a strict schema before processing.

7. Race Conditions

Severity: 🟡 MEDIUM

Issue:
The pull request does not address potential race conditions in the policy engine, especially when multiple threads or processes are accessing or modifying the loaded policies.

Attack Vector:
A race condition could allow an attacker to modify or bypass security policies during runtime, leading to inconsistent or insecure behavior.

Recommendation:

  • Implement thread-safe mechanisms for accessing and modifying policies, such as using locks or atomic operations.
  • Test the policy engine under concurrent access scenarios to identify and mitigate potential race conditions.
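
One common way to follow the locking recommendation is to publish policies as immutable snapshots that are swapped under a lock. `PolicyStore` below is a hypothetical class for illustration, not part of the toolkit.

```python
import threading
from types import MappingProxyType
from typing import Any, Mapping


class PolicyStore:
    """Hypothetical holder for loaded policies; swaps whole snapshots under a lock."""

    def __init__(self, initial: Mapping[str, Any]) -> None:
        self._lock = threading.Lock()
        self._snapshot: Mapping[str, Any] = MappingProxyType(dict(initial))

    def get(self) -> Mapping[str, Any]:
        """Return the current snapshot as a read-only view."""
        with self._lock:
            return self._snapshot

    def replace(self, new_policies: Mapping[str, Any]) -> None:
        """Publish a new snapshot; readers see either the old or the new one, never a mix."""
        fresh = MappingProxyType(dict(new_policies))  # copy outside the lock
        with self._lock:
            self._snapshot = fresh
```

Note that MappingProxyType is only shallowly read-only: nested lists or dicts inside the snapshot remain mutable, so values are best stored as tuples or other immutable types.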

8. Supply Chain Risks

Severity: 🟠 HIGH

Issue:
The pull request updates the publishing pipelines for PyPI and npm but does not mention any measures to mitigate supply chain risks, such as dependency confusion or typosquatting.

Attack Vector:
An attacker could exploit dependency confusion or typosquatting to inject malicious code into the project during the build or deployment process.

Recommendation:

  • Use dependency pinning to lock versions of all dependencies.
  • Regularly audit dependencies for known vulnerabilities using tools like npm audit or pip-audit.
  • Implement a package signing mechanism to ensure the integrity of published packages.

Summary of Findings

| Finding | Severity | Recommendation |
| --- | --- | --- |
| Prompt injection defense bypass | 🔴 CRITICAL | Use semantic analysis and NLP for intent detection; update patterns regularly. |
| Policy engine circumvention | 🟠 HIGH | Validate YAML schemas; enforce access controls and monitor changes. |
| Trust chain weaknesses | 🔵 LOW | Use certificate pinning and secure signing for ESRP publishing infrastructure. |
| Credential exposure | 🔵 LOW | Add automated tests for credential detection patterns; provide secure practices. |
| Sandbox escape | 🟠 HIGH | Use deny-by-default for modules/functions; consider secure sandboxing frameworks. |
| Deserialization attacks | 🟠 HIGH | Use safe YAML parsers and validate against strict schemas. |
| Race conditions | 🟡 MEDIUM | Implement thread-safe mechanisms; test for concurrent access issues. |
| Supply chain risks | 🟠 HIGH | Use dependency pinning, audit dependencies, and implement package signing. |

Final Recommendation

While this pull request introduces valuable security enhancements, it also introduces potential risks, particularly in the areas of prompt injection defense, policy engine circumvention, sandbox escapes, deserialization attacks, and supply chain security. Addressing these issues is critical to ensure the robustness of the microsoft/agent-governance-toolkit.

…NGELOG

Address PR review feedback — add create_policies_from_config() API,
SQLPolicyConfig, 10 sample configs, and create_default_policies()
deprecation to the v2.2.0 CHANGELOG section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions bot added the documentation label (Improvements or additions to documentation) on Mar 18, 2026
@github-actions

🤖 AI Agent: breaking-change-detector

🔍 API Compatibility Report

Summary

This release introduces several new features, security enhancements, and infrastructure updates. Most changes are additive. No outright breaking changes were detected, but create_default_policies() is now deprecated, so callers should plan to migrate.

Findings

| Severity | Package | Change | Impact |
| --- | --- | --- | --- |
| 🟡 | agent-runtime | create_default_policies() deprecated with runtime warning | Users relying on this function must migrate to create_policies_from_config() |
| 🔵 | agent-runtime | create_policies_from_config() added | New API for loading policies from YAML configuration |
| 🔵 | agent-runtime | SQLPolicyConfig dataclass and load_sql_policy_config() added | Structured policy loading introduced |
| 🔵 | agent-runtime | 10 sample policy configurations added | Provides examples for configurable security rules |

Migration Guide

For create_default_policies() Users:

  • Impact: The function is deprecated and emits a runtime warning. It will likely be removed in a future release.
  • Migration: Replace calls to create_default_policies() with create_policies_from_config() and provide explicit YAML configurations. Refer to the sample configurations in examples/policies/.

For New Features:

  • Usage: Utilize create_policies_from_config() to load security policies dynamically from YAML files. Use the provided sample configurations as templates to define custom policies.

Conclusion

No breaking changes detected. The release is safe for upgrade, but users should update their code to accommodate the deprecation of create_default_policies() and leverage the new configuration-based policy system.

@github-actions

🤖 AI Agent: docs-sync-checker

📝 Documentation Sync Report

Issues Found

  1. create_policies_from_config() in packages/{name}/src/ — missing docstring.
  2. ⚠️ packages/{name}/README.md — no mention of the new create_policies_from_config() API or the deprecation of create_default_policies().
  3. ⚠️ CHANGELOG.md — while the new create_policies_from_config() API and the deprecation of create_default_policies() are mentioned, the CHANGELOG does not explicitly describe the behavioral changes or the impact of the deprecation.
  4. ⚠️ examples/ — new YAML configuration files are added, but there is no mention of them in the README or any other documentation. Their purpose and usage should be documented.

Suggestions

  • 💡 Add a docstring for create_policies_from_config(). Include details about its purpose, parameters, return values, and any exceptions it may raise.
  • 💡 Update packages/{name}/README.md to:
    • Document the new create_policies_from_config() API, including how to use it with the YAML configurations.
    • Mention the deprecation of create_default_policies() and provide guidance on transitioning to the new API.
    • Add a section explaining the new YAML configuration files in examples/policies/, including their purpose and how to use them.
  • 💡 Expand the CHANGELOG entry for create_policies_from_config() and the deprecation of create_default_policies() to include more details about the behavioral changes and their impact.
  • 💡 Ensure that any example code in examples/ or packages/{name}/README.md is updated to reflect the new create_policies_from_config() API and the deprecation of create_default_policies().

Type Hints

  • The new create_policies_from_config() API should include complete type annotations for its parameters and return type.

Final Assessment

The documentation is not yet in sync. Please address the issues and suggestions above to ensure consistency and clarity across the repository.


@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Pull Request Review: Release v2.2.0

This release introduces significant updates to the microsoft/agent-governance-toolkit repository, including security fixes, configurable security policies, ESRP compliance, and hardening measures. Below is a detailed review of the changes based on the focus areas.


🔴 CRITICAL: Security Issues

  1. AES-256-GCM Implementation

    • The replacement of XOR placeholder encryption with AES-256-GCM is a critical improvement. However, ensure that:
      • The AES key is securely generated using a cryptographically secure random number generator (e.g., os.urandom or secrets module).
      • The nonce/IV is unique for every encryption operation to prevent vulnerabilities like nonce reuse attacks.
      • Key management practices are robust (e.g., keys stored in a secure vault).
    • Action: Verify the implementation of AES-256-GCM in the DMZ module for adherence to cryptographic best practices.
  2. Sandbox Escape Vectors

    • The sandbox-safety.yaml configuration blocks dangerous modules (subprocess, os, shutil, etc.) and built-ins (exec, eval, etc.). However:
      • Ensure that dynamic imports via importlib or __import__ are comprehensively blocked.
      • Validate that the sandbox cannot be bypassed through indirect means (e.g., loading malicious code via pickle or marshal).
    • Action: Conduct penetration testing to confirm the sandbox's robustness against escape vectors.
  3. SQL Policy Deny-List Expansion

    • Blocking dangerous SQL operations like GRANT, REVOKE, CREATE USER, EXEC xp_cmdshell, and UPDATE without WHERE is essential. However:
      • Ensure that the deny-list is applied consistently across all modules where SQL queries are processed.
      • Validate that the deny-list cannot be bypassed by obfuscation techniques (e.g., using comments or concatenation).
    • Action: Add unit tests to verify that all deny-list patterns are correctly enforced.
  4. Prompt Injection Safety

    • The prompt-injection-safety.yaml configuration includes patterns for detecting prompt injection attacks. However:
      • Ensure that the detection logic accounts for encoded payloads (e.g., Base64, hex) and multi-turn manipulation.
      • Validate that the sensitivity thresholds (strict, balanced, permissive) are correctly applied in runtime.
    • Action: Perform adversarial testing to confirm the effectiveness of prompt injection detection.

🟡 WARNING: Breaking Changes

  1. Deprecation of create_default_policies()

    • The deprecation of create_default_policies() in favor of create_policies_from_config() introduces a runtime warning. This change may break existing integrations that rely on the default policy creation method.
    • Action: Provide clear migration instructions in the documentation and ensure backward compatibility by maintaining the deprecated method for at least one major release cycle.
  2. Policy Configuration Externalization

    • Externalizing security policies to YAML files is a significant architectural change. Ensure that:
      • Existing users are informed about the migration path.
      • The new configuration system is backward-compatible with hardcoded defaults.
    • Action: Add a fallback mechanism to load hardcoded defaults if YAML files are missing or invalid.

💡 Suggestions for Improvement

  1. Thread Safety in Concurrent Agent Execution

    • While thread safety issues are mentioned in the security advisories, ensure that:
      • Shared resources (e.g., policy configurations, cryptographic keys) are protected using synchronization primitives like locks or semaphores.
      • The codebase is free from race conditions, especially in modules handling concurrent agent execution.
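    • Action: Add multi-threaded stress tests (for example, with concurrent.futures.ThreadPoolExecutor) to surface thread safety issues. Note that pytest-xdist parallelizes tests across separate processes, so on its own it does not exercise shared in-process state.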
  2. OWASP Agentic Top 10 Compliance

    • The release addresses several OWASP Agentic Top 10 risks (e.g., ASI-1, ASI-8, ASI-10). However:
      • Consider adding explicit tests for ASI-2 (Agent Identity Spoofing) and ASI-7 (Agent Memory Manipulation).
      • Expand the conversation-guardian.yaml configuration to detect memory manipulation attempts.
    • Action: Add compliance tests for all OWASP Agentic Top 10 risks.
  3. Type Safety and Pydantic Model Validation

    • The introduction of SQLPolicyConfig and load_sql_policy_config() is a positive step. Ensure that:
      • All YAML configurations are validated using Pydantic models.
      • Validation errors are logged and handled gracefully.
    • Action: Add unit tests to verify the correctness of Pydantic model validation.
  4. Policy Examples

    • The addition of 10 sample policy configurations is helpful. However:
      • Include detailed documentation for each policy, explaining its purpose, usage, and limitations.
      • Provide guidelines for customizing the policies based on specific use cases.
    • Action: Expand the examples/policies/ directory with README files for each policy.

Summary of Actions

Critical

  • Verify AES-256-GCM implementation for cryptographic best practices.
  • Test sandbox for escape vectors and indirect bypass methods.
  • Add unit tests for SQL deny-list enforcement.
  • Perform adversarial testing for prompt injection detection.

Warning

  • Provide migration instructions for create_default_policies() deprecation.
  • Ensure backward compatibility for policy configuration externalization.

Suggestions

  • Test thread safety in concurrent agent execution.
  • Add OWASP Agentic Top 10 compliance tests.
  • Validate YAML configurations using Pydantic models.
  • Document and expand sample policy configurations.

Final Verdict

This release introduces critical security improvements and valuable features, but it also carries potential breaking changes and areas requiring further validation. Address the flagged issues and suggestions to ensure a robust and secure release.

@github-actions

🤖 AI Agent: security-scanner

Security Review for PR: release: v2.2.0 — ESRP compliance, configurable security policies, hardening


1. Prompt Injection Defense Bypass

Finding: 🔴 CRITICAL

The prompt-injection-safety.yaml configuration file introduces detection patterns for prompt injection attacks. However, the patterns are not comprehensive and may fail to detect more sophisticated prompt injection attempts. For example:

  • The direct_override patterns do not account for variations in spacing, capitalization, or obfuscation (e.g., "ignore ALL previous instructions", "IGNORE all PreVIOUS instructions").
  • The role_play patterns do not account for synonyms or alternative phrasing (e.g., "assume the role of", "you should act like").
  • The context_manipulation patterns do not account for indirect references to instructions (e.g., "the instructions above are incorrect" vs. "the instructions provided earlier are wrong").
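
The spacing and capitalization gaps listed above can be narrowed within the regex layer itself. The pattern below is a hypothetical illustration and is not taken from prompt-injection-safety.yaml; the function name is likewise an assumption.

```python
import re

# Hypothetical pattern; not the contents of prompt-injection-safety.yaml.
# \s+ tolerates runs of whitespace (including newlines) and IGNORECASE
# tolerates mixed capitalization.
DIRECT_OVERRIDE = re.compile(
    r"\bignore\s+(?:all\s+)?(?:previous|prior|earlier)\s+instructions\b",
    re.IGNORECASE,
)


def looks_like_direct_override(prompt: str) -> bool:
    """Return True if the prompt matches the illustrative override pattern."""
    return DIRECT_OVERRIDE.search(prompt) is not None
```

A pattern like this still misses synonyms and paraphrases, which is why the recommendation below favors layering it with semantic detection.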

Attack Vector:
An attacker could craft a prompt that bypasses these detection patterns by using slight variations in phrasing, spacing, or encoding. This could lead to the AI agent executing unintended or malicious instructions.

Recommendation:

  • Use a more robust natural language processing (NLP) model to detect semantic intent rather than relying solely on regex patterns.
  • Implement a layered defense approach, combining regex patterns with machine learning models trained on a diverse dataset of prompt injection attempts.
  • Regularly update the detection patterns to include new variations and techniques observed in the wild.

2. Policy Engine Circumvention

Finding: 🟠 HIGH

The new create_policies_from_config() API allows users to load security policies from YAML configuration files. While this provides flexibility, it introduces the risk of policy circumvention if the YAML files are not properly validated. For example:

  • Malicious or poorly formatted YAML files could disable critical security rules.
  • There is no evidence in the PR that the YAML files are validated against a strict schema.

Attack Vector:
An attacker with access to the YAML configuration files could introduce malicious changes, such as disabling key security rules or introducing overly permissive policies.

Recommendation:

  • Validate YAML files against a strict schema before loading them.
  • Implement a checksum or signature verification mechanism to ensure the integrity of the YAML files.
  • Log and alert on any changes to the YAML configuration files.
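
The first two recommendations can be sketched with the standard library alone. The example assumes the YAML has already been parsed into a dictionary by a safe loader; the field names (`name`, `enabled`, `blocked_patterns`) and the `PolicyConfig` class are hypothetical and do not reflect the toolkit's real schema.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical schema: every key is required and nothing else is accepted.
ALLOWED_KEYS = {"name", "enabled", "blocked_patterns"}


@dataclass(frozen=True)
class PolicyConfig:
    name: str
    enabled: bool
    blocked_patterns: tuple


def verify_checksum(raw: bytes, expected_sha256: str) -> None:
    """Raise ValueError if the file bytes do not match the recorded digest."""
    if hashlib.sha256(raw).hexdigest() != expected_sha256:
        raise ValueError("policy file checksum mismatch")


def validate_policy(data: dict) -> PolicyConfig:
    """Reject unknown keys, missing keys, and wrongly typed values."""
    unknown = set(data) - ALLOWED_KEYS
    missing = ALLOWED_KEYS - set(data)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    if not isinstance(data["name"], str):
        raise ValueError("name must be a string")
    if not isinstance(data["enabled"], bool):
        raise ValueError("enabled must be a boolean")
    patterns = data["blocked_patterns"]
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("blocked_patterns must be a list of strings")
    return PolicyConfig(data["name"], data["enabled"], tuple(patterns))
```

Rejecting unknown keys is the stricter choice here: a typo such as `enabeld: false` fails loudly instead of being silently ignored.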

3. Trust Chain Weaknesses

Finding: 🔵 LOW

No direct changes in this PR affect trust chain mechanisms like SPIFFE/SVID validation or certificate pinning. However, the introduction of YAML-based configurations could indirectly impact trust if policies governing trust validation are weakened.

Recommendation:

  • Ensure that trust-related policies (e.g., SPIFFE/SVID validation) are not configurable via YAML or are subject to stricter validation and monitoring.

4. Credential Exposure

Finding: 🟠 HIGH

The cli-security-rules.yaml and pii-detection.yaml configurations include patterns for detecting hardcoded credentials (e.g., API keys, passwords). However:

  • There is no mechanism to prevent these patterns from being logged during detection.
  • If logs are not sanitized, sensitive information could be exposed.

Attack Vector:
Sensitive information detected by these patterns could be inadvertently logged, leading to credential exposure.

Recommendation:

  • Ensure that any logs related to these patterns are sanitized to remove sensitive information.
  • Implement a secure logging mechanism that redacts sensitive data before writing to logs.
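
One way to approach the redaction recommendation is a `logging.Filter` that rewrites each record before any handler sees it. The regular expression below is an illustrative assumption, not a pattern from cli-security-rules.yaml or pii-detection.yaml.

```python
import logging
import re

# Illustrative pattern only; real detectors would come from the policy files.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|password|token)\s*[=:]\s*\S+")


class RedactingFilter(logging.Filter):
    """Replace secret-looking values in a record with a fixed placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so that secrets passed as %-style arguments are covered.
        message = record.getMessage()
        record.msg = SECRET_PATTERN.sub(
            lambda match: f"{match.group(1)}=[REDACTED]", message
        )
        record.args = ()
        return True  # keep the record, now sanitized
```

Attaching the filter to a handler (`handler.addFilter(RedactingFilter())`) rather than to a single logger means records propagated from child loggers are sanitized as well.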

5. Sandbox Escape

Finding: 🔴 CRITICAL

The sandbox-safety.yaml configuration blocks certain Python modules and builtins to prevent sandbox escapes. However:

  • The list of blocked modules and builtins is not exhaustive (e.g., os.system is not explicitly blocked, and shell commands can still be run via os.popen).
  • There is no indication that the sandbox enforces these restrictions at runtime.

Attack Vector:
An attacker could exploit unblocked modules or functions to execute arbitrary code or escape the sandbox environment.

Recommendation:

  • Use a secure sandboxing library (e.g., PyPy's sandboxing features or Docker containers) to enforce isolation.
  • Regularly review and update the list of blocked modules and builtins.
  • Implement runtime checks to ensure that restricted modules and functions are not being accessed.
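
As a complement to process-level isolation, a static pre-check can reject obviously unsafe code before it runs. The sketch below uses the standard `ast` module; the blocked names are assumptions drawn from examples in this review, not the contents of sandbox-safety.yaml, and a check like this is easy to evade (for instance through `getattr` tricks), so it is not a substitute for real sandboxing.

```python
import ast

# Assumed lists for illustration; not the contents of sandbox-safety.yaml.
BLOCKED_MODULES = {"subprocess", "os", "shutil", "importlib"}
BLOCKED_CALLS = {"exec", "eval", "__import__"}


def find_violations(source: str) -> list:
    """Return labels for blocked imports and direct calls found in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            # Compare on the top-level package so "os.path" is caught too.
            if name.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import:{name}")
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call:{node.func.id}")
    return violations
```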

6. Deserialization Attacks

Finding: 🟠 HIGH

The create_policies_from_config() API relies on YAML files for configuration. YAML deserialization can be exploited if the YAML parser is not configured securely. For example:

  • The yaml.load() function in PyYAML is vulnerable to arbitrary code execution if untrusted YAML is deserialized.

Attack Vector:
An attacker could craft a malicious YAML file that exploits unsafe deserialization to execute arbitrary code on the server.

Recommendation:

  • Use yaml.safe_load() instead of yaml.load() for deserialization.
  • If using a different YAML library, ensure it has safeguards against arbitrary code execution.

7. Race Conditions

Finding: 🟡 MEDIUM

The introduction of YAML-based configurations raises the potential for race conditions in policy loading. For example:

  • If multiple threads or processes attempt to load or modify the same YAML file simultaneously, it could lead to inconsistent or corrupted policies.

Attack Vector:
An attacker could exploit race conditions to introduce malicious changes to the policies or cause the application to crash.

Recommendation:

  • Implement file locking mechanisms to prevent concurrent access to YAML files.
  • Use atomic file operations to ensure that changes to YAML files are applied safely.
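
The atomic-write recommendation is commonly implemented by writing to a temporary file in the same directory and then renaming it over the target. The function name below is hypothetical; the pattern relies on `os.replace`, which swaps the file in a single step when source and destination are on the same filesystem.

```python
import os
import tempfile


def write_policy_atomically(path: str, text: str) -> None:
    """Write text to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the target directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This protects readers from partial files but does not by itself serialize concurrent writers; the last rename wins.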

8. Supply Chain Risks

Finding: 🟠 HIGH

The PR introduces new dependencies for YAML parsing and AES-256-GCM encryption. However, there is no evidence that these dependencies have been audited for security vulnerabilities or that their integrity is verified during installation.

Attack Vector:
An attacker could exploit vulnerabilities in these dependencies or use dependency confusion/typosquatting attacks to introduce malicious code.

Recommendation:

  • Use a dependency scanning tool (e.g., Dependabot, Snyk) to identify and address vulnerabilities in dependencies.
  • Pin dependency versions to specific, known-good versions.
  • Verify the integrity of dependencies using checksums or signatures.

Summary of Findings

| Category | Severity | Description |
|---|---|---|
| Prompt Injection Defense Bypass | 🔴 CRITICAL | Regex patterns are insufficient to detect advanced prompt injection techniques. |
| Policy Engine Circumvention | 🟠 HIGH | Lack of YAML validation allows potential policy circumvention. |
| Trust Chain Weaknesses | 🔵 LOW | No direct issues, but YAML configurations could indirectly weaken trust. |
| Credential Exposure | 🟠 HIGH | Detected sensitive data could be exposed in logs. |
| Sandbox Escape | 🔴 CRITICAL | Incomplete sandbox restrictions could allow escapes. |
| Deserialization Attacks | 🟠 HIGH | YAML deserialization may be unsafe. |
| Race Conditions | 🟡 MEDIUM | Potential for race conditions in YAML policy loading. |
| Supply Chain Risks | 🟠 HIGH | New dependencies lack evidence of auditing or integrity verification. |

Recommendations

  1. Prompt Injection Defense: Use NLP models and layered defenses to detect semantic intent.
  2. Policy Validation: Enforce strict schema validation and integrity checks for YAML files.
  3. Secure Logging: Sanitize logs to prevent credential exposure.
  4. Sandbox Hardening: Use secure sandboxing libraries and runtime enforcement.
  5. Safe Deserialization: Use yaml.safe_load() or equivalent secure deserialization methods.
  6. Concurrency Control: Implement file locking and atomic operations for YAML files.
  7. Dependency Security: Audit new dependencies and verify their integrity.

This PR introduces significant improvements but also opens critical security gaps. Addressing these findings is essential before merging.

@github-actions

🤖 AI Agent: test-generator

🧪 Test Coverage Analysis

packages/agent-os/src/agent_os/cli/__init__.py

  • Existing coverage: Basic CLI initialization and argument parsing are likely covered, as this is a core entry point for the CLI.
  • Missing coverage:
    • New functionality related to configurable security policies via YAML files.
    • Validation of YAML configurations for CLI security rules.
    • Error handling for malformed or missing YAML files.
  • 💡 Suggested test cases:
    1. test_load_valid_cli_security_config — Test loading a valid cli-security-rules.yaml file and ensure the rules are applied correctly.
    2. test_load_invalid_cli_security_config — Test loading a malformed YAML file and verify that appropriate exceptions are raised.
    3. test_missing_cli_security_config — Test behavior when the YAML configuration file is missing.
    4. test_cli_security_rule_enforcement — Simulate CLI commands that violate the security rules and verify that they are blocked.

packages/agent-os/src/agent_os/integrations/conversation_guardian.py

  • Existing coverage: Likely covers basic conversation monitoring and escalation detection.
  • Missing coverage:
    • New thresholds and patterns for escalation and offensive behavior detection.
    • Handling of edge cases like maximum retry cycles and transcript limits.
    • Detection of bypass directives and offensive patterns.
  • 💡 Suggested test cases:
    1. test_escalation_score_threshold — Verify that escalation scores above the threshold trigger appropriate actions.
    2. test_offensive_behavior_detection — Test detection of offensive patterns with varying severity levels.
    3. test_max_retry_cycles — Ensure the system halts after exceeding the maximum retry cycles.
    4. test_transcript_limit — Verify that the system correctly handles scenarios where the transcript exceeds the maximum allowed entries.
    5. test_bypass_directives_detection — Test detection of bypass directives in agent-to-agent conversations.

packages/agent-os/src/agent_os/mcp_security.py

  • Existing coverage: Likely covers basic MCP security checks.
  • Missing coverage:
    • Detection of new patterns like invisible Unicode characters, hidden comments, and encoded payloads.
    • Handling of suspicious decoded keywords.
    • Validation of YAML configuration for MCP security rules.
  • 💡 Suggested test cases:
    1. test_detect_invisible_unicode — Test detection of invisible Unicode characters in input.
    2. test_detect_hidden_comments — Verify detection of hidden comments in tool definitions.
    3. test_detect_encoded_payloads — Test detection of encoded payloads in tool definitions.
    4. test_suspicious_decoded_keywords — Verify detection of suspicious keywords in decoded content.
    5. test_load_valid_mcp_security_config — Test loading a valid mcp-security.yaml file and ensure the rules are applied correctly.
    6. test_load_invalid_mcp_security_config — Test loading a malformed YAML file and verify that appropriate exceptions are raised.

packages/agent-os/src/agent_os/mute_agent.py

  • Existing coverage: Likely covers basic muting functionality for agents.
  • Missing coverage:
    • Edge cases for muting agents during cascading failures or partial failures.
    • Concurrency issues like race conditions when muting multiple agents simultaneously.
  • 💡 Suggested test cases:
    1. test_mute_agent_partial_failure — Simulate a partial failure scenario and verify that the agent is muted correctly.
    2. test_mute_agent_concurrent_requests — Test muting multiple agents simultaneously to ensure no race conditions occur.
    3. test_mute_agent_timeout_handling — Verify behavior when muting an agent takes longer than expected.

packages/agent-os/src/agent_os/prompt_injection.py

  • Existing coverage: Likely covers basic prompt injection detection.
  • Missing coverage:
    • Detection of new patterns like direct overrides, role play, and context manipulation.
    • Handling of encoding patterns and suspicious decoded keywords.
    • Validation of YAML configuration for prompt injection safety rules.
  • 💡 Suggested test cases:
    1. test_detect_direct_override — Test detection of direct override patterns in prompts.
    2. test_detect_role_play_patterns — Verify detection of role play patterns in prompts.
    3. test_detect_context_manipulation — Test detection of context manipulation patterns in prompts.
    4. test_detect_encoding_patterns — Verify detection of encoded payloads in prompts.
    5. test_load_valid_prompt_injection_config — Test loading a valid prompt-injection-safety.yaml file and ensure the rules are applied correctly.
    6. test_load_invalid_prompt_injection_config — Test loading a malformed YAML file and verify that appropriate exceptions are raised.

packages/agent-os/src/agent_os/sandbox.py

  • Existing coverage: Likely covers basic sandboxing functionality.
  • Missing coverage:
    • Enforcement of new blocked modules and built-ins from the sandbox safety configuration.
    • Handling of attempts to bypass sandbox restrictions.
    • Validation of YAML configuration for sandbox safety rules.
  • 💡 Suggested test cases:
    1. test_blocked_modules_enforcement — Verify that blocked modules (e.g., subprocess, os) cannot be imported or used within the sandbox.
    2. test_blocked_builtins_enforcement — Test that blocked built-ins (e.g., exec, eval) cannot be executed within the sandbox.
    3. test_sandbox_bypass_attempt — Simulate an attempt to bypass sandbox restrictions and verify that it is blocked.
    4. test_load_valid_sandbox_safety_config — Test loading a valid sandbox-safety.yaml file and ensure the rules are applied correctly.
    5. test_load_invalid_sandbox_safety_config — Test loading a malformed YAML file and verify that appropriate exceptions are raised.

packages/agent-os/src/agent_os/semantic_policy.py

  • Existing coverage: Likely covers basic semantic policy evaluation.
  • Missing coverage:
    • Evaluation of new signal patterns for destructive data, exfiltration, and privilege escalation.
    • Handling of conflicting policies or overlapping signal patterns.
    • Validation of YAML configuration for semantic policies.
  • 💡 Suggested test cases:
    1. test_evaluate_destructive_data_signals — Test detection of destructive data patterns with varying weights.
    2. test_evaluate_data_exfiltration_signals — Verify detection of data exfiltration patterns with varying weights.
    3. test_evaluate_privilege_escalation_signals — Test detection of privilege escalation patterns with varying weights.
    4. test_conflicting_policies — Simulate scenarios with conflicting policies and verify the resolution logic.
    5. test_load_valid_semantic_policy_config — Test loading a valid semantic-policy.yaml file and ensure the rules are applied correctly.
    6. test_load_invalid_semantic_policy_config — Test loading a malformed YAML file and verify that appropriate exceptions are raised.

Summary

The changes in this release introduce significant new functionality, particularly around configurable security policies and enhanced detection mechanisms. While some existing tests may cover basic functionality, new test cases are required to ensure comprehensive coverage of the updated code paths, especially for edge cases like policy conflicts, malformed inputs, and attempts to bypass security mechanisms.

The XOR→AES-256-GCM upgrade in dmz.py requires the cryptography
package at runtime. Add it to dev dependencies so CI tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

🤖 AI Agent: breaking-change-detector

🔍 API Compatibility Report

Summary

The release introduces several new features, security enhancements, and configurable policies. However, it also includes breaking changes due to deprecations and modifications to existing APIs. These changes may impact downstream users relying on deprecated or altered functionality.

Findings

| Severity | Package | Change | Impact |
|---|---|---|---|
| 🔴 | agent-runtime | create_default_policies() deprecated with runtime warning | Existing code using this function will encounter warnings; migration to create_policies_from_config() is required. |
| 🔵 | agent-runtime | New create_policies_from_config() API | Adds functionality for loading policies from YAML configurations. |
| 🔵 | agent-runtime | New SQLPolicyConfig dataclass and load_sql_policy_config() function | Adds structured policy loading capabilities. |
| 🟡 | agent-runtime | Expanded SQL policy deny-list | May alter behavior for users relying on previous SQL policy configurations. |

Migration Guide

For create_default_policies() Deprecation:

  • Replace calls to create_default_policies() with create_policies_from_config() and provide explicit YAML configurations for security policies.
  • Refer to the provided sample configurations in examples/policies/ for guidance on creating YAML files.

For Expanded SQL Policy Deny-List:

  • Review the updated deny-list to ensure compatibility with your use case. The expanded list blocks additional SQL operations such as GRANT, REVOKE, CREATE USER, EXEC xp_cmdshell, UPDATE without WHERE, and MERGE INTO.
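
To gauge whether existing queries would be affected, a rough pre-check along the following lines may help. The patterns are illustrative assumptions rather than the shipped deny-list, and the regex approach is deliberately naive: it can misfire on string literals that contain these keywords or comment markers, so a real SQL parser would be more reliable.

```python
import re

# Illustrative patterns; the shipped deny-list lives in the policy configs.
DENY_PATTERNS = [
    re.compile(r"\bGRANT\b", re.IGNORECASE),
    re.compile(r"\bREVOKE\b", re.IGNORECASE),
    re.compile(r"\bCREATE\s+USER\b", re.IGNORECASE),
    re.compile(r"\bEXEC(?:UTE)?\s+xp_cmdshell\b", re.IGNORECASE),
    re.compile(r"\bMERGE\b", re.IGNORECASE),
]
UPDATE_RE = re.compile(r"^\s*UPDATE\b", re.IGNORECASE)
WHERE_RE = re.compile(r"\bWHERE\b", re.IGNORECASE)
# Block comments and line comments, which are often used for obfuscation.
COMMENT_RE = re.compile(r"/\*.*?\*/|--[^\n]*", re.DOTALL)


def is_denied(query: str) -> bool:
    """Return True if the query matches the illustrative deny-list."""
    cleaned = COMMENT_RE.sub(" ", query)  # strip comments before matching
    if any(pattern.search(cleaned) for pattern in DENY_PATTERNS):
        return True
    # UPDATE with no WHERE clause left after comment removal.
    return bool(UPDATE_RE.search(cleaned)) and not WHERE_RE.search(cleaned)
```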

Conclusion

The release introduces valuable new features and security improvements but includes breaking changes that require migration for downstream users. Ensure proper documentation and communication of these changes to users.

@github-actions

🤖 AI Agent: docs-sync-checker

📝 Documentation Sync Report

Issues Found

  1. create_policies_from_config() in packages/{name}/src/ — missing docstring
    • This is a new public API introduced in this release, but it lacks a docstring explaining its purpose, parameters, return values, and exceptions.
  2. ⚠️ packages/{name}/README.md — The README does not mention the new create_policies_from_config() API or the deprecation of create_default_policies(). Additionally, the new YAML-based configuration system and sample policy files in examples/policies/ are not documented in the README.
  3. ⚠️ CHANGELOG.md — While the changelog includes entries for most of the changes, it does not explicitly mention the deprecation of create_default_policies() or the runtime warning associated with it.
  4. ⚠️ examples/ — The example code in examples/ does not include any usage examples for the new create_policies_from_config() API. It would be helpful to include a Python script demonstrating how to use this new API with the sample YAML configurations.
  5. SQLPolicyConfig dataclass in packages/{name}/src/ — missing docstring
    • This new public dataclass is introduced but lacks a docstring explaining its purpose, attributes, and usage.
  6. load_sql_policy_config() in packages/{name}/src/ — missing docstring
    • This new public function is introduced but lacks a docstring explaining its purpose, parameters, return values, and exceptions.

Suggestions

  • 💡 Add a detailed docstring for create_policies_from_config() explaining its purpose, parameters, return values, and any exceptions it may raise.
  • 💡 Add a docstring for SQLPolicyConfig dataclass explaining its attributes and purpose.
  • 💡 Add a docstring for load_sql_policy_config() explaining its purpose, parameters, return values, and any exceptions it may raise.
  • 💡 Update the README to include:
    • Documentation for the new create_policies_from_config() API.
    • A note about the deprecation of create_default_policies() and the associated runtime warning.
    • A section explaining the new YAML-based configuration system and how to use the sample policy files in examples/policies/.
  • 💡 Add a changelog entry under the "Deprecated" section for create_default_policies() and mention the runtime warning.
  • 💡 Add a new example script in the examples/ directory demonstrating how to use the create_policies_from_config() API with the provided sample YAML configurations.
  • 💡 Ensure all new public APIs, including create_policies_from_config(), SQLPolicyConfig, and load_sql_policy_config(), have complete type annotations.

Additional Notes

  • The new YAML-based configuration files in examples/policies/ are well-documented with comments and disclaimers, which is great. However, the README should explicitly mention their purpose and how to use them with the new API.
  • The changelog is mostly comprehensive but could benefit from explicitly listing the deprecation of create_default_policies() and the runtime warning.

Action Items

  1. Add missing docstrings and type hints for the new public APIs (create_policies_from_config(), SQLPolicyConfig, and load_sql_policy_config()).
  2. Update the README to reflect the new features and deprecations.
  3. Add a changelog entry for the deprecation of create_default_policies().
  4. Add an example script in examples/ to demonstrate the usage of create_policies_from_config() with the new YAML configurations.


@github-actions

🤖 AI Agent: test-generator

🧪 Test Coverage Analysis

packages/agent-os/src/agent_os/cli/__init__.py

  • Existing coverage: Basic CLI initialization and argument parsing are likely covered if there are tests for CLI commands in the tests/agent_os/cli/ directory.
  • Missing coverage:
    • New functionality related to configurable security policies (e.g., create_policies_from_config()).
    • Deprecation of create_default_policies() and its runtime warning.
  • 💡 Suggested test cases:
    1. test_create_policies_from_config_valid_yaml — Test loading a valid YAML configuration file and ensure the policies are correctly initialized.
    2. test_create_policies_from_config_invalid_yaml — Test loading an invalid YAML file and verify appropriate error handling.
    3. test_create_default_policies_deprecation_warning — Ensure that calling create_default_policies() emits a deprecation warning.

packages/agent-os/src/agent_os/integrations/conversation_guardian.py

  • Existing coverage: Likely covers basic functionality of the conversation guardian module.
  • Missing coverage:
    • New thresholds and patterns for detecting escalating rhetoric, offensive intent, and feedback loops.
    • Edge cases for max_retry_cycles, max_conversation_turns, and loop_window_seconds.
  • 💡 Suggested test cases:
    1. test_escalation_score_threshold — Test behavior when escalation score is just below, at, and above the threshold.
    2. test_offensive_score_threshold — Test behavior when offensive score is just below, at, and above the threshold.
    3. test_max_retry_cycles_exceeded — Simulate a scenario where the maximum retry cycles are exceeded and verify the expected behavior.
    4. test_feedback_loop_detection — Test detection of feedback loops within the specified loop_window_seconds.

packages/agent-os/src/agent_os/mcp_security.py

  • Existing coverage: Likely covers basic functionality for MCP security checks.
  • Missing coverage:
    • Detection of new patterns such as invisible_unicode, hidden_comments, hidden_instructions, and encoded_payloads.
    • Edge cases for suspicious_decoded_keywords.
  • 💡 Suggested test cases:
    1. test_invisible_unicode_detection — Test detection of invisible Unicode characters in input.
    2. test_hidden_comments_detection — Test detection of hidden comments in input.
    3. test_encoded_payload_detection — Test detection of Base64-encoded or hex-encoded payloads.
    4. test_suspicious_decoded_keywords — Test behavior when decoded input contains suspicious keywords like "password" or "exec".
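
The detections these tests target might be sketched as follows. The helper names, the Base64 length cutoff, and the keyword tuple are assumptions for illustration ("password" and "exec" are the examples named above); the module's actual logic in mcp_security.py may differ.

```python
import base64
import binascii
import re
import unicodedata

SUSPICIOUS_KEYWORDS = ("password", "exec")  # examples named in the report
# Assumed heuristic: runs of 16+ Base64 characters are worth decoding.
BASE64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def has_invisible_unicode(text: str) -> bool:
    """Flag format characters (category Cf), e.g. zero-width spaces."""
    return any(unicodedata.category(ch) == "Cf" for ch in text)


def suspicious_decoded_keywords(text: str) -> list:
    """Decode Base64-looking runs and report suspicious keywords inside."""
    hits = []
    for candidate in BASE64_RE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64 or not text; skip it
        lowered = decoded.lower()
        hits.extend(word for word in SUSPICIOUS_KEYWORDS if word in lowered)
    return hits
```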

packages/agent-os/src/agent_os/mute_agent.py

  • Existing coverage: Likely covers basic functionality for muting agents.
  • Missing coverage:
    • Edge cases for concurrency (e.g., race conditions when multiple agents are muted simultaneously).
    • Handling of invalid or malformed input.
  • 💡 Suggested test cases:
    1. test_concurrent_agent_muting — Simulate multiple agents being muted concurrently and verify no race conditions occur.
    2. test_invalid_agent_id — Test behavior when an invalid or malformed agent ID is provided for muting.

packages/agent-os/src/agent_os/prompt_injection.py

  • Existing coverage: Likely covers basic prompt injection detection.
  • Missing coverage:
    • New detection patterns for direct_override, delimiter, role_play, context_manipulation, multi_turn, and encoding.
    • Edge cases for suspicious_decoded_keywords and sensitivity_thresholds.
  • 💡 Suggested test cases:
    1. test_direct_override_detection — Test detection of direct override patterns in input.
    2. test_delimiter_detection — Test detection of delimiter patterns in input.
    3. test_role_play_detection — Test detection of role-play patterns in input.
    4. test_sensitivity_thresholds — Test behavior when input matches patterns with scores just below, at, and above the sensitivity thresholds.
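
A boundary test such as test_sensitivity_thresholds needs three scores per threshold: just below, exactly at, and just above. The sketch below shows one way to generate and check them. `classify`, `Verdict`, and the inclusive `>=` comparison are assumptions; whether the real detector blocks at the threshold or only above it is exactly what such a test should pin down.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"


def classify(score: float, threshold: float) -> Verdict:
    """Hypothetical decision rule: block when score meets or exceeds threshold."""
    return Verdict.BLOCK if score >= threshold else Verdict.ALLOW


def boundary_cases(threshold: float, step: float = 0.01) -> dict[str, float]:
    """Return the three scores a boundary test should exercise."""
    return {
        "below": round(threshold - step, 10),
        "at": threshold,
        "above": round(threshold + step, 10),
    }


if __name__ == "__main__":
    for label, score in boundary_cases(0.5).items():
        print(label, score, classify(score, 0.5).value)
```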

packages/agent-os/src/agent_os/sandbox.py

  • Existing coverage: Likely covers basic sandbox functionality.
  • Missing coverage:
    • New blocked modules and built-ins (e.g., subprocess, os, shutil, exec, eval).
    • Edge cases for attempts to bypass sandbox restrictions.
  • 💡 Suggested test cases:
    1. test_blocked_modules — Test that attempts to import blocked modules (e.g., subprocess, os) are correctly prevented.
    2. test_blocked_builtins — Test that attempts to use blocked built-ins (e.g., exec, eval) are correctly prevented.
    3. test_sandbox_bypass_attempt — Simulate an attempt to bypass sandbox restrictions (e.g., using importlib) and verify it is blocked.

packages/agent-os/src/agent_os/semantic_policy.py

  • Existing coverage: Likely covers basic semantic policy evaluation.
  • Missing coverage:
    • New signal patterns for destructive_data, data_exfiltration, and other categories.
    • Edge cases for conflicting policies or overlapping patterns.
  • 💡 Suggested test cases:
    1. test_destructive_data_detection — Test detection of destructive data patterns (e.g., DROP TABLE, DELETE FROM).
    2. test_data_exfiltration_detection — Test detection of data exfiltration patterns (e.g., SELECT INTO OUTFILE, pg_dump).
    3. test_conflicting_policies — Simulate conflicting policies (e.g., one policy allows an action while another denies it) and verify the resolution behavior.
    4. test_overlapping_patterns — Test behavior when input matches multiple overlapping patterns with different weights.
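
test_destructive_data_detection could also cover comment-obfuscated input. The sketch below uses the DROP pattern quoted elsewhere in this review and shows that stripping SQL comments before matching catches `DROP/**/TABLE`, which the bare pattern misses. `normalize_sql`, `is_destructive`, and the case-insensitive flag are assumptions, not the module's actual code.

```python
import re

# Pattern quoted in this review; IGNORECASE is an assumption.
DROP_PATTERN = re.compile(
    r"\bDROP\s+(TABLE|DATABASE|INDEX|VIEW|SCHEMA)\b", re.IGNORECASE
)

BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"--[^\n]*")


def normalize_sql(sql: str) -> str:
    """Replace SQL comments with a space and collapse whitespace runs."""
    sql = BLOCK_COMMENT.sub(" ", sql)
    sql = LINE_COMMENT.sub(" ", sql)
    return re.sub(r"\s+", " ", sql).strip()


def is_destructive(sql: str) -> bool:
    """Match the DROP pattern against the comment-stripped statement."""
    return DROP_PATTERN.search(normalize_sql(sql)) is not None
```

The comment stripping here is deliberately naive: it does not understand string literals, so a `--` inside a quoted value would also be removed. A tokenizer or AST-based check would be needed for that.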

Summary

The release introduces significant changes, particularly around configurable security policies and expanded detection patterns. While some existing tests may cover basic functionality, new test cases are required to ensure comprehensive coverage of the new features and edge cases.

@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Mar 18, 2026

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces significant updates to the microsoft/agent-governance-toolkit repository, including configurable security policies, expanded SQL deny-lists, cryptographic improvements, and ESRP compliance. While these changes enhance the library's security posture and flexibility, they also introduce potential risks and breaking changes that require careful review.


🔴 CRITICAL Issues

  1. Sandbox Escape via Dynamic Imports

    • File: examples/policies/sandbox-safety.yaml
    • Issue: The sandbox configuration blocks importlib and __import__, but does not account for indirect dynamic imports via types.ModuleType or sys.modules. Attackers could bypass restrictions using these mechanisms.
    • Recommendation: Extend the sandbox policy to detect and block indirect dynamic imports. Consider implementing runtime checks for sys.modules manipulation.
  2. AES-256-GCM Implementation Validation

    • File: Not explicitly shown in the diff, but mentioned in the description.
    • Issue: The replacement of XOR encryption with AES-256-GCM is a critical improvement. However, the implementation details are not provided in the diff. Incorrect usage of AES-GCM (e.g., reusing nonces) can lead to catastrophic security failures.
    • Recommendation: Verify that the AES-GCM implementation uses unique nonces for every encryption operation and securely handles key management.
  3. Policy Engine False Negatives

    • File: examples/policies/semantic-policy.yaml, examples/policies/prompt-injection-safety.yaml
    • Issue: The regex patterns for detecting malicious behavior (e.g., SQL injection, prompt injection) may produce false negatives due to overly specific matching criteria. For example, patterns like '\bDROP\s+(TABLE|DATABASE|INDEX|VIEW|SCHEMA)\b' may miss obfuscated or unconventional SQL syntax.
    • Recommendation: Enhance regex patterns to account for obfuscation techniques (e.g., whitespace variations, comments, concatenation). Consider integrating semantic analysis or AST-based validation for more robust detection.
  4. Thread Safety in Concurrent Agent Execution

    • File: Not explicitly shown in the diff, but relevant to the policy engine changes.
    • Issue: The introduction of YAML-based configurable policies raises concerns about thread safety during concurrent agent execution. If policies are dynamically loaded or modified at runtime, race conditions could occur.
    • Recommendation: Ensure that policy loading and execution are thread-safe. Use locks or immutable data structures to prevent concurrent modification.
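
For the thread-safety item, one common way to combine "locks" and "immutable data structures" is to publish each loaded policy set as a frozen snapshot and swap the reference under a lock. The following is a sketch under stated assumptions: `PolicySnapshot` and `PolicyStore` are hypothetical names, and nothing here reflects how the toolkit actually stores policies.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class PolicySnapshot:
    """Immutable view of a loaded policy set (hypothetical type)."""
    version: int
    rules: Mapping[str, tuple[str, ...]]


class PolicyStore:
    """Readers get a consistent snapshot; writers swap in a new one under a lock."""

    def __init__(self, rules: dict[str, list[str]]) -> None:
        self._lock = threading.Lock()
        self._snapshot = self._freeze(0, rules)

    @staticmethod
    def _freeze(version: int, rules: dict[str, list[str]]) -> PolicySnapshot:
        # Copy into tuples behind a read-only mapping so callers cannot mutate.
        frozen = {name: tuple(patterns) for name, patterns in rules.items()}
        return PolicySnapshot(version, MappingProxyType(frozen))

    def current(self) -> PolicySnapshot:
        # Returns whichever snapshot is published at the time of the call.
        return self._snapshot

    def reload(self, rules: dict[str, list[str]]) -> PolicySnapshot:
        with self._lock:  # Serialize writers so versions never collide.
            self._snapshot = self._freeze(self._snapshot.version + 1, rules)
            return self._snapshot
```

An agent that calls `current()` once per request evaluates against a single consistent rule set even if `reload()` runs halfway through.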

🟡 WARNING: Potential Breaking Changes

  1. Deprecation of create_default_policies()

    • File: Not explicitly shown in the diff, but mentioned in the description.
    • Issue: Deprecating create_default_policies() adds a runtime warning. Existing callers keep working, but workflows that treat warnings as errors (for example, test suites run with -W error) may fail.
    • Recommendation: Provide a migration guide in the release notes, including examples of how to transition to create_policies_from_config().
  2. Expanded SQL Deny-List

    • File: CHANGELOG.md
    • Issue: Blocking additional SQL commands (e.g., GRANT, REVOKE, CREATE USER) may cause existing applications to fail if they rely on these operations.
    • Recommendation: Clearly document the changes in the release notes and provide guidance on how users can customize the deny-list to suit their needs.
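
A migration guide for the create_default_policies() deprecation could show the shape of the shim. The sketch below is illustrative only: the function bodies, the return type, and the `defaults.yaml` path are assumptions, and the real create_policies_from_config() signature may differ.

```python
import warnings


def create_policies_from_config(path: str) -> list[str]:
    """Stand-in for the new API; the real signature may differ."""
    return [f"policy loaded from {path}"]


def create_default_policies() -> list[str]:
    """Deprecated entry point that warns, then delegates to the new API."""
    warnings.warn(
        "create_default_policies() is deprecated; "
        "use create_policies_from_config() instead",
        DeprecationWarning,
        stacklevel=2,  # Attribute the warning to the caller's line.
    )
    # Hypothetical bundled default path, for illustration only.
    return create_policies_from_config("defaults.yaml")
```

Callers that cannot migrate immediately can filter this one warning with `warnings.filterwarnings("ignore", category=DeprecationWarning)` scoped to the call site, rather than disabling warnings-as-errors globally.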

💡 Suggestions for Improvement

  1. Policy Validation Framework

    • File: examples/policies/
    • Suggestion: Implement a validation framework to ensure that YAML policy files conform to expected schemas and do not contain syntax errors or invalid configurations. Use Pydantic models for schema validation.
  2. OWASP Agentic Top 10 Compliance

    • File: examples/policies/semantic-policy.yaml, examples/policies/conversation-guardian.yaml
    • Suggestion: Expand the policy configurations to address OWASP Agentic Top 10 risks comprehensively. For example, include rules for detecting unauthorized data aggregation (ASI-6) and feedback loop exploitation (ASI-10).
  3. Backward Compatibility Layer

    • File: Not explicitly shown in the diff.
    • Suggestion: Provide a backward compatibility layer for deprecated APIs like create_default_policies() to minimize disruption for existing users.
  4. Documentation Enhancements

    • File: SECURITY.md, README.md
    • Suggestion: Include detailed examples of how to use the new create_policies_from_config() API and YAML configurations. Highlight best practices for customizing policies.
  5. Test Coverage for Configurable Policies

    • File: Not explicitly shown in the diff.
    • Suggestion: Add comprehensive test cases for the new YAML-based policy configurations to ensure correctness and prevent regressions.

Final Assessment

  • Security: 🔴 Critical issues identified, including sandbox escape vectors and potential AES-GCM misuse.
  • Backward Compatibility: 🟡 Potential breaking changes due to API deprecation and expanded deny-lists.
  • Improvements: 💡 Suggestions provided for enhancing policy robustness, documentation, and test coverage.

Action Required: Address critical issues before merging. Ensure thorough testing and documentation updates to mitigate risks and ease migration for existing users.

@github-actions

🤖 AI Agent: security-scanner

Security Review of PR: release: v2.2.0 — ESRP compliance, configurable security policies, hardening


1. Prompt Injection Defense Bypass

Severity: 🔴 CRITICAL

Issue:
The prompt-injection-safety.yaml configuration introduces detection patterns for prompt injection attacks. However, the patterns are primarily regex-based and may not comprehensively cover all possible variations of prompt injection attacks. For instance, adversaries can use obfuscation techniques (e.g., Unicode homoglyphs, encoding, or splitting keywords) to bypass these regex patterns. Additionally, the suspicious_decoded_keywords list does not account for all possible variations of malicious instructions.

Attack Vector:
An attacker could craft a prompt that bypasses the regex patterns by using slight variations in text (e.g., "ign0re all prev1ous instructi0ns" or "ignоre all previоus instructions" using Cyrillic 'о'). This could lead to the agent executing unintended or malicious instructions.

Recommendation:

  • Implement a more robust semantic analysis engine that uses natural language processing (NLP) to detect intent rather than relying solely on regex patterns.
  • Incorporate a mechanism to detect obfuscated or encoded payloads (e.g., base64, hex, or Unicode homoglyphs) and decode them for further analysis.
  • Regularly update the detection patterns and keywords based on emerging prompt injection techniques.
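
Short of a full NLP engine, the two bypass strings in the attack vector above can be caught by folding look-alike characters before the regex runs. This is a minimal sketch: the `CONFUSABLES` table is a tiny hand-picked sample, and `OVERRIDE` is a hypothetical pattern, not one taken from prompt-injection-safety.yaml.

```python
import re
import unicodedata

# Tiny illustrative confusables map (Cyrillic and leetspeak to Latin).
# A real deployment would need a far larger table.
CONFUSABLES = str.maketrans({
    "\u043e": "o",  # Cyrillic small o
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "0": "o",
    "1": "i",
    "3": "e",
})

# Hypothetical pattern, not taken from prompt-injection-safety.yaml.
OVERRIDE = re.compile(r"\bignore\s+all\s+previous\s+instructions\b")


def fold(text: str) -> str:
    """NFKC-normalize, casefold, then map known look-alikes to ASCII."""
    return unicodedata.normalize("NFKC", text).casefold().translate(CONFUSABLES)


def looks_like_override(text: str) -> bool:
    """Run the override pattern against the folded text."""
    return OVERRIDE.search(fold(text)) is not None
```

NFKC alone does not map Cyrillic letters to Latin ones, which is why the explicit table is needed. The folded text should be used for matching only, since mapping digits to letters corrupts legitimate numbers.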

2. Policy Engine Circumvention

Severity: 🟠 HIGH

Issue:
The new create_policies_from_config() API allows security policies to be externalized to YAML files. While this improves flexibility, it introduces the risk of policy circumvention if the YAML files are tampered with or improperly validated. For example, an attacker with access to the configuration files could weaken or disable critical security rules.

Attack Vector:
If an attacker gains access to the YAML configuration files, they could modify or remove critical rules (e.g., disabling the prompt-injection-safety module or reducing sensitivity thresholds). This could lead to a complete bypass of the security layer.

Recommendation:

  • Implement strong integrity checks for the YAML configuration files, such as digital signatures or checksums, to detect unauthorized modifications.
  • Enforce strict validation of the YAML files during loading to ensure all required rules and thresholds are present and meet minimum security standards.
  • Consider restricting access to the configuration files using file system permissions or other access control mechanisms.
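
The integrity-check recommendation can be sketched with a keyed hash over the raw file bytes. The example assumes a secret key is available to the loader and that the expected tag is stored somewhere an attacker with write access to the YAML file cannot also modify; `sign_config` and `verify_config` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_config(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the raw config bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_config(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Constant-time comparison of the computed tag against the expected one."""
    return hmac.compare_digest(sign_config(data, key), expected_tag)
```

A plain checksum detects accidental corruption but not tampering by someone who can rewrite both file and checksum. An HMAC raises the bar to key compromise, and an asymmetric signature removes the need to ship the signing key to the loader at all.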

3. Trust Chain Weaknesses

Severity: 🔵 LOW

Issue:
The PR does not explicitly address the use of SPIFFE/SVID for trust chain validation or certificate pinning. While this may not be directly relevant to the changes introduced in this PR, it is worth noting that the absence of such mechanisms could expose the system to trust-related vulnerabilities.

Attack Vector:
Without SPIFFE/SVID validation or certificate pinning, an attacker could potentially impersonate a trusted entity in the system, leading to unauthorized access or data exfiltration.

Recommendation:

  • Ensure that all communication between components in the system is secured using mutual TLS with SPIFFE/SVID for identity verification.
  • Implement certificate pinning to prevent man-in-the-middle attacks.

4. Credential Exposure

Severity: 🟡 MEDIUM

Issue:
The cli-security-rules.yaml and pii-detection.yaml configurations include patterns for detecting hardcoded secrets (e.g., API keys, passwords, private keys). However, there is no indication that the toolkit itself prevents logging of sensitive data during runtime.

Attack Vector:
If the toolkit logs sensitive data (e.g., API keys or PII) during execution, this could lead to accidental exposure of sensitive information in logs, which could be exploited by attackers.

Recommendation:

  • Implement a mechanism to sanitize sensitive data before logging.
  • Provide clear documentation and examples to users on how to configure logging to avoid exposing sensitive information.
  • Consider adding runtime checks to detect and prevent logging of sensitive data.
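
Sanitizing before logging can be done with a standard-library `logging.Filter`. The sketch below masks secret-looking key/value pairs. The single regex is illustrative only, and `RedactingFilter` is a hypothetical name rather than anything the toolkit ships.

```python
import logging
import re

# Illustrative patterns only; real rules would come from the policy config.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|password|token)\s*[=:]\s*\S+"),
]


class RedactingFilter(logging.Filter):
    """Mask secret-looking key/value pairs before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # Apply %-style args first.
        for pattern in SECRET_PATTERNS:
            message = pattern.sub(lambda m: f"{m.group(1)}=[REDACTED]", message)
        # Store the sanitized text and drop the args so they are not re-applied.
        record.msg, record.args = message, ()
        return True  # Keep the record, now sanitized.
```

Attaching the filter to handlers rather than to a single logger is usually the safer choice, because logger-level filters are not applied to records that propagate up from child loggers.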

5. Sandbox Escape

Severity: 🔴 CRITICAL

Issue:
The sandbox-safety.yaml configuration defines a list of blocked Python modules and builtins to prevent sandbox escapes. However, the list is not exhaustive and does not account for all potential escape vectors. For example, the os module is blocked, but os.path is not explicitly mentioned. Similarly, other modules like sys (which can be used to manipulate the Python runtime) are not included.

Attack Vector:
An attacker could exploit unblocked modules or builtins to escape the sandbox and execute arbitrary code on the host system. For example, they could use os.path to access the file system or sys.modules to reload blocked modules.

Recommendation:

  • Perform a comprehensive review of all Python modules and builtins that could be used for sandbox escapes and update the sandbox-safety.yaml configuration accordingly.
  • Consider implementing a whitelist-based approach (only allow explicitly safe modules and builtins) instead of a blacklist-based approach.
  • Regularly update the list of blocked modules and builtins based on new security research and emerging threats.
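
The whitelist-based recommendation can be illustrated with a static import check. This is a sketch, assuming untrusted code arrives as source text; `ALLOWED_MODULES` and `disallowed_imports` are hypothetical, and the real sandbox may enforce imports differently.

```python
import ast

# Hypothetical allowlist; the real one would live in sandbox-safety.yaml.
ALLOWED_MODULES = frozenset({"math", "json", "re"})


def disallowed_imports(source: str) -> set[str]:
    """Return top-level module names imported by source that are not allowed."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found - ALLOWED_MODULES
```

Anything not explicitly listed is rejected, so a module nobody thought to block (`posixpath`, for instance) fails by default. A static check like this does not see `__import__("os")` or `importlib` calls, so it complements runtime restrictions rather than replacing them.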

6. Deserialization Attacks

Severity: 🟠 HIGH

Issue:
The create_policies_from_config() API relies on YAML files for configuration. YAML deserialization can lead to code execution if not handled properly, in particular when PyYAML's yaml.load() is called with a loader that constructs arbitrary Python objects, such as yaml.Loader or yaml.UnsafeLoader.

Attack Vector:
An attacker could craft a malicious YAML file containing arbitrary Python objects, which could be executed during deserialization, leading to remote code execution.

Recommendation:

  • Use yaml.safe_load() instead of yaml.load() to prevent deserialization of arbitrary objects.
  • Validate the structure and content of the YAML files against a predefined schema before processing them.
  • Consider using a safer configuration format, such as JSON, which has fewer deserialization risks.
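
Schema validation can run on the plain Python value that yaml.safe_load() returns. The sketch below assumes parsing has already happened and checks only structure. The three required fields are invented for illustration and do not reflect the actual policy file layout.

```python
from typing import Any

# Hypothetical schema: required key -> expected type.
REQUIRED_FIELDS = {"name": str, "enabled": bool, "patterns": list}


def validate_policy(doc: Any) -> list[str]:
    """Return a list of problems; an empty list means the document passed."""
    if not isinstance(doc, dict):
        return ["top-level value must be a mapping"]
    errors = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in doc:
            errors.append(f"missing required field: {key}")
        elif not isinstance(doc[key], expected):
            errors.append(f"{key} must be of type {expected.__name__}")
    # Reject unknown keys so typos do not silently disable a rule.
    unknown = sorted(set(doc) - set(REQUIRED_FIELDS), key=str)
    errors.extend(f"unknown field: {key}" for key in unknown)
    return errors
```

Rejecting unknown fields is a design choice: a misspelled key such as `enabeld` would otherwise be ignored and the rule would fall back to whatever the default is.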

7. Race Conditions

Severity: 🔵 LOW

Issue:
The PR does not explicitly address potential race conditions in policy checks or trust evaluations. While this may not be directly relevant to the changes introduced in this PR, it is worth considering the potential for time-of-check-to-time-of-use (TOCTOU) vulnerabilities.

Attack Vector:
An attacker could exploit a race condition to bypass security policies by modifying input or configuration files between the time they are checked and the time they are used.

Recommendation:

  • Implement atomic operations for policy checks and trust evaluations to prevent race conditions.
  • Use file locks or other synchronization mechanisms to prevent concurrent modifications to configuration files.
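
For configuration files, two habits narrow the TOCTOU window: writers replace the file atomically, and readers load the bytes once and then verify and parse that same buffer. A minimal standard-library sketch follows; `atomic_write` and `read_once` are hypothetical helper names.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # Push bytes to disk before the rename.
        os.replace(tmp_path, path)  # Atomic rename on POSIX file systems.
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_once(path: str) -> bytes:
    """Read the file a single time; verify and parse these same bytes."""
    with open(path, "rb") as handle:
        return handle.read()
```

Because the temp file lives in the target's directory, the rename stays on one file system, so a reader sees either the old file or the new one and never a partial write.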

8. Supply Chain

Severity: 🟠 HIGH

Issue:
The PR introduces new dependencies for YAML configuration parsing and ESRP publishing. However, there is no indication that these dependencies have been audited for security vulnerabilities or that measures have been taken to mitigate supply chain risks (e.g., dependency confusion or typosquatting).

Attack Vector:
An attacker could exploit a vulnerable or malicious dependency to compromise the toolkit or its users. For example, a dependency could contain malicious code that exfiltrates sensitive data or introduces backdoors.

Recommendation:

  • Use a dependency scanning tool (e.g., Dependabot, Snyk, or OWASP Dependency-Check) to identify and address known vulnerabilities in dependencies.
  • Pin dependencies to specific versions to prevent unintentional updates to potentially malicious versions.
  • Verify the integrity of dependencies using checksums or signatures.
  • Consider using tools like pip-audit or npm audit to regularly audit dependencies for vulnerabilities.

Summary of Findings

| Finding | Severity | Recommendation |
|---|---|---|
| Prompt injection defense bypass | 🔴 CRITICAL | Use NLP-based semantic analysis and improve regex patterns to detect obfuscation. |
| Policy engine circumvention | 🟠 HIGH | Add integrity checks, validation, and access controls for YAML configuration files. |
| Trust chain weaknesses | 🔵 LOW | Ensure SPIFFE/SVID validation and certificate pinning are implemented. |
| Credential exposure | 🟡 MEDIUM | Sanitize sensitive data before logging and provide guidance on secure logging. |
| Sandbox escape | 🔴 CRITICAL | Use a whitelist-based approach for allowed modules and builtins. |
| Deserialization attacks | 🟠 HIGH | Use yaml.safe_load() and validate YAML files against a schema. |
| Race conditions | 🔵 LOW | Implement atomic operations and file locks for policy checks and config updates. |
| Supply chain | 🟠 HIGH | Audit dependencies, pin versions, and verify integrity using checksums/signatures. |

Final Recommendation

This PR introduces significant improvements to the toolkit's security posture, but it also introduces critical risks, particularly around prompt injection defenses, sandbox escapes, and deserialization. These issues must be addressed before merging to ensure the toolkit remains a robust security layer for downstream users.

@imran-siddique imran-siddique merged commit 3b88941 into microsoft:main Mar 18, 2026
53 checks passed

Labels

dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation size/XL Extra large PR (500+ lines) tests
