release: v2.2.0 — ESRP compliance, configurable security policies, hardening#300
Fix incomplete deny-list in no_destructive_sql policy that allowed destructive SQL operations to bypass the policy engine (MSRC report). Previously only DROP, TRUNCATE, DELETE (no WHERE), and ALTER TABLE were blocked.

Now blocks in both AST (sqlglot) and fallback (regex) paths:
- GRANT / REVOKE (privilege escalation)
- CREATE/ALTER/DROP USER/ROLE/LOGIN (account manipulation)
- UPDATE without WHERE (mass data modification)
- EXEC/EXECUTE xp_cmdshell, sp_configure (OS command execution)
- MERGE INTO (combined insert/update/delete)
- LOAD DATA, INTO OUTFILE/DUMPFILE (file operations)

Also fixes pre-existing test issues (ExecutionRequest constructor) and adds 21 new test cases covering all reported bypass vectors. 35 tests passing, 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
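As a rough illustration of the fallback (regex) path described in the commit above, the sketch below checks a statement against a small deny-list. The pattern set, the labels, and the `find_violation` helper are assumptions made for this example and are not taken from the toolkit's source. Keyword matching of this kind can also flag harmless identifiers (a column named `grant`, for instance), which is presumably one reason the commit keeps an AST path alongside it.

```python
import re

# Hypothetical deny patterns for illustration; the real no_destructive_sql
# rules may be structured and named differently.
DENY_PATTERNS = [
    ("GRANT/REVOKE", r"\b(GRANT|REVOKE)\b"),
    ("account manipulation", r"\b(CREATE|ALTER|DROP)\s+(USER|ROLE|LOGIN)\b"),
    ("OS command execution", r"\bEXEC(UTE)?\s+(xp_cmdshell|sp_configure)\b"),
    ("MERGE INTO", r"\bMERGE\s+INTO\b"),
    ("file operation", r"\bLOAD\s+DATA\b|\bINTO\s+(OUTFILE|DUMPFILE)\b"),
]
COMPILED = [(label, re.compile(p, re.IGNORECASE)) for label, p in DENY_PATTERNS]

_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)
_UPDATE_OR_DELETE = re.compile(r"^\s*(UPDATE|DELETE)\b", re.IGNORECASE)


def find_violation(sql):
    """Return a label for the first deny rule that matches, or None."""
    for label, pattern in COMPILED:
        if pattern.search(sql):
            return label
    # UPDATE and DELETE are only rejected when no WHERE clause is present.
    match = _UPDATE_OR_DELETE.match(sql)
    if match and not _WHERE.search(sql):
        return f"{match.group(1).upper()} without WHERE"
    return None
```

Calling `find_violation("UPDATE users SET active = 0")` reports the missing `WHERE`, while the same statement with a `WHERE id = 2` clause passes.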
Refactor the policy engine so SQL deny-rules are driven by external YAML configuration rather than hardcoded defaults. This shifts responsibility to users to review and customize policies for their specific environment.

Changes:
- Add SQLPolicyConfig dataclass for structured policy configuration
- Add load_sql_policy_config() to load from YAML files
- Add create_policies_from_config() as the recommended entry point
- Deprecate create_default_policies() with runtime warning directing users to explicit config files
- _fallback_sql_check() now accepts optional SQLPolicyConfig and builds regex patterns dynamically from config

Sample policy configs (examples/policies/):
- sql-safety.yaml: balanced default (blocks DROP/GRANT/etc.)
- sql-strict.yaml: high-security (SELECT-only)
- sql-readonly.yaml: read-only agents

All configs include prominent disclaimers that they are SAMPLES and must be reviewed before production use. 40 tests passing (5 new config tests), 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
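To make the "patterns built dynamically from config" idea in the commit above concrete, here is a minimal sketch. `SQLPolicyConfigSketch`, its two field names, and the helper functions are hypothetical stand-ins, not the toolkit's actual `SQLPolicyConfig`. The `parsed_yaml` dict stands in for whatever a YAML loader would return, so the example stays dependency-free.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SQLPolicyConfigSketch:
    """Hypothetical stand-in for SQLPolicyConfig; field names are assumptions."""
    blocked_statements: tuple = ()
    require_where_for: tuple = ()

    @classmethod
    def from_mapping(cls, data):
        # Reject unknown keys so a typo in the config fails loudly.
        unknown = set(data) - {"blocked_statements", "require_where_for"}
        if unknown:
            raise ValueError(f"unknown policy keys: {sorted(unknown)}")
        return cls(
            blocked_statements=tuple(data.get("blocked_statements", ())),
            require_where_for=tuple(data.get("require_where_for", ())),
        )


def build_patterns(config):
    """Compile one case-insensitive regex per blocked statement."""
    patterns = []
    for statement in config.blocked_statements:
        words = [re.escape(word) for word in statement.split()]
        patterns.append(
            re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
        )
    return patterns


def is_blocked(sql, config):
    """Return True if sql matches a blocked statement or lacks a required WHERE."""
    if any(pattern.search(sql) for pattern in build_patterns(config)):
        return True
    for verb in config.require_where_for:
        starts_with_verb = re.match(rf"\s*{re.escape(verb)}\b", sql, re.IGNORECASE)
        if starts_with_verb and not re.search(r"\bWHERE\b", sql, re.IGNORECASE):
            return True
    return False


# Stand-in for the result of parsing a policy YAML file.
parsed_yaml = {
    "blocked_statements": ["DROP TABLE", "GRANT", "MERGE INTO"],
    "require_where_for": ["UPDATE", "DELETE"],
}
config = SQLPolicyConfigSketch.from_mapping(parsed_yaml)
```

Because the patterns come from the config object, switching from a balanced profile to a stricter one would only require a different mapping, not a code change.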
Make all security detection patterns, deny-lists, and thresholds configurable via external YAML files. Built-in defaults are retained but emit deprecation warnings directing users to explicit config files.

New sample configs in examples/policies/:
- sandbox-safety.yaml
- prompt-injection-safety.yaml
- mcp-security.yaml
- semantic-policy.yaml
- pii-detection.yaml
- conversation-guardian.yaml
- cli-security-rules.yaml

All configs include prominent disclaimers that they are SAMPLES and must be reviewed before production use.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 AI Agent: docs-sync-checker
📝 Documentation Sync Report
Issues Found
Suggestions
Additional Notes
Please address the above issues to ensure documentation and examples are in sync with the new changes. Let me know if you need further assistance!
🤖 AI Agent: breaking-change-detector
🔍 API Compatibility Report
Summary
The release introduces several new features and configurations, including externalized security policies and ESRP compliance. While these changes are largely additive, there are some potentially breaking changes due to deprecations and modifications to existing APIs. No outright breaking changes were identified, but there are changes that could impact downstream users depending on their usage patterns.
Findings
Migration Guide
Conclusion
✅ No breaking changes detected. However, the deprecation of
🤖 AI Agent: code-reviewer
Pull Request Review: release: v2.2.0 — ESRP compliance, configurable security policies, hardening
Summary
This pull request introduces significant updates to the microsoft/agent-governance-toolkit repository, including configurable security policies, ESRP compliance, and cryptographic hardening. While the changes are generally positive, there are several areas that require attention to ensure correctness, security, and backward compatibility.
🔴 CRITICAL: Security Issues
- Regex Patterns for Sensitive Data Detection
  - Issue: Some regex patterns for detecting sensitive data (e.g., API keys, passwords, private keys) in the sample configurations (`cli-security-rules.yaml`, `pii-detection.yaml`) are overly simplistic and may result in false negatives. For example:
    - The regex for detecting hardcoded passwords (`'(password|passwd|pwd)\s*[=:]\s*["\u0027][^"\u0027]+["\u0027]'`) does not account for cases where passwords are stored in variables without quotes or are concatenated dynamically.
    - The regex for detecting AWS Access Key IDs (`'AKIA[0-9A-Z]{16}'`) does not account for other formats of access keys or potential obfuscation techniques.
  - Recommendation: Use more robust regex patterns or integrate specialized libraries like truffleHog for secret scanning.
- Sandbox Escape Vectors
  - Issue: The `sandbox-safety.yaml` configuration blocks dangerous modules and builtins, but it does not account for indirect imports or alternative methods of code execution. For example:
    - `importlib` is blocked, but `__import__` can still be used to dynamically load modules.
    - Blocking `exec`, `eval`, and `compile` is good, but other methods like `execfile()` (Python 2 compatibility) or `pickle.loads()` can also execute arbitrary code.
  - Recommendation: Consider using a more comprehensive sandboxing solution, such as PyPy sandboxing or RestrictedPython, to enforce stricter controls.
- AES-256-GCM Implementation
  - Issue: The pull request mentions replacing XOR encryption with AES-256-GCM in the DMZ module. However, there is no code snippet provided for review. Improper implementation of AES encryption (e.g., reusing nonces or keys) can lead to vulnerabilities.
  - Recommendation: Ensure that:
    - Nonces are unique for every encryption operation.
    - Keys are securely generated and stored using a key management system (e.g., Azure Key Vault).
    - Cryptographic operations are audited for compliance with NIST standards.
- Thread Safety in Concurrent Agent Execution
  - Issue: The pull request mentions thread safety improvements but does not provide details. If agents share mutable state or resources, race conditions could lead to security vulnerabilities.
  - Recommendation: Verify that all shared resources (e.g., policy configurations, cryptographic keys) are properly synchronized using thread-safe mechanisms like locks or thread-local storage.
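The thread-safety recommendation above can be sketched as a holder that publishes read-only policy snapshots and swaps them under a lock. `PolicyStore` and the `version`/`writer` keys are hypothetical names chosen for this example; nothing here is taken from the toolkit, and the actual implementation may synchronize differently.

```python
import threading
from types import MappingProxyType


class PolicyStore:
    """Hypothetical holder that swaps whole policy snapshots under a lock."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        # MappingProxyType gives readers a read-only view of a private copy.
        self._snapshot = MappingProxyType(dict(initial))

    def get(self):
        with self._lock:
            return self._snapshot

    def replace(self, new_policies):
        # Build the new snapshot outside the lock, then swap the reference.
        snapshot = MappingProxyType(dict(new_policies))
        with self._lock:
            self._snapshot = snapshot


store = PolicyStore({"version": 0})


def reload_many(worker_id):
    """Simulate a worker that reloads policies repeatedly."""
    for i in range(100):
        store.replace({"version": i, "writer": worker_id})


threads = [threading.Thread(target=reload_many, args=(n,)) for n in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final = store.get()
```

Readers never see a half-updated policy set because each reload replaces the whole snapshot, and any attempt to mutate a snapshot in place raises `TypeError`.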
🟡 WARNING: Potential Breaking Changes
- Deprecation of `create_default_policies()`
  - Issue: Deprecating the `create_default_policies()` API introduces a runtime warning, which may disrupt existing workflows that rely on this function.
  - Recommendation: Provide a clear migration path in the documentation, including examples of how to transition to `create_policies_from_config()`.
- Policy Configuration Externalization
  - Issue: Externalizing security policies to YAML files changes the default behavior of the library. Existing users may experience unexpected behavior if they do not update their configurations.
  - Recommendation: Add backward compatibility by allowing users to opt-in to the new configuration system while maintaining support for hardcoded defaults.
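A deprecation of this kind is usually implemented as a shim that keeps the old function working while emitting a `DeprecationWarning`. The sketch below shows that pattern with stub bodies; the signatures and return values are assumptions for illustration, since the real functions are not shown in this conversation. Only the function names and the sample config path come from the commit messages.

```python
import warnings


def create_policies_from_config(path):
    """Hypothetical stand-in: the real function loads rules from a YAML file."""
    return {"source": path}


def create_default_policies():
    """Deprecated entry point kept so existing callers continue to work."""
    warnings.warn(
        "create_default_policies() is deprecated; call "
        "create_policies_from_config() with an explicit config file instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return {"source": "built-in defaults"}


# Legacy call path: still works, but the warning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = create_default_policies()

# Migrated call path: an explicit, reviewed config file.
migrated = create_policies_from_config("examples/policies/sql-safety.yaml")
```

Since `DeprecationWarning` is hidden by default outside of `__main__` and test runners, downstream projects may want to run their test suites with `-W error::DeprecationWarning` to surface remaining callers.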
💡 Suggestions for Improvement
- OWASP Agentic Top 10 Compliance
  - Observation: The new policies address several OWASP Agentic Top 10 risks (e.g., ASI-8, ASI-10, LLM01). However, additional risks like ASI-2 (Data Poisoning) and ASI-6 (Over-reliance on AI) could be addressed.
  - Recommendation: Expand the policy configurations to include rules for detecting data poisoning attempts (e.g., adversarial examples) and scenarios where agents may over-rely on AI-generated outputs.
- Policy Validation
  - Observation: The YAML configurations are critical for security, but there is no mention of validation or schema enforcement.
  - Recommendation: Use Pydantic models to validate the structure and content of the YAML files before loading them into the policy engine.
- Documentation
  - Observation: While disclaimers are included in the sample configurations, the documentation does not provide sufficient guidance on customizing policies for specific use cases.
  - Recommendation: Expand the documentation to include:
    - Best practices for writing custom policies.
    - Examples of real-world scenarios and how to address them using the toolkit.
    - Guidance on testing and validating policies.
- Testing Coverage
  - Observation: The pull request does not include updates to the test suite for the new features.
  - Recommendation: Add unit tests for:
    - The `create_policies_from_config()` API.
    - Validation of YAML configurations.
    - Cryptographic operations in the DMZ module.
    - Thread safety in concurrent agent execution.
- Backward Compatibility
  - Observation: Deprecating `create_default_policies()` without a clear migration path may alienate existing users.
  - Recommendation: Maintain backward compatibility by providing a transitional period where both APIs are supported.
Conclusion
This pull request introduces valuable features and improvements, but there are critical security issues and potential breaking changes that need to be addressed before merging. The recommendations provided above should be implemented to ensure the robustness, security, and usability of the library.
Actions Required:
- Address the 🔴 CRITICAL security issues.
- Mitigate 🟡 WARNING potential breaking changes.
- Implement 💡 Suggestions for improvement.
Let me know if you need further assistance or clarification!
🤖 AI Agent: test-generator
🧪 Test Coverage Analysis
🤖 AI Agent: security-scanner
Security Review for

| Finding | Severity | Recommendation |
|---|---|---|
| Prompt injection defense bypass | 🔴 CRITICAL | Use semantic analysis and NLP for intent detection; update patterns regularly. |
| Policy engine circumvention | 🟠 HIGH | Validate YAML schemas; enforce access controls and monitor changes. |
| Trust chain weaknesses | 🔵 LOW | Use certificate pinning and secure signing for ESRP publishing infrastructure. |
| Credential exposure | 🔵 LOW | Add automated tests for credential detection patterns; provide secure practices. |
| Sandbox escape | 🟠 HIGH | Use deny-by-default for modules/functions; consider secure sandboxing frameworks. |
| Deserialization attacks | 🟠 HIGH | Use safe YAML parsers and validate against strict schemas. |
| Race conditions | 🟡 MEDIUM | Implement thread-safe mechanisms; test for concurrent access issues. |
| Supply chain risks | 🟠 HIGH | Use dependency pinning, audit dependencies, and implement package signing. |
Final Recommendation
While this pull request introduces valuable security enhancements, it also introduces potential risks, particularly in the areas of prompt injection defense, policy engine circumvention, sandbox escapes, deserialization attacks, and supply chain security. Addressing these issues is critical to ensure the robustness of the microsoft/agent-governance-toolkit.
…NGELOG Address PR review feedback — add create_policies_from_config() API, SQLPolicyConfig, 10 sample configs, and create_default_policies() deprecation to the v2.2.0 CHANGELOG section. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 AI Agent: breaking-change-detector
🔍 API Compatibility Report
Summary
This release introduces several new features, security enhancements, and infrastructure updates. While most changes are additive, there are some potentially breaking changes due to deprecations and modifications to existing APIs. No outright breaking changes were detected, but caution is advised for deprecated functionality.
Findings
Migration Guide
For
🤖 AI Agent: docs-sync-checker
📝 Documentation Sync Report
Issues Found
Suggestions
Type Hints
Final Assessment
The documentation is not yet in sync. Please address the issues and suggestions above to ensure consistency and clarity across the repository.
🤖 AI Agent: code-reviewer
Pull Request Review: Release v2.2.0
This release introduces significant updates to the microsoft/agent-governance-toolkit repository, including security fixes, configurable security policies, ESRP compliance, and hardening measures. Below is a detailed review of the changes based on the focus areas.
🔴 CRITICAL: Security Issues
- AES-256-GCM Implementation
  - The replacement of XOR placeholder encryption with AES-256-GCM is a critical improvement. However, ensure that:
    - The AES key is securely generated using a cryptographically secure random number generator (e.g., `os.urandom` or the `secrets` module).
    - The nonce/IV is unique for every encryption operation to prevent vulnerabilities like nonce reuse attacks.
    - Key management practices are robust (e.g., keys stored in a secure vault).
  - Action: Verify the implementation of AES-256-GCM in the DMZ module for adherence to cryptographic best practices.
- Sandbox Escape Vectors
  - The `sandbox-safety.yaml` configuration blocks dangerous modules (`subprocess`, `os`, `shutil`, etc.) and built-ins (`exec`, `eval`, etc.). However:
    - Ensure that dynamic imports via `importlib` or `__import__` are comprehensively blocked.
    - Validate that the sandbox cannot be bypassed through indirect means (e.g., loading malicious code via `pickle` or `marshal`).
  - Action: Conduct penetration testing to confirm the sandbox's robustness against escape vectors.
- SQL Policy Deny-List Expansion
  - Blocking dangerous SQL operations like `GRANT`, `REVOKE`, `CREATE USER`, `EXEC xp_cmdshell`, and `UPDATE` without `WHERE` is essential. However:
    - Ensure that the deny-list is applied consistently across all modules where SQL queries are processed.
    - Validate that the deny-list cannot be bypassed by obfuscation techniques (e.g., using comments or concatenation).
  - Action: Add unit tests to verify that all deny-list patterns are correctly enforced.
- Prompt Injection Safety
  - The `prompt-injection-safety.yaml` configuration includes patterns for detecting prompt injection attacks. However:
    - Ensure that the detection logic accounts for encoded payloads (e.g., Base64, hex) and multi-turn manipulation.
    - Validate that the sensitivity thresholds (`strict`, `balanced`, `permissive`) are correctly applied in runtime.
  - Action: Perform adversarial testing to confirm the effectiveness of prompt injection detection.
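For the sandbox item above, one way to catch both `import` statements and direct `__import__` calls is to walk the parsed syntax tree instead of matching text. The sketch below assumes hypothetical deny-lists loosely modelled on the modules and built-ins named in this review; the actual entries in `sandbox-safety.yaml` may differ. Static scanning of this kind is only a first filter: constructs such as `getattr(builtins, "ev" + "al")` would not be detected, so it does not replace runtime enforcement.

```python
import ast

# Hypothetical deny-lists; the real sandbox-safety.yaml entries may differ.
BLOCKED_MODULES = {"subprocess", "os", "shutil", "importlib", "pickle", "marshal"}
BLOCKED_CALLS = {"exec", "eval", "compile", "__import__"}


def find_sandbox_violations(source):
    """Return a sorted list of human-readable violations found in source."""
    violations = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers "import os" and "import os.path".
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.add(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # Covers "from pickle import loads"; relative imports have no module.
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.add(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Covers direct calls such as eval(...) or __import__("os").
            if node.func.id in BLOCKED_CALLS:
                violations.add(f"call {node.func.id}()")
    return sorted(violations)
```

For example, `find_sandbox_violations("m = __import__('os')")` flags the `__import__` call even though no `import` statement appears in the source.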
🟡 WARNING: Breaking Changes
- Deprecation of `create_default_policies()`
  - The deprecation of `create_default_policies()` in favor of `create_policies_from_config()` introduces a runtime warning. This change may break existing integrations that rely on the default policy creation method.
  - Action: Provide clear migration instructions in the documentation and ensure backward compatibility by maintaining the deprecated method for at least one major release cycle.
- Policy Configuration Externalization
  - Externalizing security policies to YAML files is a significant architectural change. Ensure that:
    - Existing users are informed about the migration path.
    - The new configuration system is backward-compatible with hardcoded defaults.
  - Action: Add a fallback mechanism to load hardcoded defaults if YAML files are missing or invalid.
💡 Suggestions for Improvement
- Thread Safety in Concurrent Agent Execution
  - While thread safety issues are mentioned in the security advisories, ensure that:
    - Shared resources (e.g., policy configurations, cryptographic keys) are protected using synchronization primitives like locks or semaphores.
    - The codebase is free from race conditions, especially in modules handling concurrent agent execution.
  - Action: Use tools like `pytest-xdist` to simulate concurrent execution and identify thread safety issues.
- OWASP Agentic Top 10 Compliance
  - The release addresses several OWASP Agentic Top 10 risks (e.g., ASI-1, ASI-8, ASI-10). However:
    - Consider adding explicit tests for ASI-2 (Agent Identity Spoofing) and ASI-7 (Agent Memory Manipulation).
    - Expand the `conversation-guardian.yaml` configuration to detect memory manipulation attempts.
  - Action: Add compliance tests for all OWASP Agentic Top 10 risks.
- Type Safety and Pydantic Model Validation
  - The introduction of `SQLPolicyConfig` and `load_sql_policy_config()` is a positive step. Ensure that:
    - All YAML configurations are validated using Pydantic models.
    - Validation errors are logged and handled gracefully.
  - Action: Add unit tests to verify the correctness of Pydantic model validation.
- Policy Examples
  - The addition of 10 sample policy configurations is helpful. However:
    - Include detailed documentation for each policy, explaining its purpose, usage, and limitations.
    - Provide guidelines for customizing the policies based on specific use cases.
  - Action: Expand the `examples/policies/` directory with README files for each policy.
Summary of Actions
Critical
- Verify AES-256-GCM implementation for cryptographic best practices.
- Test sandbox for escape vectors and indirect bypass methods.
- Add unit tests for SQL deny-list enforcement.
- Perform adversarial testing for prompt injection detection.
Warning
- Provide migration instructions for `create_default_policies()` deprecation.
- Ensure backward compatibility for policy configuration externalization.
Suggestions
- Test thread safety in concurrent agent execution.
- Add OWASP Agentic Top 10 compliance tests.
- Validate YAML configurations using Pydantic models.
- Document and expand sample policy configurations.
Final Verdict
This release introduces critical security improvements and valuable features, but it also carries potential breaking changes and areas requiring further validation. Address the flagged issues and suggestions to ensure a robust and secure release.
🤖 AI Agent: security-scanner
Security Review for PR:

| Category | Severity | Description |
|---|---|---|
| Prompt Injection Defense Bypass | 🔴 CRITICAL | Regex patterns are insufficient to detect advanced prompt injection techniques. |
| Policy Engine Circumvention | 🟠 HIGH | Lack of YAML validation allows potential policy circumvention. |
| Trust Chain Weaknesses | 🔵 LOW | No direct issues, but YAML configurations could indirectly weaken trust. |
| Credential Exposure | 🟠 HIGH | Detected sensitive data could be exposed in logs. |
| Sandbox Escape | 🔴 CRITICAL | Incomplete sandbox restrictions could allow escapes. |
| Deserialization Attacks | 🟠 HIGH | YAML deserialization may be unsafe. |
| Race Conditions | 🟡 MEDIUM | Potential for race conditions in YAML policy loading. |
| Supply Chain Risks | 🟠 HIGH | New dependencies lack evidence of auditing or integrity verification. |
Recommendations
- Prompt Injection Defense: Use NLP models and layered defenses to detect semantic intent.
- Policy Validation: Enforce strict schema validation and integrity checks for YAML files.
- Secure Logging: Sanitize logs to prevent credential exposure.
- Sandbox Hardening: Use secure sandboxing libraries and runtime enforcement.
- Safe Deserialization: Use `yaml.safe_load()` or equivalent secure deserialization methods.
- Concurrency Control: Implement file locking and atomic operations for YAML files.
- Dependency Security: Audit new dependencies and verify their integrity.
This PR introduces significant improvements but also opens critical security gaps. Addressing these findings is essential before merging.
🤖 AI Agent: test-generator
🧪 Test Coverage Analysis
The XOR→AES-256-GCM upgrade in dmz.py requires the cryptography package at runtime. Add it to dev dependencies so CI tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 AI Agent: breaking-change-detector
🔍 API Compatibility Report
Summary
The release introduces several new features, security enhancements, and configurable policies. However, it also includes breaking changes due to deprecations and modifications to existing APIs. These changes may impact downstream users relying on deprecated or altered functionality.
Findings
Migration Guide
For
🤖 AI Agent: docs-sync-checker
📝 Documentation Sync Report
Issues Found
Suggestions
Additional Notes
Action Items
Let me know if you need further assistance!
🤖 AI Agent: test-generator
🧪 Test Coverage Analysis
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces significant updates to the microsoft/agent-governance-toolkit repository, including configurable security policies, expanded SQL deny-lists, cryptographic improvements, and ESRP compliance. While these changes enhance the library's security posture and flexibility, they also introduce potential risks and breaking changes that require careful review.
🔴 CRITICAL Issues
- Sandbox Escape via Dynamic Imports
  - File: `examples/policies/sandbox-safety.yaml`
  - Issue: The sandbox configuration blocks `importlib` and `__import__`, but does not account for indirect dynamic imports via `types.ModuleType` or `sys.modules`. Attackers could bypass restrictions using these mechanisms.
  - Recommendation: Extend the sandbox policy to detect and block indirect dynamic imports. Consider implementing runtime checks for `sys.modules` manipulation.
- AES-256-GCM Implementation Validation
  - File: Not explicitly shown in the diff, but mentioned in the description.
  - Issue: The replacement of XOR encryption with AES-256-GCM is a critical improvement. However, the implementation details are not provided in the diff. Incorrect usage of AES-GCM (e.g., reusing nonces) can lead to catastrophic security failures.
  - Recommendation: Verify that the AES-GCM implementation uses unique nonces for every encryption operation and securely handles key management.
- Policy Engine False Negatives
  - File: `examples/policies/semantic-policy.yaml`, `examples/policies/prompt-injection-safety.yaml`
  - Issue: The regex patterns for detecting malicious behavior (e.g., SQL injection, prompt injection) may produce false negatives due to overly specific matching criteria. For example, patterns like `'\bDROP\s+(TABLE|DATABASE|INDEX|VIEW|SCHEMA)\b'` may miss obfuscated or unconventional SQL syntax.
  - Recommendation: Enhance regex patterns to account for obfuscation techniques (e.g., whitespace variations, comments, concatenation). Consider integrating semantic analysis or AST-based validation for more robust detection.
- Thread Safety in Concurrent Agent Execution
  - File: Not explicitly shown in the diff, but relevant to the policy engine changes.
  - Issue: The introduction of YAML-based configurable policies raises concerns about thread safety during concurrent agent execution. If policies are dynamically loaded or modified at runtime, race conditions could occur.
  - Recommendation: Ensure that policy loading and execution are thread-safe. Use locks or immutable data structures to prevent concurrent modification.
🟡 WARNING: Potential Breaking Changes
- Deprecation of `create_default_policies()`
  - File: Not explicitly shown in the diff, but mentioned in the description.
  - Issue: The deprecation of `create_default_policies()` introduces a runtime warning, which may break existing workflows relying on this method.
  - Recommendation: Provide a migration guide in the release notes, including examples of how to transition to `create_policies_from_config()`.
- Expanded SQL Deny-List
  - File: `CHANGELOG.md`
  - Issue: Blocking additional SQL commands (e.g., `GRANT`, `REVOKE`, `CREATE USER`) may cause existing applications to fail if they rely on these operations.
  - Recommendation: Clearly document the changes in the release notes and provide guidance on how users can customize the deny-list to suit their needs.
💡 Suggestions for Improvement
- Policy Validation Framework
  - File: `examples/policies/`
  - Suggestion: Implement a validation framework to ensure that YAML policy files conform to expected schemas and do not contain syntax errors or invalid configurations. Use Pydantic models for schema validation.
- OWASP Agentic Top 10 Compliance
  - File: `examples/policies/semantic-policy.yaml`, `examples/policies/conversation-guardian.yaml`
  - Suggestion: Expand the policy configurations to address OWASP Agentic Top 10 risks comprehensively. For example, include rules for detecting unauthorized data aggregation (ASI-6) and feedback loop exploitation (ASI-10).
- Backward Compatibility Layer
  - File: Not explicitly shown in the diff.
  - Suggestion: Provide a backward compatibility layer for deprecated APIs like `create_default_policies()` to minimize disruption for existing users.
- Documentation Enhancements
  - File: `SECURITY.md`, `README.md`
  - Suggestion: Include detailed examples of how to use the new `create_policies_from_config()` API and YAML configurations. Highlight best practices for customizing policies.
- Test Coverage for Configurable Policies
  - File: Not explicitly shown in the diff.
  - Suggestion: Add comprehensive test cases for the new YAML-based policy configurations to ensure correctness and prevent regressions.
Final Assessment
- Security: 🔴 Critical issues identified, including sandbox escape vectors and potential AES-GCM misuse.
- Backward Compatibility: 🟡 Breaking changes due to API deprecation and expanded deny-lists.
- Improvements: 💡 Suggestions provided for enhancing policy robustness, documentation, and test coverage.
Action Required: Address critical issues before merging. Ensure thorough testing and documentation updates to mitigate risks and ease migration for existing users.
🤖 AI Agent: security-scanner
Security Review of PR: release: v2.2.0 — ESRP compliance, configurable security policies, hardening
1. Prompt Injection Defense Bypass
Severity: 🔴 CRITICAL
Issue:
Attack Vector:
Recommendation:
2. Policy Engine Circumvention
Severity: 🟠 HIGH
Issue:
Attack Vector:
Recommendation:
3. Trust Chain Weaknesses
Severity: 🔵 LOW
Issue:
Attack Vector:
Recommendation:
4. Credential Exposure
Severity: 🟡 MEDIUM
Issue:
Attack Vector:
Recommendation:
5. Sandbox Escape
Severity: 🔴 CRITICAL
Issue:
Attack Vector:
Recommendation:
6. Deserialization Attacks
Severity: 🟠 HIGH
Issue:
Attack Vector:
Recommendation:
7. Race Conditions
Severity: 🔵 LOW
Issue:
Attack Vector:
Recommendation:
8. Supply Chain
Severity: 🟠 HIGH
Issue:
Attack Vector:
Recommendation:
Summary of Findings
Final Recommendation
This PR introduces significant improvements to the toolkit's security posture, but it also introduces critical risks, particularly around prompt injection defenses, sandbox escapes, and deserialization. These issues must be addressed before merging to ensure the toolkit remains a robust security layer for downstream users.
Release v2.2.0
Merges the release/v2.2.0 branch into main. Tagged as v2.2.0.
Security Fixes
Configurable Security Policies
ESRP Publishing Infrastructure
Package Metadata & Disclaimers