feat: add policy YAML validation command to CLI#665
feat: add policy YAML validation command to CLI#665imran-siddique merged 2 commits intomicrosoft:mainfrom
Conversation
|
Welcome to the Agent Governance Toolkit! Thanks for your first pull request. |
🤖 AI Agent: breaking-change-detector — Summary🔍 API Compatibility ReportSummaryThis pull request introduces new CLI commands for policy validation, testing, and comparison. It also enhances the existing Findings
Migration GuideNo migration is required as no breaking changes were introduced. Notes
✅ No breaking changes detected. |
🤖 AI Agent: docs-sync-checker — Issues Found📝 Documentation Sync ReportIssues Found
Suggestions
Additional Notes
Action Items
Once these issues are addressed, the documentation will be in sync with the code changes. |
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces a new CLI command, agentos policy validate, which validates YAML policy files against a JSON schema and performs Pydantic validation. It also includes new interactive Jupyter notebooks for demonstrating policy enforcement, MCP security, and multi-agent governance. While the PR introduces valuable features, there are some critical security concerns, potential breaking changes, and areas for improvement.
🔴 CRITICAL
-
Insufficient Validation of YAML Input
- Issue: The
_validate_yaml_with_line_numbers()function does not sanitize or validate the YAML input for potential malicious payloads. YAML files can include arbitrary Python objects, which could lead to code execution vulnerabilities. - Impact: This could allow an attacker to execute arbitrary code by injecting malicious payloads into the YAML file.
- Recommendation: Use
yaml.safe_load()instead ofyaml.load()to prevent the execution of arbitrary code. Ifsafe_load()is already being used, ensure that it is explicitly mentioned in the code or documentation.
- Issue: The
-
Potential Bypass of JSON Schema Validation
- Issue: The JSON schema validation is described as "best-effort," which implies that it might not catch all invalid inputs. This could lead to security vulnerabilities if the policy engine relies on the validity of the schema for enforcement.
- Impact: A malicious or malformed policy file could bypass validation and introduce vulnerabilities into the system.
- Recommendation: Ensure that the JSON schema is comprehensive and rigorously tested. Consider adding unit tests to validate edge cases and ensure the schema is robust.
-
Regex Injection in MCP Security Scanner
- Issue: The
scan_tool_definitionfunction uses user-provided input (description) directly in regex operations. If the regex patterns are not properly sanitized, this could lead to regex injection vulnerabilities. - Impact: An attacker could craft a malicious tool description to exploit the regex engine, potentially causing denial of service (ReDoS) or other unexpected behavior.
- Recommendation: Use a library like
re.escape()to sanitize user-provided input before using it in regex operations.
- Issue: The
-
Lack of Thread Safety in Circuit Breaker Implementation
- Issue: The
simulate_callandcheck_circuit_breakerfunctions modify sharedAgentMetricsobjects without any synchronization mechanisms. This could lead to race conditions in a multi-threaded environment. - Impact: Inconsistent or incorrect metrics could lead to incorrect circuit breaker behavior, potentially allowing agents to operate in unsafe conditions.
- Recommendation: Use thread-safe data structures or synchronization primitives (e.g.,
threading.Lock) to ensure thread safety when modifying shared state.
- Issue: The
🟡 WARNING
- Backward Compatibility
- Issue: The addition of the
agentos policysubcommand introduces new functionality to the CLI. While this is not a breaking change, it is important to ensure that existing commands and workflows are not affected. - Recommendation: Verify that the new subcommand does not interfere with existing CLI commands. Add tests to ensure backward compatibility.
- Issue: The addition of the
💡 SUGGESTIONS
-
Error Reporting
- Observation: The error reporting for the
agentos policy validatecommand is functional but could be improved for user experience. - Suggestion: Consider adding color-coded output (e.g., using the
coloramalibrary) to make errors and warnings more visually distinct. Additionally, provide suggestions for fixing common errors when possible.
- Observation: The error reporting for the
-
Documentation
- Observation: The new CLI commands and Jupyter notebooks are valuable additions, but the documentation does not appear to have been updated to reflect these changes.
- Suggestion: Update the README and any relevant documentation to include details about the new CLI commands and how to use the Jupyter notebooks.
-
Testing
- Observation: While the PR mentions that the
agentos policy validatecommand is "CI-friendly," there is no evidence of automated tests for this functionality. - Suggestion: Add unit tests and integration tests for the new CLI commands, especially for edge cases in YAML parsing, JSON schema validation, and Pydantic validation.
- Observation: While the PR mentions that the
-
Notebook Security
- Observation: The Jupyter notebooks include code that simulates potentially dangerous operations (e.g.,
exec('/bin/sh')in the MCP Security Proxy notebook). - Suggestion: Add clear warnings in the notebook headers to indicate that the code is for educational purposes only and should not be used in production environments without proper safeguards.
- Observation: The Jupyter notebooks include code that simulates potentially dangerous operations (e.g.,
-
Code Quality
- Observation: The
_validate_yaml_with_line_numbers()and_load_json_schema()helper functions are not documented. - Suggestion: Add docstrings to these functions to improve code readability and maintainability.
- Observation: The
-
Error Handling
- Observation: The
agentos policy validatecommand exits with a non-zero code on failure, but the specific exit codes are not documented. - Suggestion: Document the possible exit codes and their meanings in the CLI help text or documentation.
- Observation: The
Conclusion
This PR introduces useful features and enhancements to the agentos CLI and the Jupyter notebooks. However, there are critical security issues that must be addressed before merging. Additionally, improvements to documentation, testing, and error reporting would enhance the overall quality of the contribution.
🤖 AI Agent: test-generator — `packages/agent-os/src/agent_os/cli/__init__.py`🧪 Test Coverage Analysis
|
🤖 AI Agent: security-scanner — Security Review of `feat: add policy YAML validation command to CLI`Security Review of
|
| Finding | Severity | Recommendation |
|---|---|---|
| Prompt injection defense bypass | 🟠 HIGH | Sanitize and escape user-provided input to prevent prompt injection. |
| Policy engine circumvention | 🟠 HIGH | Secure the JSON schema file and verify its integrity at runtime. |
| Trust chain weaknesses | 🟡 MEDIUM | Pin jsonschema to a specific, secure version. |
| Credential exposure | 🟡 MEDIUM | Mask sensitive fields in logs and error messages. |
| Sandbox escape | 🔵 LOW | No immediate action required; yaml.safe_load is used. |
| Deserialization attacks | 🔵 LOW | Continue using yaml.safe_load. |
| Race conditions | 🔵 LOW | No immediate action required. |
| Supply chain risks | 🟠 HIGH | Pin dependencies to specific versions and use a dependency scanning tool. |
Overall Risk Assessment: 🟠 HIGH
This PR introduces useful functionality but also presents several security risks, particularly around prompt injection, policy engine circumvention, and supply chain vulnerabilities. Addressing these issues is critical to ensure the security of the toolkit and its downstream users.
|
@microsoft-github-policy-service agree |
059e587 to
415c4af
Compare
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Feedback on the Pull Request
🔴 CRITICAL Issues
-
Improper Handling of
jsonschemaImport Failure:- In
_validate_yaml_with_line_numbers, thejsonschemalibrary is imported inside the function, and if the import fails, the code silently skips JSON Schema validation. This could lead to undetected policy violations if the library is not installed. - Recommendation: Make
jsonschemaa required dependency in thesetup.pyorpyproject.tomlfile. If this is not feasible, explicitly log an error and exit with a non-zero status if the library is not installed.
- In
-
Insufficient Validation of Policy Schema:
- The
_validate_yaml_with_line_numbersfunction performs some basic checks (e.g., required fields, rule structure), but these checks are not comprehensive. For example:- The
rulesfield is only checked for being a list, but the individual rule objects are not fully validated against the schema. - The
actionfield is validated against a hardcoded list of valid actions, but this list may not stay in sync with the JSON Schema.
- The
- Recommendation: Use
jsonschemaexclusively for schema validation and avoid duplicating validation logic in_validate_yaml_with_line_numbers. This ensures consistency and reduces the risk of discrepancies between the schema and the code.
- The
-
Potential for Sandbox Escape in YAML Parsing:
- The
yaml.safe_loadfunction is used for parsing YAML files. While this is safer thanyaml.load, it is still vulnerable to certain attacks, such as those exploiting YAML tags. - Recommendation: Use a library like
ruamel.yamlor a stricter YAML parser to avoid potential sandbox escape vulnerabilities.
- The
-
Error Reporting for YAML Parsing:
- The error messages for YAML parsing issues are not sanitized. If an attacker provides a malicious YAML file, the error message could potentially leak sensitive information or be used for injection attacks.
- Recommendation: Sanitize error messages before displaying them to the user.
🟡 WARNING Issues
-
Backward Compatibility Risk:
- The addition of the
agentos policysubcommand introduces a new CLI structure. If users were relying on the existingvalidatecommand, this change could break their workflows. - Recommendation: Provide a deprecation warning for the old
validatecommand and ensure that it continues to work for a transitional period.
- The addition of the
-
Potential Breaking Changes in Policy Validation:
- The new validation logic introduces stricter checks (e.g., required fields, rule structure). Policies that previously passed validation may now fail.
- Recommendation: Clearly document these changes in the release notes and provide a migration guide for users to update their policies.
💡 Suggestions for Improvement
-
Improve Error Messages:
- The error messages in
_validate_yaml_with_line_numbersare functional but could be more user-friendly. For example, instead ofrules[2] -> action, consider using a more descriptive format likeRule 2: Invalid action field. - Recommendation: Use a consistent and user-friendly format for error messages.
- The error messages in
-
Add Unit Tests for CLI Commands:
- The new
agentos policysubcommands (validate,test,diff) should be thoroughly tested to ensure they work as expected. - Recommendation: Add unit tests for each subcommand, covering both success and failure scenarios.
- The new
-
Thread Safety in Policy Validation:
- The
_validate_yaml_with_line_numbersfunction and related logic do not appear to have any thread safety issues, but this should be explicitly verified if the CLI is expected to handle concurrent requests (e.g., in a CI/CD pipeline). - Recommendation: Add tests to verify thread safety in concurrent execution scenarios.
- The
-
Support for JSON Input:
- The
agentos policy validatecommand currently supports YAML files but does not explicitly mention support for JSON files. - Recommendation: Add explicit support for JSON files and update the documentation accordingly.
- The
-
Improve Documentation:
- The docstrings for the new functions and methods are detailed, but the overall documentation for the
agentos policysubcommands could be improved. - Recommendation: Add examples and usage instructions for the new subcommands in the CLI documentation.
- The docstrings for the new functions and methods are detailed, but the overall documentation for the
-
Use of
pathlibfor File Handling:- The code uses
Path.read_text()andPath.write_text()for file I/O, which is good. However, there are some instances whereopen()is still used (e.g., incmd_validate). - Recommendation: Use
pathlibconsistently for file handling.
- The code uses
-
Consider Adding a Dry-Run Mode:
- For the
agentos policy testandagentos policy diffsubcommands, a dry-run mode could be useful for users who want to preview the results without making changes. - Recommendation: Add a
--dry-runflag to these subcommands.
- For the
-
Logging Improvements:
- The CLI commands use
print()for output, which is fine for user-facing messages but not ideal for logging errors or debugging information. - Recommendation: Use the
loggingmodule for error and debug messages.
- The CLI commands use
Summary of Changes Needed
| Type | Description |
|---|---|
| 🔴 CRITICAL | Ensure jsonschema is a required dependency or handle its absence explicitly. |
| 🔴 CRITICAL | Use jsonschema exclusively for schema validation to avoid duplication. |
| 🔴 CRITICAL | Use a stricter YAML parser to prevent sandbox escape vulnerabilities. |
| 🔴 CRITICAL | Sanitize error messages to prevent information leakage or injection attacks. |
| 🟡 WARNING | Maintain backward compatibility for the validate command. |
| 🟡 WARNING | Document stricter validation rules and provide a migration guide. |
| 💡 SUGGESTION | Improve error messages, documentation, and add unit tests for new commands. |
| 💡 SUGGESTION | Add support for JSON input and a dry-run mode for certain subcommands. |
| 💡 SUGGESTION | Use pathlib consistently and improve logging practices. |
By addressing these issues and suggestions, the new feature will be more robust, secure, and user-friendly.
imran-siddique
left a comment
There was a problem hiding this comment.
Thanks for the contribution! Found 3 blocking issues:
- cmd_serve is undefined — NameError crashes the entire CLI (not just serve). Remove from commands dict until implemented.
- Metrics subparser deleted — agentos metrics is now broken. Restore the metrics subparser registration.
- Orphaned code on success path — cmd_validate falls through to old code referencing undefined all_valid. Add explicit return 0 and remove leftover results loop.
Also: --json output support was silently dropped (breaking for CI consumers), and --strict flag is referenced but never registered in argparse.
The YAML validation logic itself is well-done (yaml.safe_load, graceful jsonschema fallback). Just needs the wiring fixes above.
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review Summary
This pull request introduces a new agentos policy validate <file> CLI command, along with related subcommands for testing and diffing policy files. The implementation includes YAML parsing, JSON Schema validation, and Pydantic-based structural validation. The PR also includes updates to the crewai_adapter.py and google_adk_adapter.py files to add docstrings and enhance policy enforcement mechanisms.
The changes are well-structured and provide a comprehensive validation pipeline for policy files. However, there are a few areas that require attention, particularly around security, backward compatibility, and potential improvements.
🔴 CRITICAL Issues
-
JSON Schema Validation Dependency Handling
- The
_validate_yaml_with_line_numbersfunction silently skips JSON Schema validation if thejsonschemalibrary is not installed. This could lead to incomplete validation and potential security bypasses. - Action: Explicitly check for the presence of
jsonschemaduring initialization or command execution and raise an error if it is missing. Alternatively, make it a required dependency.
- The
-
Error Handling in Policy Validation
- In the
cmd_validatefunction, exceptions during YAML parsing or validation are caught and logged but not re-raised. This could lead to silent failures in CI pipelines if the error is not properly propagated. - Action: Ensure that critical errors (e.g., YAML parsing issues) result in a non-zero exit code and are clearly communicated to the user.
- In the
-
Policy Violation Handling
- The
_default_violation_handleringoogle_adk_adapter.pyonly logs policy violations. This could allow violations to go unnoticed in production environments. - Action: Consider implementing stricter handling for policy violations, such as halting execution or triggering alerts, depending on the severity of the violation.
- The
🟡 WARNING: Potential Breaking Changes
-
CLI Argument Changes
- The addition of new CLI commands (
policy validate,policy test,policy diff) and changes to thevalidatecommand (e.g.,--strictflag) could potentially break existing workflows or scripts that rely on the old CLI behavior. - Action: Clearly document these changes in the release notes and provide migration guidance for users.
- The addition of new CLI commands (
-
Behavior of
validateCommand- The
validatecommand now defaults to validating all YAML files in the.agents/directory if no files are specified. This behavior change could lead to unexpected results for users who previously relied on the command's old behavior. - Action: Add a deprecation warning for the old behavior and provide a clear transition path.
- The
💡 Suggestions for Improvement
-
Error Reporting
- The error messages generated during validation could include more actionable details, such as suggestions for fixing the issue.
- Suggestion: Enhance error messages to include links to documentation or examples for resolving common issues.
-
Test Coverage
- Ensure that the new CLI commands and validation logic are thoroughly tested, including edge cases like empty files, invalid YAML, and unsupported schema versions.
- Suggestion: Add unit tests and integration tests for the new functionality.
-
Strict Mode Behavior
- The
--strictflag currently treats unknown top-level fields as warnings. Consider making this behavior configurable to allow users to decide whether unknown fields should be treated as errors. - Suggestion: Add a configuration option to control the behavior of strict mode.
- The
-
Performance Optimization
- The
cmd_validatefunction reads and parses each file multiple times (e.g., once for YAML parsing and again for validation). This could be optimized to reduce redundant I/O operations. - Suggestion: Parse the file once and reuse the parsed content for validation.
- The
-
Documentation
- The new CLI commands and their options should be documented in the project's README or user guide.
- Suggestion: Update the documentation to include examples and use cases for the new commands.
-
Backward Compatibility
- Consider adding a fallback mechanism for users who may not have the
policy_schema.jsonfile available. - Suggestion: Provide a default schema or a way to fetch it dynamically if it is missing.
- Consider adding a fallback mechanism for users who may not have the
Final Notes
The PR introduces valuable functionality for policy validation and governance, which is critical for maintaining security and compliance. Addressing the critical issues and warnings will ensure that the changes are robust and user-friendly. The suggestions provided can further enhance the usability and maintainability of the code.
a108d53 to
4acbfab
Compare
There was a problem hiding this comment.
🤖 AI Agent: code-reviewer
Review of feat: add policy YAML validation command to CLI
🔴 CRITICAL
-
Unvalidated Input in YAML Parsing:
- The
_validate_yaml_with_line_numbers()function directly processes thecontentdictionary parsed from the YAML file without verifying its structure beforehand. This could lead to unexpected behavior or security vulnerabilities if the input is maliciously crafted. - Recommendation: Add a pre-validation step to ensure that the
contentdictionary conforms to expected types and structure before proceeding with further validation. For example:if not isinstance(content, dict): raise ValueError(f"{filepath}: Top-level content must be a dictionary.")
- The
-
Potential Sandbox Escape via YAML Parsing:
- The
yaml.safe_load()function is used for parsing YAML files, which is generally safer thanyaml.load(). However, it is still susceptible to certain edge cases, such as maliciously crafted YAML files that could exploit the Python runtime. - Recommendation: Consider using a stricter YAML parser or explicitly validating the parsed content against a known schema before processing it further.
- The
-
JSON Schema Validation Error Handling:
- The
jsonschema.Draft7Validatoris used for schema validation, but there is no explicit handling for potential exceptions (e.g.,jsonschema.exceptions.SchemaErrororjsonschema.exceptions.ValidationError) that could occur during validation. - Recommendation: Add explicit exception handling for
jsonschemaerrors to ensure that the program does not crash unexpectedly.
- The
🟡 WARNING
- Backward Compatibility:
- The addition of the
agentos policysubcommand and its subcommands (validate,test,diff) introduces new CLI functionality. While this is a feature addition, it could potentially break existing workflows if users rely on the previous behavior of thevalidatecommand. - Recommendation: Clearly document the changes in the release notes and provide a deprecation warning for any old commands that are being replaced.
- The addition of the
💡 SUGGESTIONS
-
Error Messages for Users:
- The error messages generated by
_validate_yaml_with_line_numbers()are detailed, but they could be overwhelming for users. Consider categorizing errors and warnings more clearly in the output, or providing a summary of the issues at the end of the validation process.
- The error messages generated by
-
Strict Mode Behavior:
- In strict mode, unknown top-level fields are reported as warnings. However, the behavior of treating warnings as errors is not implemented. If strict mode is enabled, warnings should also result in a non-zero exit code.
- Recommendation: Modify the return logic in
cmd_validate()to account for warnings whenargs.strictisTrue.
-
Unit Tests:
- Ensure that there are comprehensive unit tests for the new functionality, including edge cases such as:
- Invalid YAML syntax
- Missing required fields
- Invalid field types
- Unknown fields in strict mode
- Valid policies
- Recommendation: Add tests for these scenarios if they are not already covered.
- Ensure that there are comprehensive unit tests for the new functionality, including edge cases such as:
-
Performance Considerations:
- The
_validate_yaml_with_line_numbers()function performs multiple passes over thecontentdictionary. For large policy files, this could impact performance. - Recommendation: Optimize the validation logic to minimize redundant iterations over the same data structure.
- The
-
Error Reporting Consistency:
- The error messages for JSON Schema validation and manual checks use different formats. For example:
- JSON Schema:
"{filepath}: [{location}] {ve.message}" - Manual checks:
"{filepath}: Missing required field: '{field}'"
- JSON Schema:
- Recommendation: Standardize the error message format for consistency.
- The error messages for JSON Schema validation and manual checks use different formats. For example:
-
Logging and Debugging:
- The current implementation uses
print()statements for output. While this is fine for CLI tools, it may be beneficial to integrate a logging framework (e.g., Python'sloggingmodule) for better debugging and log management. - Recommendation: Replace
print()with a logging framework and provide different log levels (e.g.,info,warning,error).
- The current implementation uses
-
Documentation:
- The new
agentos policysubcommands (validate,test,diff) should be documented in the project's README or CLI documentation. This will help users understand the new functionality and how to use it effectively.
- The new
-
Type Annotations:
- The
_validate_yaml_with_line_numbers()function uses type annotations, but thefilepathparameter is annotated asPath, while the function uses it as a string in error messages. This could lead to type-related issues. - Recommendation: Use
strinstead ofPathfor thefilepathparameter, or explicitly convertPathtostrwhen constructing error messages.
- The
-
Default Behavior for
agentos policy validate:- The
cmd_validate()function defaults to validating all YAML files in the.agents/directory if no files are specified. This behavior might be unexpected for users who are not aware of the default. - Recommendation: Prompt the user to confirm the default behavior or provide a clear warning message.
- The
-
Dependency Management:
- The
jsonschemalibrary is used for JSON Schema validation, but it is not clear if it is listed as a required dependency in therequirements.txtorpyproject.toml. - Recommendation: Ensure that
jsonschemais listed as a required dependency and document its purpose in the project.
- The
Summary
- 🔴 CRITICAL: Address unvalidated input, potential sandbox escape, and JSON Schema error handling.
- 🟡 WARNING: Ensure backward compatibility and document changes clearly.
- 💡 SUGGESTIONS: Improve error reporting, add unit tests, optimize performance, and enhance documentation.
Let me know if you'd like me to review specific parts of the code or provide additional recommendations!
|
Hi @imran-siddique — addressed all 3 blocking issues in the latest commit:
Also rebased against the latest main to resolve the adapter file conflicts. Ready for re-review! |
imran-siddique
left a comment
There was a problem hiding this comment.
All 3 blocking issues addressed. LGTM.
Closes #529
Summary
Added
agentos policy validate <file>CLI command that:jsonschema(best-effort)PolicyDocumentrules[2] -> action)New commands
agentos policy validate <file>— validate a single policy fileagentos policy test <policy> <scenarios>— run scenario testsagentos policy diff <file1> <file2>— compare two policiesChanges
packages/agent-os/src/agent_os/cli/__init__.py:cmd_validatewith JSON Schema + field location error reporting_validate_yaml_with_line_numbers()helper_load_json_schema()helpercmd_policy()dispatcheragentos policysubcommand in the CLI