feat: Add standardized --json output support across all CLI commands. #553

Closed
Deadpool2000 wants to merge 8 commits into microsoft:main from Deadpool2000:fix-issue-525

@Deadpool2000
Contributor

@Deadpool2000 Deadpool2000 commented Mar 28, 2026

Description

This PR standardizes the CLI output across all major toolkit components by adding a consistent --json flag and implementing structured, machine-readable JSON output for all key commands.

Key improvements include:

  • A uniform --json flag for more than 25 commands across agentos, iatp, mcp-scan, acp-cli, and agentmesh.
  • Centralized output format detection for cleaner command handlers.
  • Structured JSON schemas for status checks, metrics, validation results, and security scans to facilitate seamless integration with CI/CD pipelines and external automation tools.
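
A minimal sketch of what a uniform flag plus centralized format detection could look like, assuming an argparse-based CLI. The names build_parser, wants_json, and render are hypothetical and are not taken from the PR diff.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real CLIs define their own subcommands.
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument(
        "--json",
        dest="output_json",
        action="store_true",
        help="Emit machine-readable JSON instead of human-readable text",
    )
    return parser


def wants_json(args: argparse.Namespace) -> bool:
    # Single place where the output format is decided.
    return bool(getattr(args, "output_json", False))


def render(result: dict, args: argparse.Namespace) -> str:
    # Command handlers build a dict and leave formatting to this helper.
    if wants_json(args):
        return json.dumps(result, indent=2, sort_keys=True)
    return "\n".join(f"{key}: {value}" for key, value in result.items())


# Explicit argument lists are used so the example does not depend on sys.argv.
json_args = build_parser().parse_args(["--json"])
text_args = build_parser().parse_args([])
print(render({"status": "ok"}, json_args))
print(render({"status": "ok"}, text_args))
```

Keeping the decision in one helper means a handler never checks the flag itself, which is one way to read "cleaner command handlers" above.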

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Maintenance (dependency updates, CI/CD, refactoring)
  • Security fix

Package(s) Affected

  • agent-os-kernel
  • agent-mesh
  • agent-runtime
  • agent-sre
  • agent-governance
  • docs / root

Checklist

  • My code follows the project style guidelines (ruff check)
  • I have added tests that prove my fix/feature works
  • All new and existing tests pass (pytest)
  • I have updated documentation as needed
  • I have signed the Microsoft CLA

@github-actions

Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
Please ensure tests pass, code follows style (ruff check), and you have signed the CLA.
See our Contributing Guide.

@github-actions github-actions bot added the agent-mesh label (agent-mesh package) Mar 28, 2026
@github-actions

github-actions bot commented Mar 28, 2026

🤖 AI Agent: breaking-change-detector — Summary

🔍 API Compatibility Report

Summary

This pull request introduces a standardized --json flag across CLI commands for structured JSON output. The changes are primarily additive, with no detected breaking changes to the existing API. The modifications enhance error handling and output formatting but do not alter existing function signatures or remove public APIs.

Findings

| Severity | Package | Change | Impact |
|---|---|---|---|
| 🔵 | agent-os-kernel | Added --json flag to CLI commands | New feature, no impact on existing functionality |
| 🔵 | agent-mesh | Added --json flag to CLI commands | New feature, no impact on existing functionality |
| 🔵 | agent-runtime | Added --json flag to CLI commands | New feature, no impact on existing functionality |
| 🔵 | agent-governance | Enhanced error handling with JSON output | New feature, no impact on existing functionality |

Migration Guide

No migration steps are required as there are no breaking changes. Downstream users can optionally adopt the new --json flag for machine-readable output.

Additional Notes

  • The --json flag provides structured output for CLI commands, which is beneficial for automation and integration with CI/CD pipelines.
  • Enhanced error handling ensures sanitized JSON error messages, improving security and usability.
  • The changes are well-documented in the updated QUICKSTART.md and tutorials.

No breaking changes detected.

@github-actions

github-actions bot commented Mar 28, 2026

🤖 AI Agent: docs-sync-checker — Issues Found

📝 Documentation Sync Report

Issues Found

  • cmd_verify(args: argparse.Namespace) -> int in packages/agent-compliance/src/agent_compliance/cli/main.py — missing docstring.
  • cmd_integrity(args: argparse.Namespace) -> int in packages/agent-compliance/src/agent_compliance/cli/main.py — missing docstring.
  • cmd_lint_policy(args: argparse.Namespace) -> int in packages/agent-compliance/src/agent_compliance/cli/main.py — missing docstring.
  • init(name: str, sponsor: str, output: str, output_json: bool) in packages/agent-mesh/src/agentmesh/cli/main.py — missing docstring.
  • register(agent_dir: str, name: str = None, output_json: bool = False) in packages/agent-mesh/src/agentmesh/cli/main.py — missing docstring.
  • status(agent_dir: str, output_json: bool) in packages/agent-mesh/src/agentmesh/cli/main.py — missing docstring.
  • policy(policy_file: str, validate: bool, output_json: bool) in packages/agent-mesh/src/agentmesh/cli/main.py — missing docstring.
  • audit(agent: str, limit: int, fmt: str, output_json: bool) in packages/agent-mesh/src/agentmesh/cli/main.py — missing docstring.
  • ⚠️ packages/agent-mesh/README.md — The new --json flag and its behavior are not documented.
  • ⚠️ CHANGELOG.md — The addition of the --json flag and standardized JSON output is not explicitly mentioned under the "New Features" section.

Suggestions

  • 💡 Add docstrings for the following functions:
    • cmd_verify(args: argparse.Namespace) -> int: Explain the purpose of the function, parameters, return values, and exceptions.
    • cmd_integrity(args: argparse.Namespace) -> int: Provide details on its functionality and behavior.
    • cmd_lint_policy(args: argparse.Namespace) -> int: Include information on what the function does and its parameters.
    • init(name: str, sponsor: str, output: str, output_json: bool): Describe the purpose of the function and the new output_json parameter.
    • register(agent_dir: str, name: str = None, output_json: bool = False): Add details about the parameters and the new output_json flag.
    • status(agent_dir: str, output_json: bool): Document the function's purpose and the output_json parameter.
    • policy(policy_file: str, validate: bool, output_json: bool): Explain the function and its parameters, especially the output_json flag.
    • audit(agent: str, limit: int, fmt: str, output_json: bool): Provide a description of the function and its parameters, including the output_json flag.
  • 💡 Update packages/agent-mesh/README.md to include documentation for the new --json flag and its behavior across commands.
  • 💡 Add an entry in the CHANGELOG.md under the "New Features" section to explicitly mention the addition of the --json flag and standardized JSON output for CLI commands.
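
As an illustration of the docstring suggestion, here is a sketch for the status signature listed above. The function body and the return annotation are placeholders assumed for the example; only the parameter names come from the report.

```python
def status(agent_dir: str, output_json: bool) -> dict:
    """Report the status of the agent stored in ``agent_dir``.

    Args:
        agent_dir: Path to the agent's working directory.
        output_json: If True, the result is meant to be rendered as JSON
            instead of human-readable text.

    Returns:
        A dictionary describing the agent's status.
    """
    # Placeholder body for illustration only; the real command does more.
    return {"agent_dir": agent_dir, "format": "json" if output_json else "text"}


print(status.__doc__.splitlines()[0])
```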

Additional Notes

  • The QUICKSTART.md and docs/tutorials/04-audit-and-compliance.md have been updated to include details about the secure error handling and JSON output. This is a good step, but the main README.md for agent-mesh should also reflect these changes.
  • The new public APIs have type hints, which is good. However, they lack docstrings, which are essential for understanding their purpose and usage.
  • The example code in QUICKSTART.md and docs/tutorials/04-audit-and-compliance.md appears to be updated to reflect the new --json functionality.

Please address the issues and suggestions above to ensure documentation is fully in sync with the code changes.

@github-actions github-actions bot added the size/XL label (Extra large PR, 500+ lines) Mar 28, 2026
@github-actions

🤖 AI Agent: contributor-guide — Welcome 🎉

Hi @first-time-contributor! 👋

Welcome to the microsoft/agent-governance-toolkit community, and thank you for your contribution! We're thrilled to have you here. Your effort to enhance the CLI with standardized --json output is a fantastic addition that will undoubtedly improve the developer experience and enable better automation workflows. Great job on tackling such an impactful feature!


What You Did Well ✅

  1. Comprehensive Implementation: You've done an excellent job of implementing the --json flag across multiple commands and ensuring consistency in the output format. This will make it much easier for users to integrate the toolkit into their CI/CD pipelines.

  2. Thoughtful Design: The use of structured JSON schemas for different outputs (e.g., status checks, metrics, validation results) is a thoughtful touch. It shows that you've considered how users will consume this data programmatically.

  3. Backward Compatibility: By introducing the --json flag as an optional parameter, you've ensured that existing workflows relying on text-based output won't break. This is a great example of maintaining backward compatibility.

  4. Code Quality: Your code is clean, and the use of conditional checks for output_json is well-structured and easy to follow.


Suggestions for Improvement ✨

While your PR is already in great shape, here are a few suggestions to align it more closely with the project's conventions and ensure the highest quality:

1. Testing

  • It's great to see that you've added tests for your changes! However, I noticed that the tests are not included in the packages/{name}/tests/ directory, which is the convention for this project. Could you move your tests to the appropriate directory for each package? This will help maintain consistency and make it easier for others to find and run the tests.
  • Additionally, consider adding test cases for edge scenarios, such as:
    • Running commands without the --json flag to ensure text-based output remains unaffected.
    • Testing invalid inputs or error scenarios with the --json flag to verify that error messages are correctly formatted in JSON.
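
The two edge scenarios above could be covered with tests along these lines. This is only a sketch: show_status is a hypothetical stand-in for a real command handler, and stdout is captured with the standard library rather than a pytest fixture so the example stays self-contained.

```python
import contextlib
import io
import json


def show_status(output_json: bool = False) -> None:
    # Hypothetical command handler standing in for a real CLI command.
    result = {"status": "ok", "agents": 2}
    if output_json:
        print(json.dumps(result))
    else:
        print(f"Status: {result['status']} ({result['agents']} agents)")


def capture(func, *args, **kwargs) -> str:
    # Run func and return everything it printed to stdout.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def test_text_output_unchanged_without_flag() -> None:
    assert capture(show_status) == "Status: ok (2 agents)\n"


def test_json_output_is_parseable() -> None:
    payload = json.loads(capture(show_status, output_json=True))
    assert payload == {"status": "ok", "agents": 2}
```

Under pytest, the capsys fixture would be the more usual way to capture output.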

2. Linting

  • The project uses ruff for linting with specific rules (E, F, W). Please run ruff check locally or as part of your CI pipeline to ensure your code adheres to the project's style guidelines.

3. Documentation

  • Since this is a new feature, it would be helpful to update the relevant documentation to inform users about the --json flag. Specifically:
    • Add details about the --json flag to the CLI documentation in the repository.
    • Update any examples in QUICKSTART.md to showcase the new functionality.
    • If applicable, mention the JSON schemas in CONTRIBUTING.md or other relevant documentation.

4. Commit Message

  • Thank you for using the feat: prefix in your commit message! This aligns with the project's use of Conventional Commits. However, consider adding more detail to the body of your commit message to explain the "why" behind your changes. For example:
    feat: Add standardized --json output support across all CLI commands
    
    - Introduced a consistent --json flag for 25+ commands across agentos, iatp, mcp-scan, acp-cli, and agentmesh.
    - Added structured JSON schemas for status checks, metrics, validation results, and security scans.
    - Improved output format detection for cleaner command handlers.
    - Enables seamless integration with CI/CD pipelines and automation tools.
    

5. Security Considerations

  • Since this PR involves changes to CLI commands that may interact with sensitive data (e.g., agent identities, policies, and audit logs), it's important to ensure that no sensitive information is inadvertently exposed in the JSON output. For example:
    • Are there any fields in the JSON output that could contain sensitive data (e.g., private keys, secrets)?
    • If so, consider masking or excluding those fields from the output.
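
One possible way to mask such fields before serialization is sketched below. The key names in SENSITIVE_KEYS are assumptions for illustration; the actual field names used by the toolkit are not shown in this thread.

```python
import json

# Hypothetical set of field names treated as sensitive.
SENSITIVE_KEYS = {"private_key", "secret", "token", "password"}
MASK = "***REDACTED***"


def redact(value):
    # Recursively mask sensitive keys in nested dicts and lists.
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


record = {"did": "did:example:1", "private_key": "abc", "nested": [{"Token": "t"}]}
print(json.dumps(redact(record), indent=2))
```

Masking keeps the shape of the output stable for consumers, whereas excluding fields keeps the payload smaller; either fits the suggestion above.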

Resources 📚

Here are some resources to help you with the suggestions above:


Next Steps 🚀

  1. Address the suggestions above:

    • Move your tests to the appropriate packages/{name}/tests/ directory.
    • Run ruff check locally and fix any linting issues.
    • Update the documentation to reflect the new --json functionality.
    • Review the JSON output for any potential security concerns.
  2. Once you've made the updates, push your changes to this branch. GitHub Actions will automatically re-run the CI pipeline.

  3. After you've addressed the feedback, a maintainer will review your changes again and provide further guidance if needed.


Thank you again for your contribution! If you have any questions or need help with anything, feel free to ask. We're here to support you. 😊

Looking forward to your updates! 🚀

@github-actions

github-actions bot commented Mar 28, 2026

🤖 AI Agent: test-generator — `packages/agent-compliance/src/agent_compliance/cli/main.py`

🧪 Test Coverage Analysis

packages/agent-compliance/src/agent_compliance/cli/main.py

  • Existing coverage: Basic functionality of cmd_verify, cmd_integrity, and cmd_lint_policy commands is likely covered by existing tests. However, the new --json flag and its associated error handling are not explicitly mentioned in the test coverage.
  • Missing coverage:
    • Error handling for --json flag in cmd_verify, cmd_integrity, and cmd_lint_policy.
    • Scenarios where exceptions are raised and the JSON error output is triggered.
  • 💡 Suggested test cases:
    1. test_cmd_verify_json_success — Test the cmd_verify function with the --json flag and verify that the JSON output is correctly formatted and contains the expected fields.
    2. test_cmd_verify_json_error_handling — Simulate an exception in cmd_verify and verify that the JSON error output is correctly formatted and sanitized.
    3. test_cmd_integrity_json_error_handling — Simulate an exception in cmd_integrity and verify that the JSON error output is correctly formatted and sanitized.
    4. test_cmd_lint_policy_json_error_handling — Simulate an exception in cmd_lint_policy and verify that the JSON error output is correctly formatted and sanitized.
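
A sketch of how an error-handling test of this kind might be structured. cmd_verify_json and its loader argument are hypothetical simplifications; the real cmd_verify takes an argparse.Namespace and its internals are not shown here.

```python
import contextlib
import io
import json


def cmd_verify_json(loader) -> int:
    # Simplified stand-in for a command handler with JSON error output.
    try:
        report = loader()
    except OSError:
        # Generic message only: exception details are not echoed.
        print(json.dumps({"status": "error", "message": "Verification failed"}))
        return 1
    print(json.dumps({"status": "success", **report}))
    return 0


def failing_loader() -> dict:
    # Simulates a failure whose message contains an internal path.
    raise OSError("cannot read /home/user/.secrets/manifest.json")


def run_captured(loader):
    # Return (exit_code, stdout_text) for one invocation.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exit_code = cmd_verify_json(loader)
    return exit_code, buffer.getvalue()


def test_cmd_verify_json_error_handling() -> None:
    exit_code, output = run_captured(failing_loader)
    assert exit_code == 1
    assert json.loads(output) == {"status": "error", "message": "Verification failed"}
    assert "/home/user" not in output


def test_cmd_verify_json_success() -> None:
    exit_code, output = run_captured(lambda: {"verified": True})
    assert exit_code == 0
    assert json.loads(output) == {"status": "success", "verified": True}
```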

packages/agent-mesh/src/agentmesh/cli/main.py

  • Existing coverage: Core functionality of commands like init, register, status, policy, and audit is likely covered. However, the --json flag and its associated error handling are new and likely not covered.
  • Missing coverage:
    • JSON output for init, register, status, policy, and audit commands.
    • Error handling for invalid inputs (e.g., malformed agent identifiers in audit).
    • Edge cases for trust scoring (e.g., edge scores like 0.0 and 1.0, revoked trust scenarios).
    • Validation of policy files (e.g., malformed YAML/JSON, missing required fields).
    • Concurrency issues in shared state (e.g., multiple simultaneous register commands).
  • 💡 Suggested test cases:
    1. test_init_json_output — Test the init command with the --json flag and verify the structured JSON output.
    2. test_register_json_error_handling — Simulate various errors (e.g., missing agentmesh.yaml, invalid agent name) in the register command and verify the JSON error output.
    3. test_status_json_output — Test the status command with the --json flag and verify the structured JSON output, including trust scores and dimensions.
    4. test_policy_json_error_handling — Test the policy command with malformed YAML/JSON and verify the JSON error output.
    5. test_audit_json_output — Test the audit command with the --json flag and verify the sanitized JSON output.
    6. test_audit_invalid_agent_identifier — Test the audit command with an invalid agent identifier and verify that the appropriate error is raised and sanitized in JSON output.
    7. test_trust_score_edge_cases — Simulate edge cases for trust scores (e.g., 0.0, 1.0, revoked trust) and verify the output in both table and JSON formats.
    8. test_register_concurrent_execution — Simulate multiple simultaneous register commands to test for race conditions or file access issues.

packages/agent-os/src/agent_os/cli/__init__.py

  • Existing coverage: This file likely serves as an initializer for the CLI module. No functional changes are evident in the diff.
  • Missing coverage: None specific to this file.
  • 💡 Suggested test cases: None required for this file.

packages/agent-os/src/agent_os/cli/mcp_scan.py

  • Existing coverage: Not directly modified in this PR, so no new coverage is required.
  • Missing coverage: None specific to this PR.
  • 💡 Suggested test cases: None required for this file.

packages/agent-os/src/agent_os/policies/cli.py

  • Existing coverage: This file is not directly modified in this PR, so no new coverage is required.
  • Missing coverage: None specific to this PR.
  • 💡 Suggested test cases: None required for this file.

Summary

The primary focus for new test cases should be on the --json flag and its associated error handling in the CLI commands. Additionally, edge cases for trust scoring, policy validation, and input validation (e.g., malformed agent identifiers) should be tested to ensure robustness. The agent-compliance and agent-mesh modules require the most attention for new test cases.

@github-actions

github-actions bot commented Mar 28, 2026

🤖 AI Agent: security-scanner — Security Analysis of the Pull Request

This PR introduces standardized --json output support across CLI commands, which is a significant improvement for structured data handling. However, given the critical nature of this repository, the changes must be scrutinized for potential security vulnerabilities. Below is the analysis based on the specified security concerns:


1. Prompt Injection Defense Bypass

  • Analysis: No direct prompt injection vulnerabilities were identified. However, the --json flag introduces new input/output pathways that could potentially be exploited if not properly sanitized.
  • Rating: 🔵 LOW
  • Recommendation: Ensure all user-provided inputs (e.g., agent identifiers in the audit command) are strictly validated and sanitized. This is partially addressed with regex validation for agent identifiers, but further testing is recommended.
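
For illustration, strict identifier validation could look like the sketch below. The pattern is an assumption made for this example and is not the regex used in the PR.

```python
import re

# Hypothetical pattern: starts with a letter or digit, then letters, digits,
# ":", ".", "_" or "-", with a total length of at most 128 characters.
AGENT_ID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9:._-]{0,127}")


def is_valid_agent_id(value: str) -> bool:
    # fullmatch rejects trailing newlines, which "$" with match() would allow.
    return AGENT_ID_PATTERN.fullmatch(value) is not None


for candidate in ("did:mesh:agent-1", "../etc/passwd", "agent\n"):
    print(repr(candidate), is_valid_agent_id(candidate))
```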

2. Policy Engine Circumvention

  • Analysis: The policy command loads policies from user-provided files. While exceptions are caught and errors are logged, there is no explicit validation of policy integrity (e.g., cryptographic signatures) before loading.
  • Rating: 🟠 HIGH
  • Attack Vector: An attacker could provide a malicious policy file that weakens or circumvents governance rules.
  • Recommendation: Implement cryptographic validation (e.g., signatures or checksums) for policy files to ensure they have not been tampered with. Reject unsigned or invalid files.
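
The checksum variant of this recommendation can be sketched with the standard library as follows. The function names are hypothetical, and the expected digest is assumed to arrive through a trusted channel; a checksum stored next to the policy file would not detect tampering by someone who can edit both.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    # Hash the file in chunks so large policy files are not read at once.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_policy_checksum(path: Path, expected_hex: str) -> bool:
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Demonstration with a temporary file standing in for a policy file.
with tempfile.TemporaryDirectory() as workdir:
    policy_path = Path(workdir) / "policy.yaml"
    policy_path.write_bytes(b"rules: []\n")
    trusted_digest = hashlib.sha256(b"rules: []\n").hexdigest()
    untouched_ok = verify_policy_checksum(policy_path, trusted_digest)
    policy_path.write_bytes(b"rules: [allow_all]\n")
    tampered_ok = verify_policy_checksum(policy_path, trusted_digest)

print(untouched_ok, tampered_ok)
```

Digital signatures, the other option named above, additionally bind the policy to a signer and would need a key-management story that this sketch does not attempt.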

3. Trust Chain Weaknesses

  • Analysis: The register command interacts with the AgentMesh CA for identity registration. However, there is no verification of the CA's authenticity or the integrity of the identity data.
  • Rating: 🔴 CRITICAL
  • Attack Vector: A compromised or spoofed CA could issue invalid identities, undermining the trust model.
  • Recommendation: Implement strict certificate validation for the CA. Use SPIFFE/SVID or similar mechanisms to ensure the authenticity of the CA and its issued identities.

4. Credential Exposure

  • Analysis: The register command saves identity data (e.g., DID, public key) to a file. While this is expected behavior, there is no mention of encryption or secure storage.
  • Rating: 🟠 HIGH
  • Attack Vector: If the .agentmesh/identity.json file is accessed by unauthorized users, it could expose sensitive identity information.
  • Recommendation: Encrypt the identity file at rest and restrict file permissions to the owner only. Provide warnings if the file is stored in an insecure location.
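
The file-permission half of this recommendation might be implemented roughly as below on POSIX systems. write_owner_only and is_owner_only are hypothetical helpers, and encryption at rest is deliberately out of scope for this sketch.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_owner_only(path: Path, data: dict) -> None:
    # Create the file with mode 0o600 (read/write for the owner only).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Tighten permissions even if the file already existed with a wider mode.
        os.fchmod(handle.fileno(), 0o600)
        json.dump(data, handle, indent=2)


def is_owner_only(path: Path) -> bool:
    # True when neither group nor others have any permission bits set.
    return stat.S_IMODE(path.stat().st_mode) & 0o077 == 0


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    identity_path = Path(workdir) / "identity.json"
    write_owner_only(identity_path, {"did": "did:example:123"})
    restricted = is_owner_only(identity_path)
    stored = json.loads(identity_path.read_text(encoding="utf-8"))

print(restricted, stored)
```

os.fchmod is not available on Windows, so a cross-platform CLI would need a different mechanism there.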

5. Sandbox Escape

  • Analysis: No evidence of sandboxing mechanisms was found in the CLI commands. While this PR does not introduce direct sandboxing issues, the lack of isolation for commands like policy and audit could be exploited.
  • Rating: 🟡 MEDIUM
  • Attack Vector: Malicious policy files or audit logs could execute arbitrary code if loaded without proper isolation.
  • Recommendation: Execute untrusted operations (e.g., policy loading) in a restricted environment (e.g., a container or a separate process with limited permissions).

6. Deserialization Attacks

  • Analysis: The policy and audit commands parse JSON and YAML files. While exceptions are caught, there is no indication of safe deserialization practices.
  • Rating: 🔴 CRITICAL
  • Attack Vector: Maliciously crafted JSON or YAML files could exploit deserialization vulnerabilities to execute arbitrary code.
  • Recommendation: Use safe deserialization libraries (e.g., json.loads for JSON and yaml.safe_load for YAML). Avoid using yaml.load without specifying a loader.
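
A sketch of defensive parsing for a JSON policy document. json.loads only builds plain Python objects (dict, list, str, numbers, bool, None), so the remaining work is structural validation. The required field names are assumptions for illustration; for YAML input, yaml.safe_load from PyYAML would take the place of json.loads, as the recommendation says.

```python
import json

# Hypothetical schema: the real policy format is not shown in this thread.
REQUIRED_FIELDS = ("name", "rules")


class PolicyError(ValueError):
    """Raised when a policy document is malformed."""


def parse_policy(text: str) -> dict:
    try:
        policy = json.loads(text)
    except json.JSONDecodeError as exc:
        # Do not echo the raw input back to the caller.
        raise PolicyError("policy is not valid JSON") from exc
    if not isinstance(policy, dict):
        raise PolicyError("policy must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in policy]
    if missing:
        raise PolicyError(f"policy is missing required fields: {', '.join(missing)}")
    return policy


print(parse_policy('{"name": "default", "rules": []}'))
```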

7. Race Conditions

  • Analysis: No explicit race conditions were identified in the PR. However, concurrent access to shared resources (e.g., identity files) could lead to TOCTOU (Time of Check to Time of Use) vulnerabilities.
  • Rating: 🟡 MEDIUM
  • Recommendation: Implement file locking mechanisms when accessing or modifying shared resources.
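
On POSIX systems, the file-locking recommendation could be sketched with fcntl.flock as shown below. update_registry and the registry file are hypothetical; flock is advisory, so it only protects against other processes that take the same lock.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path


def update_registry(path: Path, agent_name: str) -> list:
    # Open for read/write, creating the file if it does not exist yet.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+", encoding="utf-8") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            content = handle.read()
            agents = json.loads(content) if content else []
            if agent_name not in agents:
                agents.append(agent_name)
            # Rewrite the file while still holding the lock.
            handle.seek(0)
            handle.truncate()
            json.dump(agents, handle)
            handle.flush()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    return agents


# Demonstration: the read-modify-write cycle happens under one lock.
with tempfile.TemporaryDirectory() as workdir:
    registry_path = Path(workdir) / "registry.json"
    update_registry(registry_path, "agent-a")
    final_agents = update_registry(registry_path, "agent-b")
    on_disk = json.loads(registry_path.read_text(encoding="utf-8"))

print(final_agents, on_disk)
```

Doing the check and the write under the same lock is what closes the TOCTOU window mentioned in the analysis.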

8. Supply Chain

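  • Analysis: The PR introduces new imports (e.g., json, yaml, re, pathlib). Of these, json, re, and pathlib are standard-library modules; yaml is provided by the third-party PyYAML package, so it is the only one with any exposure to dependency confusion or typosquatting.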
  • Rating: 🔵 LOW
  • Recommendation: Regularly audit third-party dependencies for vulnerabilities and ensure they are sourced from trusted repositories.

Additional Observations

  1. Error Handling:

    • The PR introduces sanitized JSON error messages to prevent information disclosure. This is a good practice, but it is essential to ensure that no sensitive information is inadvertently included in the sanitized output.
    • Consider adding unit tests to verify that error messages do not leak sensitive information.
  2. Logging:

    • The PR does not explicitly address logging practices. Ensure that logs do not contain sensitive information (e.g., credentials, private keys).
  3. Testing:

    • While the PR mentions that tests have been added, it is unclear if these tests cover edge cases, such as malformed JSON/YAML files or invalid inputs. Comprehensive test coverage is critical for a security-focused toolkit.

Summary of Findings

| Finding | Rating | Recommendation |
|---|---|---|
| Prompt injection defense bypass | 🔵 LOW | Ensure all user inputs are validated and sanitized. |
| Policy engine circumvention | 🟠 HIGH | Add cryptographic validation for policy files. |
| Trust chain weaknesses | 🔴 CRITICAL | Implement strict CA certificate validation using SPIFFE/SVID or similar. |
| Credential exposure | 🟠 HIGH | Encrypt identity files and restrict file permissions. |
| Sandbox escape | 🟡 MEDIUM | Execute untrusted operations in a restricted environment. |
| Deserialization attacks | 🔴 CRITICAL | Use safe deserialization methods for JSON/YAML. |
| Race conditions | 🟡 MEDIUM | Implement file locking for shared resources. |
| Supply chain | 🔵 LOW | Regularly audit dependencies for vulnerabilities. |

Final Recommendation

This PR introduces useful functionality but has several critical and high-severity security concerns that must be addressed before merging. Specifically:

  1. Implement cryptographic validation for policy files.
  2. Enforce strict CA certificate validation for identity registration.
  3. Encrypt sensitive files and restrict their access.
  4. Use safe deserialization methods for JSON and YAML.
  5. Add unit tests to validate error sanitization and input validation.

After addressing these issues, the PR can be re-evaluated for merging.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This PR introduces a standardized --json output flag across all CLI commands in the repository. This is a valuable addition for enabling machine-readable outputs, which is especially useful for integration with CI/CD pipelines and automation tools. The implementation appears to be well-structured, with JSON output logic added to multiple commands across different packages. However, there are some areas of concern, particularly around error handling, type safety, and potential security implications.


🔴 CRITICAL

  1. Insufficient Error Handling in JSON Output:

    • In several places, exceptions are caught and directly converted into JSON output without sanitization or proper classification. For example:
      except Exception as e:
          if output_json:
              print(json.dumps({"status": "error", "message": str(e)}, indent=2))
          else:
              console.print(f"[red]Error: {e}[/red]")
      • Risk: This could expose sensitive internal details (e.g., stack traces, file paths) to the user or external systems consuming the JSON output.
      • Action: Ensure that error messages are sanitized and do not leak sensitive information. Use specific exception handling where possible, and provide generic error messages for unexpected exceptions.
  2. Potential Information Disclosure in JSON Output:

    • The JSON output for the register command includes the public_key field:
      print(json.dumps({
          "status": "success",
          "agent_name": agent_name,
          "did": identity.did,
          "public_key": identity.public_key,
          "identity_file": str(identity_file)
      }, indent=2))
      • Risk: Exposing the public key in the output could be a security risk if the JSON output is logged or shared inadvertently.
      • Action: Evaluate whether exposing the public key in the output is necessary. If not, remove it. If it is necessary, ensure that the output is only accessible to authorized users.
  3. Sandbox Escape Risk in audit Command:

    • The audit command allows users to specify --json output, but there is no validation or sanitization of the audit log entries before they are serialized into JSON.
      • Risk: If the audit logs contain untrusted or malicious data, this could lead to injection attacks or other vulnerabilities when the JSON output is consumed.
      • Action: Ensure that all data included in the JSON output is sanitized and validated to prevent potential injection attacks.
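
One way to constrain what reaches the JSON output is an explicit allowlist, sketched below. The key names are assumptions for illustration. json.dumps already escapes quotes and control characters, so the allowlist mainly guards against unexpected fields.

```python
import json

# Hypothetical allowlist of keys that may appear in the JSON output.
ALLOWED_AUDIT_KEYS = ("timestamp", "agent", "action", "status")


def sanitize_entry(entry: dict) -> dict:
    # Keep allowlisted keys only and coerce values to plain strings.
    return {key: str(entry[key]) for key in ALLOWED_AUDIT_KEYS if key in entry}


def audit_to_json(entries: list) -> str:
    return json.dumps([sanitize_entry(entry) for entry in entries], indent=2)


raw_entries = [
    {
        "timestamp": "2026-03-28T10:00:00Z",
        "agent": "did:example:1",
        "action": 'say "hi"\n',
        "private_key": "abc",  # not allowlisted, so it is dropped
    }
]
print(audit_to_json(raw_entries))
```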

🟡 WARNING

  1. Backward Compatibility:

    • Adding a --json flag to existing commands is a non-breaking change, but the behavior of commands may differ when the flag is used. For example, the register command now outputs JSON instead of rich console output when --json is specified.
      • Risk: Users relying on the previous behavior (e.g., parsing the console output) may experience issues.
      • Action: Clearly document the new behavior in the CLI documentation and release notes. Consider adding a deprecation warning for users relying on the old behavior.
  2. Public API Changes:

    • The addition of the --json flag to multiple commands changes the signature of these commands. While this is not a breaking change, it may impact users who have scripts or tools that invoke these commands programmatically.
      • Action: Highlight this change in the release notes and provide guidance on how to update existing scripts.

💡 SUGGESTIONS

  1. Centralize JSON Output Logic:

    • The JSON output logic is repeated across multiple commands, leading to potential inconsistencies and increased maintenance overhead.
      • Improvement: Create a utility function (e.g., output_json(data: dict, output_json: bool)) to handle JSON serialization and printing. This would reduce duplication and ensure consistent formatting.
  2. Type Safety and Validation:

    • The JSON output relies on implicit assumptions about the structure of data (e.g., manifest['agent']['name']).
      • Improvement: Use Pydantic models to validate and serialize data before outputting it as JSON. This would ensure type safety and prevent runtime errors due to missing or malformed data.
  3. Test Coverage:

    • While the PR mentions that tests have been added, it is unclear whether all edge cases (e.g., missing files, invalid inputs, exceptions) are covered.
      • Improvement: Add comprehensive tests for the --json flag, including scenarios where commands fail or encounter unexpected conditions.
  4. CLI Documentation:

    • The PR does not include updates to the CLI documentation to reflect the new --json flag.
      • Improvement: Update the CLI documentation to include examples of using the --json flag and explain the structure of the JSON output for each command.
  5. Logging for Debugging:

    • The --json flag suppresses console output, which may make debugging more difficult.
      • Improvement: Add optional logging for debugging purposes, which can be enabled via an environment variable or a --verbose flag.
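
The debugging suggestion can coexist with --json if diagnostics go to stderr while the JSON payload stays alone on stdout. A minimal sketch, in which the logger name and the verbose switch are assumptions:

```python
import contextlib
import io
import json
import logging
import sys


def configure_logging(verbose: bool) -> logging.Logger:
    logger = logging.getLogger("example-cli")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stderr)  # diagnostics go to stderr
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    logger.propagate = False
    return logger


def run(verbose: bool) -> None:
    logger = configure_logging(verbose)
    logger.debug("loading agent manifest")
    # stdout carries only the JSON document, so consumers can parse it directly.
    print(json.dumps({"status": "ok"}))


# Demonstration: capture both streams separately.
stdout_buffer, stderr_buffer = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(stdout_buffer), contextlib.redirect_stderr(stderr_buffer):
    run(verbose=True)
```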

Summary of Actions

  • 🔴 Sanitize error messages and avoid exposing sensitive information in JSON output.
  • 🔴 Validate and sanitize all data included in JSON output to prevent injection attacks.
  • 🟡 Document the changes to CLI behavior and public API in the release notes.
  • 💡 Centralize JSON output logic into a utility function.
  • 💡 Use Pydantic models for type safety and validation of JSON output.
  • 💡 Add comprehensive tests for the --json flag, including edge cases.
  • 💡 Update CLI documentation to reflect the new --json flag.
  • 💡 Consider adding optional logging for debugging purposes.

This PR is a significant improvement for the CLI's usability and integration capabilities, but addressing the critical issues and implementing the suggested improvements will ensure a more robust and secure implementation.

@Deadpool2000
Contributor Author

@microsoft-github-policy-service agree

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Pull Request Review for feat: Add standardized --json output support across all CLI commands


🔴 CRITICAL

  1. Potential Information Disclosure in JSON Output:

    • Issue: The JSON output for the audit command includes sensitive fields like agent and action. While you attempted to sanitize the output by allowing only specific keys, there is no guarantee that sensitive data won't inadvertently be included in the future.
    • Recommendation: Implement a stricter validation mechanism for the JSON output. Consider using a schema validation library like Pydantic to define and enforce the structure of the JSON output. This will ensure that only explicitly allowed fields are included in the output.
  2. Improper Error Handling in register Command:

    • Issue: The register command catches a generic Exception and outputs the error message directly in the JSON response. This could lead to information disclosure if sensitive details (e.g., stack traces or internal paths) are included in the exception message.
    • Recommendation: Avoid catching generic exceptions. Instead, catch specific exceptions (e.g., FileNotFoundError, KeyError) and provide sanitized error messages for JSON output.
  3. Improper Handling of Missing Dependencies:

    • Issue: In the register command, the ImportError for agentmesh.identity is caught, but the error message is printed directly in the JSON output. This could leak internal implementation details.
    • Recommendation: Provide a generic error message for missing dependencies in the JSON output, such as "Required dependency is missing. Please install the necessary packages.".
  4. Unvalidated Input in audit Command:

    • Issue: The audit command does not validate the agent argument before using it to filter entries. This could lead to potential injection attacks or unexpected behavior.
    • Recommendation: Validate the agent argument to ensure it conforms to the expected format (e.g., a valid DID or identifier). Use a library like Pydantic for input validation.
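
Items 2 and 3 above could be addressed along these lines: catch only the expected exception types and map each to a fixed message. This is a sketch; SAFE_MESSAGES, error_payload, and run_safely are hypothetical names, and the ImportError text is the wording suggested in item 3.

```python
import json

# Hypothetical mapping from exception type to a message safe to show users.
SAFE_MESSAGES = {
    FileNotFoundError: "Required file was not found",
    KeyError: "Configuration is missing a required field",
    ImportError: "Required dependency is missing. Please install the necessary packages.",
}


def error_payload(exc: Exception) -> dict:
    # Look up the message by type; the exception text itself is never used.
    for exc_type, message in SAFE_MESSAGES.items():
        if isinstance(exc, exc_type):
            return {"status": "error", "type": exc_type.__name__, "message": message}
    return {"status": "error", "type": "InternalError", "message": "An unexpected error occurred"}


def run_safely(action) -> dict:
    try:
        return {"status": "success", "result": action()}
    except (FileNotFoundError, KeyError, ImportError) as exc:
        # Only expected failures are converted; anything else propagates.
        return error_payload(exc)


def missing_identity():
    raise FileNotFoundError("/home/user/.agentmesh/identity.json")


print(json.dumps(run_safely(missing_identity), indent=2))
```

Letting unexpected exceptions propagate keeps real bugs visible during development; a top-level handler could still convert them to the generic InternalError payload before exit.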

🟡 WARNING

  1. Backward Compatibility:

    • Issue: Adding the --json flag to existing commands changes their behavior. Scripts or tools that rely on the current CLI output format may break if they do not account for the new flag.
    • Recommendation: Clearly document the new --json flag in the release notes and provide examples of its usage. Consider adding a deprecation warning for a release cycle before making the change mandatory.
  2. Potential Impact on Third-Party Integrations:

    • Issue: The introduction of structured JSON output may impact third-party tools or scripts that parse the CLI output in its current format.
    • Recommendation: Include a migration guide in the documentation to help users adapt their tools to the new JSON output format.
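
One way to keep existing scripts working is to make JSON strictly opt-in, so that the text output is byte-for-byte unchanged when the flag is absent. The sketch below uses the standard library's argparse purely for illustration (the toolkit's commands may use a different CLI framework); `build_parser` and `render_status` are hypothetical names.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-cli")
    # Opt-in flag: when omitted, output_json is False and text output is unchanged.
    parser.add_argument("--json", dest="output_json", action="store_true",
                        help="Output in JSON format")
    return parser


def render_status(status: dict, output_json: bool) -> str:
    if output_json:
        return json.dumps(status, indent=2)
    # The existing human-readable format stays the default.
    return "\n".join(f"{key}: {value}" for key, value in status.items())
```

A script that never passes `--json` keeps receiving the `key: value` lines it parsed before.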

💡 SUGGESTIONS

  1. Centralize JSON Output Logic:

    • Observation: The JSON output logic is duplicated across multiple commands. This increases the risk of inconsistencies and makes future maintenance harder.
    • Recommendation: Create a utility function or class to handle JSON output formatting and error handling. This will ensure consistency and reduce code duplication.
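    import json

    def output_result(data: dict, output_json: bool) -> None:
        """Print data as JSON when requested, otherwise as key: value lines."""
        if output_json:
            print(json.dumps(data, indent=2))
        else:
            # Placeholder fallback for non-JSON output; replace with the command's own formatting
            for key, value in data.items():
                print(f"{key}: {value}")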
  2. Add Tests for JSON Output:

    • Observation: While the PR mentions that tests have been added, there is no evidence in the diff of specific tests for the new --json functionality.
    • Recommendation: Add unit tests to verify the correctness of the JSON output for all commands. Use a library like pytest and mock the print or click.echo functions to capture and validate the output.
  3. Use Pydantic for JSON Schema Validation:

    • Observation: The JSON output is manually constructed in multiple places, which increases the risk of inconsistencies and errors.
    • Recommendation: Use Pydantic models to define and validate the structure of the JSON output. This will ensure type safety and make the code more maintainable.

    Example:

    from typing import List

    from pydantic import BaseModel

    class PolicyOutput(BaseModel):
        status: str
        policies: List[dict]

    def output_policies(policies, output_json: bool):
        data = PolicyOutput(status="success", policies=policies)
        if output_json:
            # Pydantic v2 API; on Pydantic v1 use data.json(indent=2)
            print(data.model_dump_json(indent=2))
        else:
            # Handle non-JSON output
            pass
  4. Improve Logging for Debugging:

    • Observation: The PR introduces new functionality but does not include sufficient logging for debugging purposes.
    • Recommendation: Add debug-level logging to capture key events, such as when JSON output is generated or when errors occur. This will help with troubleshooting and monitoring.
  5. Update Documentation:

    • Observation: The PR does not include updates to the documentation for the new --json flag.
    • Recommendation: Update the CLI documentation to include details about the --json flag for all commands. Provide examples of both JSON and non-JSON outputs.
  6. Consider Using click's Context Object:

    • Observation: The --json flag is added to multiple commands, leading to repetitive code.
    • Recommendation: Use click's Context object to define the --json flag globally for all commands. This will reduce redundancy and ensure consistent behavior.

    Example:

    import click

    @click.group()
    @click.option("--json", "output_json", is_flag=True, help="Output in JSON format")
    @click.pass_context
    def app(ctx, output_json):
        # A group-level option must precede the subcommand: app --json status
        ctx.ensure_object(dict)
        ctx.obj["output_json"] = output_json
  7. Sanitize JSON Output:

    • Observation: While some effort has been made to sanitize JSON output (e.g., in the audit command), this is not consistent across all commands.
    • Recommendation: Implement a centralized sanitization function to ensure that sensitive or unnecessary data is not included in the JSON output.
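
A centralized sanitization step, as suggested in item 7, can be as small as an allow-list applied before serialization. The following is a sketch under assumptions: the key names in `ALLOWED_AUDIT_KEYS` are hypothetical, and `sanitize_entry` and `audit_to_json` are illustrative helpers, not functions from the PR.

```python
import json

# Hypothetical allow-list; the real field names depend on the audit entry format.
ALLOWED_AUDIT_KEYS = ("timestamp", "agent", "action", "status")


def sanitize_entry(entry: dict, allowed=ALLOWED_AUDIT_KEYS) -> dict:
    """Keep only allow-listed keys, coercing values to plain strings."""
    return {key: str(entry[key]) for key in allowed if key in entry}


def audit_to_json(entries: list) -> str:
    """Serialize audit entries after dropping every key not on the allow-list."""
    return json.dumps([sanitize_entry(entry) for entry in entries], indent=2)
```

An allow-list fails closed: a field added to the entries later stays out of the output until someone explicitly permits it.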

Summary

This PR introduces a valuable feature by standardizing JSON output across CLI commands, which will improve integration with automation tools. However, there are critical concerns around security, backward compatibility, and code maintainability that need to be addressed before merging. By implementing the recommendations above, you can ensure the robustness, security, and usability of the new feature.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Pull Request Review: feat: Add standardized --json output support across all CLI commands


🔴 CRITICAL Issues

  1. Improper Error Handling in register Command:

    • The register command catches general exceptions (except Exception) but does not log or re-raise them. This could lead to silent failures and make debugging difficult.
    • Action: Log the exception details (e.g., logger.exception) and re-raise or handle it appropriately to ensure visibility into unexpected errors.
  2. Insufficient Input Validation in audit Command:

    • The audit command does not sanitize the agent parameter before filtering entries. Although a regex is used to validate the format, there is no guarantee that the regex will cover all edge cases.
    • Action: Use stricter validation or sanitize the input further to prevent potential injection attacks.
  3. Missing Validation for JSON Output:

    • The JSON output in multiple commands (e.g., status, policy, audit) does not validate the structure of the data being serialized. This could lead to unexpected behavior or security vulnerabilities if the data contains unexpected or malicious content.
    • Action: Use a schema validation library (e.g., Pydantic) to validate the structure and types of the data before serializing it to JSON.
  4. Insecure Handling of Sensitive Data:

    • The register command writes sensitive identity data (e.g., identity.did, identity_file) to a file without encryption or access control.
    • Action: Encrypt sensitive data before writing it to disk and ensure that file permissions are restricted to authorized users only.
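
For the file-permission half of item 4, a sketch of writing the identity file so that only the owning user can read it is shown below. It covers access control only, not encryption, and `write_identity` is a hypothetical helper rather than the function in the PR.

```python
import json
import os


def write_identity(path: str, identity: dict) -> None:
    """Write identity data to a file readable and writable by the owner only."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # Mode 0o600 applies when the file is created and is further limited by the umask.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(identity, fh, indent=2)
    # Tighten permissions on a pre-existing file too (limited effect on Windows).
    os.chmod(path, 0o600)
```

Passing the mode to `os.open` means the file is never briefly world-readable, which can happen when a file is created with default permissions and restricted afterwards.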

🟡 WARNING Issues

  1. Breaking Changes in CLI Commands:

    • Adding the --json flag to existing commands may break backward compatibility for scripts or tools that rely on the previous output format.
    • Action: Clearly document the changes in the release notes and provide migration guidance for users.
  2. Changes to init_integration Command:

    • The _init_claude_integration function has been removed, which could break workflows relying on this functionality.
    • Action: Confirm whether this is an intentional breaking change and document it in the release notes.

💡 Suggestions for Improvement

  1. Centralized JSON Output Handling:

    • The JSON output logic is repeated across multiple commands. This can lead to inconsistencies and maintenance challenges.
    • Suggestion: Create a utility function (e.g., output_json(data: dict, error: bool = False)) to standardize JSON output and error handling across commands.
  2. Schema Validation for JSON Output:

    • Use Pydantic models to define and validate the structure of JSON output for each command. This ensures type safety and prevents accidental inclusion of invalid or sensitive data.
    • Example:
      from typing import Optional

      from pydantic import BaseModel

      class StatusOutput(BaseModel):
          agent_name: Optional[str]
          did: Optional[str]
          trust_score: int
          max_score: int
          dimensions: dict
  3. Thread Safety in Concurrent Execution:

    • The register command uses the AgentIdentity.create method, but there is no indication of thread safety in its implementation. If this method is called concurrently, it could lead to race conditions.
    • Suggestion: Review the implementation of AgentIdentity.create and ensure it is thread-safe.
  4. Improved Logging:

    • Add logging for key events (e.g., successful command execution, errors) to improve observability and debugging.
    • Suggestion: Use logger.info, logger.warning, and logger.error consistently across all commands.
  5. Enhanced CLI Help Messages:

    • Some CLI commands (e.g., policy, audit) lack detailed help messages for the new --json flag.
    • Suggestion: Update the help messages to explain the purpose and usage of the --json flag.
  6. Test Coverage:

    • Ensure that the new --json functionality is thoroughly tested for all commands, including edge cases (e.g., invalid input, missing files).
    • Suggestion: Add unit tests and integration tests to verify the correctness of JSON output and error handling.
  7. Documentation Update:

    • The PR checklist indicates that documentation has not been updated. This is critical for users to understand the new --json functionality.
    • Suggestion: Update the CLI documentation to include examples of using the --json flag for each command.
  8. Use of try-except Blocks:

    • Some try-except blocks (e.g., in the policy command) are too broad and may catch unintended exceptions.
    • Suggestion: Narrow the scope of try-except blocks to handle only specific exceptions.
  9. Overlapping --format and --json Flags:

    • The audit command uses a --format flag with a default value of table. This could lead to confusion when combined with the new --json flag.
    • Suggestion: Deprecate the --format flag and use the --json flag consistently across all commands.
  10. Error Messages for Missing Dependencies:

    • The register command raises an error if the agentmesh.identity module is not installed, but the error message does not provide clear installation instructions.
    • Suggestion: Include a command (e.g., pip install agentmesh-platform) in the error message to guide users.
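
The test-coverage suggestion in item 6 can be met without extra plugins by capturing stdout and parsing it. The sketch below uses only the standard library; `status_command` is a hypothetical stand-in for a real command and its values are made up for the example.

```python
import contextlib
import io
import json


def status_command(output_json: bool) -> None:
    """Hypothetical command used only to illustrate the test pattern."""
    data = {"agent_name": "demo", "trust_score": 820, "max_score": 1000}
    if output_json:
        print(json.dumps(data, indent=2))
    else:
        print(f"{data['agent_name']}: {data['trust_score']}/{data['max_score']}")


def run_and_capture(func, *args) -> str:
    """Run func and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()


def test_status_json_is_parseable():
    parsed = json.loads(run_and_capture(status_command, True))
    assert parsed["trust_score"] <= parsed["max_score"]


def test_status_text_is_unchanged():
    assert run_and_capture(status_command, False) == "demo: 820/1000\n"
```

Pairing a JSON test with a text-mode test for the same command also guards the backward-compatibility concern raised in the warnings.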

Summary

This PR introduces a valuable feature by standardizing JSON output across CLI commands. However, there are critical security issues related to error handling, input validation, and sensitive data handling that must be addressed before merging. Additionally, there are potential breaking changes that should be documented, and several opportunities for improvement in code structure, logging, and testing.

Recommended Actions Before Merge:

  1. Address the critical issues related to error handling, input validation, and sensitive data handling.
  2. Confirm and document any breaking changes.
  3. Add tests for the new functionality.
  4. Update the documentation to reflect the changes.

Let me know if you need further clarification or assistance!

Member

@imran-siddique imran-siddique left a comment

Thanks @Deadpool2000 — standardized JSON CLI output is valuable. A few concerns before merge:

  1. -1971 deletions — this removes a lot of code. Please confirm no existing CLI behavior is broken (existing non-JSON output should still work as before).

  2. CLA — checklist shows CLA not signed. Please sign the Microsoft CLA.

  3. Documentation — checklist shows docs not updated. The CLI tutorials should be updated if command output formats changed.

Please address these and I'll do a full diff review.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Code Review for PR: feat: Add standardized --json output support across all CLI commands


🔴 CRITICAL

  1. Improper Exception Handling in CLI Commands:

    • In multiple places, the code uses a generic except Exception block to catch all exceptions. This is a bad practice as it can mask critical issues, including security vulnerabilities.
      • Example:
        except Exception as e:
            if args.json:
                import json
                print(json.dumps({"status": "fail", "error": "Governance verification failed", "type": "InternalError"}, indent=2))
            else:
                print(f"Error: {e}", file=sys.stderr)
            return 1
      • Recommendation: Replace except Exception with specific exception types. For example:
        except (ValueError, KeyError) as e:
            # Handle specific errors
        Use a generic except Exception block only for logging unexpected errors and re-raise the exception after logging.
  2. Function-Local Imports in Error Handling:

    • In several places, the code imports modules (e.g., import json) inside exception-handling blocks rather than at module level. If such an import fails, the resulting ImportError is raised in place of the original error and obscures the real failure.
      • Example:
        if args.json:
            import json
            print(json.dumps({"status": "fail", "error": "Governance verification failed", "type": "InternalError"}, indent=2))
      • Recommendation: Move all imports to the top of the file to ensure they are loaded during initialization and to avoid runtime surprises.
  3. Potential Information Disclosure in Error Messages:

    • The error messages in JSON output may expose sensitive information, such as stack traces or internal error details, which could be exploited by attackers.
      • Example:
        except Exception as e:
            print(json.dumps({"status": "error", "message": str(e), "type": e.__class__.__name__}, indent=2))
      • Recommendation: Avoid exposing internal error details in JSON output. Use generic error messages for external-facing outputs and log detailed errors for internal debugging.
  4. Unvalidated User Input in audit Command:

    • The audit command accepts an agent parameter, but the validation regex used is overly permissive and could allow injection attacks.
      • Example:
        if not re.match(r"^agent-[a-zA-Z0-9_-]+$|^did:agentmesh:[a-zA-Z0-9._-]+$", agent):
      • Recommendation: Use stricter validation rules for agent identifiers. For example:
        if not re.match(r"^agent-[a-zA-Z0-9]+$|^did:agentmesh:[a-zA-Z0-9]+$", agent):
  5. Missing Input Validation for JSON Output:

    • The JSON output generation does not validate the structure or content of the data being serialized. This could lead to invalid or malformed JSON being produced.
      • Example:
        print(json.dumps({
            "agent_name": manifest['agent']['name'] if manifest else None,
            "did": identity['did'] if identity else None,
            "trust_score": 820,
            "max_score": 1000,
            "dimensions": {
                "policy_compliance": 85,
                "resource_efficiency": 72,
                "output_quality": 91,
                "security_posture": 88,
                "collaboration_health": 79
            }
        }, indent=2))
      • Recommendation: Use Pydantic models to validate and serialize JSON output. This ensures type safety and prevents invalid data from being output.
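
Where adding Pydantic is not an option, a similar guarantee for the status payload in item 5 can be approximated with a standard-library dataclass. This is a swapped-in technique, not what the review recommends verbatim, and the validation rules are assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class StatusOutput:
    agent_name: Optional[str]
    did: Optional[str]
    trust_score: int
    max_score: int = 1000
    dimensions: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject values that would produce misleading output.
        if not 0 <= self.trust_score <= self.max_score:
            raise ValueError("trust_score must be between 0 and max_score")
        for name, score in self.dimensions.items():
            if not isinstance(score, int):
                raise TypeError(f"dimension {name!r} must be an int")

    def to_json(self) -> str:
        """Serialize only the declared fields."""
        return json.dumps(asdict(self), indent=2)
```

Since only declared fields are serialized, an unexpected key cannot leak into the output, and an out-of-range score fails at construction time rather than in a consumer.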

🟡 WARNING

  1. Backward Compatibility Risk:

    • Adding a --json flag to existing commands changes the CLI interface. While this is not a breaking change for users who do not use the --json flag, it may affect users who rely on the exact format of the existing output.
      • Recommendation: Clearly document the new --json flag in the release notes and update any relevant documentation. Consider adding a deprecation warning for any old behavior that may be removed in the future.
  2. Silent Failure in try/except Blocks:

    • In some cases, the code silently fails without providing feedback to the user.
      • Example:
        except:
            pass
      • Recommendation: Avoid silent failures. Always log or provide feedback to the user, even if it's just a generic error message.
  3. Potential Breaking Changes in CLI Commands:

    • The addition of the --json flag and changes to the output format may break existing scripts or integrations that rely on the previous output format.
      • Recommendation: Ensure backward compatibility by maintaining the existing output format as the default and requiring users to explicitly opt-in to the new --json format.
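
As an alternative to the bare `except: pass` pattern flagged in item 2, the sketch below keeps the fallback behavior but records why it happened. `load_optional_config` and the logger name are hypothetical; the point is that each handled exception type gets its own visible log entry.

```python
import json
import logging

logger = logging.getLogger("example_cli")


def load_optional_config(path: str) -> dict:
    """Return parsed config, or {} when the file is missing or malformed."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        # Expected case: record it at a low level instead of passing silently.
        logger.info("config file %s not found; using defaults", path)
    except json.JSONDecodeError as exc:
        # Unexpected content: warn so the fallback is visible to the user.
        logger.warning("config file %s is not valid JSON: %s", path, exc)
    return {}
```

Exceptions outside the two listed types still propagate, so genuinely unexpected failures are not hidden.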

💡 SUGGESTIONS

  1. Centralize JSON Output Handling:

    • The JSON output logic is repeated across multiple commands, leading to code duplication and potential inconsistencies.
      • Recommendation: Create a utility function or class to handle JSON output. For example:
        import json

        def output_json(data: dict, error: bool = False) -> None:
            # Build a new dict so the caller's data is not mutated
            payload = {**data, "status": "error" if error else "success"}
            print(json.dumps(payload, indent=2))
        This function can then be reused across all commands.
  2. Improve Error Messages:

    • Some error messages are generic and do not provide enough context to the user.
      • Example:
        print(json.dumps({"status": "error", "message": "Policy linting failed", "type": "InternalError"}, indent=2))
      • Recommendation: Include more specific details in error messages, such as the exact cause of the failure and potential steps to resolve it.
  3. Test Coverage for JSON Output:

    • Ensure that the new --json flag is thoroughly tested for all commands, including edge cases such as invalid input, missing files, and unexpected errors.
      • Recommendation: Add unit tests for each command to verify the correctness of the JSON output. Use tools like pytest and pytest-mock to mock inputs and validate outputs.
  4. Use Logging Instead of Printing Errors:

    • The code uses print statements to display errors, which is not ideal for production-grade applications.
      • Example:
        print(f"Error: {e}", file=sys.stderr)
      • Recommendation: Use the logging module to log errors. This provides more flexibility and allows for better integration with monitoring tools.
  5. Use Constants for JSON Keys:

    • JSON keys are hardcoded throughout the code, which can lead to inconsistencies and typos.
      • Example:
        print(json.dumps({"status": "error", "message": "Policy linting failed", "type": "InternalError"}, indent=2))
      • Recommendation: Define constants for JSON keys and reuse them across the codebase. For example:
        STATUS_KEY = "status"
        MESSAGE_KEY = "message"
        TYPE_KEY = "type"
  6. Document JSON Schema:

    • The JSON output format is not documented, which may lead to confusion for users and developers.
      • Recommendation: Provide a detailed JSON schema for each command's output in the documentation. This will help users understand the structure and content of the JSON output.

Summary

This PR introduces a valuable feature by standardizing JSON output across CLI commands. However, there are critical security issues, potential backward compatibility risks, and opportunities for improvement in code quality and maintainability. Addressing the highlighted issues and suggestions will ensure a more robust and secure implementation.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Mar 28, 2026

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This PR introduces a standardized --json output flag across all CLI commands in the repository, ensuring consistent, machine-readable JSON output. It also includes improvements in error handling, input validation, and output sanitization to enhance security and usability. While the changes are well-intentioned and address several important aspects, there are some areas that need attention to ensure correctness, security, and maintainability.


🔴 CRITICAL

  1. Inconsistent Error Handling in JSON Output

    • In several places, the error messages in the JSON output are vague and do not provide sufficient context for debugging. For example:
      print(json.dumps({"status": "fail", "error": "Governance verification failed", "type": "InternalError"}, indent=2))
      • This message does not provide any details about the actual exception or error that occurred, making it difficult for users to debug.
      • Recommendation: Include the exception message in the JSON output, but ensure sensitive information is sanitized. For example:
        print(json.dumps({"status": "fail", "error": str(e), "type": "InternalError"}, indent=2))
  2. Potential Information Disclosure in Error Messages

    • While the PR aims to sanitize error messages, there are still instances where raw exception messages are printed to the console or included in JSON output. For example:
      console.print(f"[red]Error: {e}[/red]")
      • This could lead to sensitive information being exposed in logs.
      • Recommendation: Use a centralized error-handling mechanism that sanitizes all error messages before displaying or logging them. Avoid exposing sensitive details such as file paths, stack traces, or internal system information.
  3. Regex Validation for Agent Identifiers

    • The regex used for validating agent identifiers in the audit command is overly permissive:
      if not re.match(r"^agent-[a-zA-Z0-9_-]+$|^did:agentmesh:[a-zA-Z0-9._-]+$", agent):
      • This allows potentially unsafe characters like . and _ in the did:agentmesh format.
      • Recommendation: Use a stricter regex pattern that adheres to the expected format of agent identifiers. For example:
        r"^agent-[a-zA-Z0-9_-]+$|^did:agentmesh:[a-z0-9]+(?:-[a-z0-9]+)*$"
  4. Missing Cryptographic Error Handling

    • In the register command, the AgentIdentity.create method is called, but there is no specific error handling for cryptographic operations (e.g., key generation or signing). If these operations fail, the error is caught by the generic Exception block, which may not provide sufficient context.
    • Recommendation: Add specific error handling for cryptographic operations to ensure that failures are logged and reported appropriately.
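
On the identifier check in item 3: independent of which characters are allowed, `re.match` with a trailing `$` also accepts a string that ends in a newline, so anchoring with `fullmatch` and bounding the length is a cheap tightening. The sketch below reuses the character classes from the pattern quoted in the review; the 64-character cap and the helper name `is_valid_agent_id` are assumptions for illustration.

```python
import re

# Character classes taken from the pattern quoted in the review;
# the 64-character cap is an illustrative choice, not a documented limit.
AGENT_ID = re.compile(r"agent-[a-zA-Z0-9_-]{1,64}|did:agentmesh:[a-zA-Z0-9._-]{1,64}")


def is_valid_agent_id(value: str) -> bool:
    """Return True only if the whole string matches the identifier grammar."""
    return AGENT_ID.fullmatch(value) is not None
```

With `fullmatch`, no `^` or `$` anchors are needed and a trailing newline is rejected along with any other extra character.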

🟡 WARNING

  1. Potential Breaking Changes

    • The addition of the --json flag to existing CLI commands may break scripts or integrations that rely on the current output format. While this is not a breaking change to the API, it could still impact users who rely on the CLI.
    • Recommendation: Clearly document this change in the release notes and provide examples of how to use the new --json flag. Consider adding a deprecation warning for users who do not specify an output format, indicating that the default behavior may change in the future.
  2. Backward Compatibility with Existing JSON Output

    • If any CLI commands previously supported JSON output in a different format, this change may break compatibility with existing integrations.
    • Recommendation: Ensure that the new JSON output format is backward-compatible with any previously supported formats. If not, document the changes clearly in the release notes.

💡 SUGGESTIONS

  1. Centralized JSON Output Handling

    • The current implementation repeats the logic for generating JSON output in multiple places, leading to code duplication and potential inconsistencies.
    • Recommendation: Create a utility function (e.g., generate_json_response) that standardizes the structure and formatting of JSON output across all commands. This function can also handle error sanitization.
  2. Unit Tests for JSON Output

    • While the PR mentions that tests have been added, it is not clear if the JSON output has been thoroughly tested for all commands.
    • Recommendation: Add unit tests to validate the JSON output for all commands, including edge cases (e.g., missing or invalid input, exceptions).
  3. Use Pydantic for JSON Schema Validation

    • The PR mentions structured JSON schemas, but it is unclear if these schemas are being validated. Without validation, there is a risk of inconsistencies or errors in the output.
    • Recommendation: Use Pydantic models to define and validate the JSON schemas for each command's output. This will ensure type safety and consistency.
  4. Thread Safety

    • The PR does not address thread safety, which is a critical aspect of concurrent agent execution. For example, the register command writes to a file (identity.json) without any locking mechanism, which could lead to race conditions in concurrent scenarios.
    • Recommendation: Use file locks or other synchronization mechanisms to ensure thread safety when performing file I/O operations.
  5. Documentation Updates

    • While the PR includes some updates to the documentation, it is unclear if all new features and changes have been documented.
    • Recommendation: Ensure that all new commands, options, and JSON schemas are thoroughly documented in the relevant sections of the documentation. Include examples of both successful and error JSON outputs.
  6. Error Logging

    • The PR introduces several print statements for error handling, but these messages are not logged. This makes it difficult to debug issues in production environments.
    • Recommendation: Use a logging framework to log error messages, including stack traces for unexpected exceptions. This will help with debugging and monitoring.
  7. Code Style and Readability

    • There are some minor code style issues, such as inconsistent use of single and double quotes, and redundant imports (e.g., import json is repeated in multiple places).
    • Recommendation: Run ruff and black to ensure consistent code formatting and remove unused imports.

Summary of Actionable Items

  1. 🔴 Improve error handling in JSON output to include sanitized exception messages.
  2. 🔴 Use stricter regex for validating agent identifiers.
  3. 🔴 Add specific error handling for cryptographic operations in the register command.
  4. 🟡 Document potential breaking changes due to the new --json flag and ensure backward compatibility with existing JSON output formats.
  5. 💡 Create a centralized utility function for generating JSON responses.
  6. 💡 Add unit tests for JSON output, including edge cases.
  7. 💡 Use Pydantic models for JSON schema validation.
  8. 💡 Ensure thread safety in file I/O operations.
  9. 💡 Update documentation with examples of new features and JSON outputs.
  10. 💡 Replace print statements with a logging framework.
  11. 💡 Run ruff and black to ensure consistent code style.

Let me know if you need further clarification or assistance!

@imran-siddique imran-siddique enabled auto-merge (squash) March 29, 2026 20:34

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This PR introduces a standardized --json flag across all CLI commands in the microsoft/agent-governance-toolkit repository. The changes aim to provide structured, machine-readable JSON output for improved automation and integration with CI/CD pipelines. The implementation includes error handling, JSON schema sanitization, and validation for various commands.

While the feature is valuable, there are several areas that require attention, particularly around security, backward compatibility, and code quality. Below is a detailed review.


🔴 CRITICAL Issues

  1. Improper Error Handling in JSON Output:

    • In multiple places, the error messages returned in JSON format include generic error types like "InternalError" without sufficient details for debugging. While this is intended to prevent information leakage, it may hinder debugging in legitimate use cases.
    • Recommendation: Include a unique error code or a traceable identifier in the JSON output for unexpected errors. This allows debugging without exposing sensitive information.

    Example:

    print(json.dumps({
        "status": "error",
        "message": "An internal error occurred during registration.",
        "type": "InternalError",
        "error_code": "ERR_REG_001"  # Add unique error codes
    }, indent=2))
  2. Regex Validation for Agent Identifiers:

    • The regex used for validating agent identifiers in the audit command is overly permissive and could allow injection attacks.
    • Recommendation: Use stricter regex patterns and document the expected format of agent identifiers. For example:
      if not re.match(r"^did:agentmesh:[a-zA-Z0-9._-]{1,64}$", agent):
  3. Potential Information Leakage in Audit Logs:

    • While the PR introduces key whitelisting for audit logs, there is no validation of the values being output. Malicious or malformed data could still be included in the logs.
    • Recommendation: Sanitize all values in the audit log entries to ensure they do not contain sensitive or malicious content.
  4. Concurrent Execution Risks:

    • The register command writes the agent identity to a file (identity.json) without any locking mechanism. This could lead to race conditions in concurrent executions.
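    • Recommendation: Use file locks (e.g., fcntl or portalocker) to serialize concurrent writers, or write to a temporary file and rename it over identity.json so that readers never observe a partially written file.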
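
For the concurrent-write risk in item 4, one lock-free option is to write to a temporary file in the same directory and rename it into place. The sketch below is an illustration under assumptions (`atomic_write_json` is a hypothetical helper); it prevents torn files but does not stop two writers from overwriting each other, which still needs a lock.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, data: dict) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file with owner-only permissions, which also suits identity data.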

🟡 WARNING Issues

  1. Backward Compatibility:

    • Adding a --json flag to existing commands is a non-breaking change. However, the behavior of commands in non-JSON mode has been altered (e.g., additional console messages). This could break existing scripts that rely on specific output formats.
    • Recommendation: Clearly document these changes in the release notes and consider providing a way to disable the new behavior for users who rely on the old output format.
  2. Error Handling in Non-JSON Mode:

    • In non-JSON mode, errors are printed directly to stderr without consistent formatting.
    • Recommendation: Standardize error messages in non-JSON mode to improve usability and debugging.

💡 Suggestions

  1. Centralize JSON Error Handling:

    • The error handling logic for JSON output is duplicated across multiple commands. This increases the risk of inconsistencies and bugs.
    • Recommendation: Create a utility function (e.g., format_error) to generate standardized JSON error responses.

    Example:

    import json
    from typing import Optional

    def format_error(message: str, error_type: str, error_code: Optional[str] = None) -> str:
        error_response = {
            "status": "error",
            "message": message,
            "type": error_type,
        }
        if error_code:
            error_response["error_code"] = error_code
        return json.dumps(error_response, indent=2)
  2. Use Pydantic for JSON Schema Validation:

    • The PR introduces structured JSON output but does not validate the structure against a schema. This could lead to inconsistencies or missing fields.
    • Recommendation: Use Pydantic models to define and validate the JSON schema for each command's output.

    Example:

    from pydantic import BaseModel
    
    class AuditLogEntry(BaseModel):
        timestamp: str
        agent: str
        action: str
        status: str
  3. Improve Test Coverage:

    • While the PR mentions that tests have been added, it is unclear if all edge cases are covered, especially for error handling and JSON output.
    • Recommendation: Add tests for:
      • Invalid --json flag usage.
      • Error scenarios (e.g., missing files, invalid input).
      • Concurrent execution of commands that write to files.
  4. Logging for Unexpected Errors:

    • The except Exception blocks suppress error details in JSON mode. This could make debugging difficult.
    • Recommendation: Log the full stack trace of unexpected errors to a secure location (e.g., a log file or monitoring system).
  5. Use Context Managers for File Operations:

    • Some file operations (e.g., writing to identity.json) do not use context managers, which could lead to resource leaks.
    • Recommendation: Use with statements for all file operations.
  6. Refactor CLI Commands for Readability:

    • Some CLI commands (e.g., register, status) are becoming too long and complex.
    • Recommendation: Break down these commands into smaller, reusable functions to improve readability and maintainability.
  7. Document JSON Schemas:

    • While the PR mentions that JSON schemas have been added, they are not explicitly documented.
    • Recommendation: Include JSON schema definitions in the documentation to help users understand the structure of the output.
  8. Use Constants for Repeated Strings:

    • Strings like "status", "error", and "InternalError" are repeated throughout the code.
    • Recommendation: Define these as constants to improve maintainability and reduce the risk of typos.
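
For item 5 above (context managers for file operations), here is a minimal sketch of what a safer write could look like. The helper name `write_json_atomically` and the payload are hypothetical and do not come from the PR. Going through a temporary file is an extra assumption meant to cover the concurrent-write case from item 3.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write payload to path so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        # The with block closes the handle even if json.dump raises.
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
        # Swap the finished file into place (an atomic rename on POSIX).
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the leftover temp file, then re-raise the original error.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A caller would use it as `write_json_atomically(Path("identity.json"), identity_dict)`, where `identity_dict` stands for whatever the command currently serializes.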

Final Assessment

  • Security: 🔴 Critical issues need to be addressed to prevent potential security vulnerabilities, such as injection attacks and information leakage.
  • Backward Compatibility: 🟡 While the changes are mostly non-breaking, the altered behavior in non-JSON mode could impact existing users.
  • Code Quality: 💡 Several areas for improvement, including error handling, code reuse, and documentation.

Action Items

  1. Address the critical security issues, especially around error handling, regex validation, and audit log sanitization.
  2. Review and document potential backward compatibility concerns.
  3. Refactor the code to improve readability and maintainability.
  4. Add comprehensive test coverage for all new features and edge cases.
  5. Document the JSON schemas for user reference.

Once these issues are resolved, the PR will be ready for approval.

imran-siddique
imran-siddique previously approved these changes Mar 30, 2026
Member

@imran-siddique imran-siddique left a comment

Reviewed: all mandatory checks pass per copilot-instructions.md.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Code Review for PR: feat: Add standardized --json output support across all CLI commands


🔴 CRITICAL: Security Issues

  1. Unvalidated Input in audit Command:

    • The audit command filters entries based on the agent parameter. While a regex is used to validate the format, it is overly permissive (^agent-[a-zA-Z0-9_-]+$|^did:agentmesh:[a-zA-Z0-9._-]+$). This could allow injection attacks if the agent parameter is used in database queries or file operations.
    • Recommendation: Use stricter validation rules for agent identifiers. If the agent parameter is used in database queries, ensure proper parameterized queries or ORM methods are used to prevent SQL injection.
  2. Insufficient Error Handling in audit Command:

    • The audit command uses a catch-all exception handler (except:) for importing the proxy and trust_cli modules. This can mask critical errors and make debugging difficult.
    • Recommendation: Replace the generic except: with specific exception handling (e.g., except ImportError:) to avoid suppressing unrelated errors.
  3. Error Message Disclosure:

    • In several places, raw exception messages are returned in the JSON output (e.g., policy command). This could lead to sensitive information disclosure.
    • Recommendation: Avoid exposing raw exception messages in the JSON output. Instead, return generic error messages and log the detailed error for debugging purposes.
  4. Potential Information Disclosure in Audit Logs:

    • The audit command includes a key whitelisting mechanism for JSON output, but it does not enforce strict type checking for values. This could lead to unintentional leakage of sensitive data.
    • Recommendation: Implement stricter validation for the values of whitelisted keys to ensure no sensitive information is leaked.
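
As an illustration of item 2 above (replacing the bare except: around optional imports), the sketch below narrows the handler to ImportError. The helper name `load_optional_module` is hypothetical; the PR's actual import code is not shown in this review.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional_module(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Only an import failure is treated as "feature unavailable".
        # KeyboardInterrupt, SystemExit and other errors still propagate.
        return None
```

A command could then check `if load_optional_module("some_optional_dependency") is None:` and report the missing feature instead of silently continuing.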

🟡 WARNING: Potential Breaking Changes

  1. Behavior Change for CLI Commands:

    • Adding the --json flag changes the behavior of existing CLI commands. While this is a non-breaking change for users who do not use the flag, it could potentially break scripts or integrations that rely on the previous output format.
    • Recommendation: Clearly document this change in the release notes and provide examples of the new JSON output for each command.
  2. Error Message Format Change:

    • The new error handling mechanism changes the format of error messages for CLI commands when the --json flag is used. This could break existing integrations that parse error messages.
    • Recommendation: Highlight this change as a potential breaking change in the release notes.

💡 Suggestions for Improvement

  1. Centralize JSON Error Handling:

    • The JSON error handling logic is repeated across multiple commands. This can lead to inconsistencies and maintenance challenges.
    • Recommendation: Create a utility function (e.g., generate_error_response) to standardize JSON error responses. This function can take parameters like message, type, and status to generate consistent error outputs.
  2. Use Pydantic for JSON Schema Validation:

    • The JSON output for commands like audit, policy, and status is manually constructed. This increases the risk of inconsistencies and errors.
    • Recommendation: Use Pydantic models to define and validate the JSON schemas for each command's output. This will ensure type safety and consistency.
  3. Thread Safety in Concurrent Execution:

    • While the PR does not explicitly introduce concurrency, the CLI commands may be executed in parallel in a CI/CD pipeline. Ensure that shared resources (e.g., log files, configuration files) are accessed in a thread-safe manner.
    • Recommendation: Use file locks or other synchronization mechanisms when accessing shared resources.
  4. Improve Test Coverage:

    • The PR mentions that tests have been added, but there is no evidence of comprehensive test coverage for the new --json functionality.
    • Recommendation: Add unit tests and integration tests to validate the JSON output for all commands. Include tests for both success and error scenarios.
  5. Backward Compatibility for CLI Output:

    • While the --json flag is optional, it is important to ensure that the default output format remains unchanged for users who do not use the flag.
    • Recommendation: Add tests to verify that the default output format is consistent with the previous behavior.
  6. Error Logging:

    • While sanitized error messages are returned in JSON output, the detailed error information is not logged.
    • Recommendation: Log detailed error information (e.g., stack traces) to a secure location for debugging purposes.
  7. Documentation:

    • The documentation updates are a good start, but they lack detailed examples of the new JSON output for each command.
    • Recommendation: Add detailed examples of the JSON output for each command in the documentation. This will help users understand the new functionality and adapt their workflows accordingly.
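
Items 1 and 6 above can be combined in one place. This is a minimal sketch that assumes the error payload has the status, message, and type keys mentioned in the review. The logger name `cli.errors` and the helper `handle_unexpected_error` are hypothetical.

```python
import json
import logging

logger = logging.getLogger("cli.errors")  # hypothetical logger name


def generate_error_response(
    message: str, error_type: str = "InternalError", status: str = "error"
) -> str:
    """Build the JSON error payload in one place so every command agrees."""
    return json.dumps(
        {"status": status, "message": message, "type": error_type}, indent=2
    )


def handle_unexpected_error(exc: Exception) -> str:
    """Log the full traceback, but return only a generic message to the caller."""
    logger.error("Unhandled CLI error", exc_info=exc)
    return generate_error_response("An internal error occurred.")
```

Inside a command, `except Exception as exc: print(handle_unexpected_error(exc))` would keep str(exc) out of the JSON output while keeping the details in the log.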

Summary

  • 🔴 CRITICAL: Address unvalidated input in the audit command, improve error handling, and prevent information disclosure in error messages and audit logs.
  • 🟡 WARNING: Highlight potential breaking changes in the release notes.
  • 💡 SUGGESTION: Centralize JSON error handling, use Pydantic for schema validation, ensure thread safety, improve test coverage, and enhance documentation.

Let me know if you need further clarification or assistance!

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This PR introduces a standardized --json flag across all CLI commands in the microsoft/agent-governance-toolkit repository, enabling structured JSON output for machine-readable results. This is a significant improvement for integration with CI/CD pipelines and automation tools. The PR also includes enhancements to error handling, input validation, and output sanitization, which improve security and usability.

Below is the detailed review of the changes:


🔴 CRITICAL Issues

  1. Error Handling in JSON Output:

    • In several places, the error messages returned in JSON format include the exception message (str(e)), which could potentially leak sensitive information about the internal state of the application. For example:
      except (yaml.YAMLError, ValueError, json.JSONDecodeError) as e:
          if output_json:
              print(json.dumps({"status": "error", "message": str(e), "type": e.__class__.__name__}, indent=2))
      Impact: This could lead to information disclosure (CWE-209) if the exception message contains sensitive data.
      Recommendation: Replace str(e) with a generic error message (e.g., "An error occurred while processing the request") and log the detailed exception message only in the server logs.
  2. Regex Validation for Agent Identifiers:

    • The regex used to validate agent identifiers in the audit command is overly permissive:
      if not re.match(r"^agent-[a-zA-Z0-9_-]+$|^did:agentmesh:[a-zA-Z0-9._-]+$", agent):
      Impact: This regex allows potentially dangerous characters like . and _ in the did:agentmesh format, which could be exploited in injection attacks or invalid lookups.
      Recommendation: Tighten the regex to explicitly match the expected format of agent identifiers. For example:
      r"^agent-[a-zA-Z0-9-]+$|^did:agentmesh:[a-zA-Z0-9-]+$"
  3. Missing Cryptographic Validation:

    • The AgentIdentity.create method is used to generate identities, but there is no indication that the generated identities are cryptographically validated (e.g., verifying the integrity of the DID or public key).
      Impact: This could lead to trust/identity issues if the generated identities are not securely validated.
      Recommendation: Ensure that the AgentIdentity.create method includes cryptographic validation of the generated identities and document this process.
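
For item 2 above, the tightened pattern can be sketched as a small validator. The names `AGENT_ID_PATTERN` and `is_valid_agent_id` are hypothetical, and whether real identifiers may contain dots or underscores is an assumption that should be checked against the project's DID format. The sketch uses `fullmatch` because `$` in `re.match` also matches just before a trailing newline.

```python
import re

# Tightened pattern from the review; fullmatch makes the ^ and $ anchors unnecessary.
AGENT_ID_PATTERN = re.compile(r"agent-[a-zA-Z0-9-]+|did:agentmesh:[a-zA-Z0-9-]+")


def is_valid_agent_id(agent: str) -> bool:
    """Accept only identifiers that match the whole pattern, with no trailing newline."""
    return AGENT_ID_PATTERN.fullmatch(agent) is not None
```

For example, `is_valid_agent_id("agent-alpha\n")` is rejected, whereas the original `re.match(r"^agent-[a-zA-Z0-9_-]+$", ...)` check would accept that string.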

🟡 WARNING: Potential Breaking Changes

  1. Behavior Change in CLI Commands:

    • Adding the --json flag changes the behavior of existing CLI commands. While this is a non-breaking change for users who do not use the flag, it could cause issues for scripts or tools that rely on the previous output format.
      Recommendation: Clearly document this change in the release notes and provide examples of how to use the --json flag.
  2. Error Output Format:

    • The error output format has been standardized to JSON when the --json flag is used. This could break existing scripts that parse error messages from the CLI.
      Recommendation: Highlight this change in the documentation and provide a migration guide for users.

💡 Suggestions for Improvement

  1. Centralize JSON Error Handling:

    • The JSON error handling logic is repeated across multiple commands. This increases the risk of inconsistencies and makes the code harder to maintain.
      Recommendation: Create a centralized utility function for generating JSON error responses. For example:
      def json_error(message: str, error_type: str = "InternalError") -> str:
          return json.dumps({"status": "error", "message": message, "type": error_type}, indent=2)
  2. Pydantic Model Validation:

    • The JSON schemas for structured output are not explicitly validated. This could lead to inconsistencies or invalid data being returned.
      Recommendation: Use Pydantic models to define and validate the JSON schemas for CLI outputs. For example:
      from typing import Optional
      
      from pydantic import BaseModel
      
      class AgentStatus(BaseModel):
          agent_name: str
          did: Optional[str]
          trust_score: int
          max_score: int
          dimensions: dict
  3. Thread Safety:

    • There is no indication of whether the CLI commands are thread-safe. If these commands are executed concurrently (e.g., in a CI/CD pipeline), there could be race conditions or data corruption.
      Recommendation: Audit the code for thread safety and document any limitations or guarantees.
  4. Test Coverage:

    • While the PR mentions that tests have been added, there is no indication of the specific test cases covered.
      Recommendation: Ensure that the following scenarios are tested:
      • Valid and invalid usage of the --json flag.
      • Error handling and sanitization for all commands.
      • Validation of agent identifiers and other inputs.
  5. Backward Compatibility:

    • The PR does not explicitly mention whether the changes are backward-compatible with older versions of the toolkit.
      Recommendation: Add a compatibility matrix to the documentation to clarify which versions of the toolkit support the --json flag.
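
The test scenarios in item 4 above could look roughly like the following. This is a sketch only: `run_status` is a hypothetical stand-in built on argparse, not one of the toolkit's real commands, and its payload is a placeholder.

```python
import argparse
import contextlib
import io
import json


def run_status(argv: list) -> int:
    """Hypothetical stand-in for a CLI command that accepts --json."""
    parser = argparse.ArgumentParser(prog="status")
    parser.add_argument("--json", action="store_true", dest="output_json")
    args = parser.parse_args(argv)
    result = {"status": "ok", "agents": 0}  # placeholder payload
    if args.output_json:
        print(json.dumps(result, indent=2))
    else:
        print(f"Status: {result['status']} ({result['agents']} agents)")
    return 0


def capture(argv: list) -> str:
    """Run the command and return everything it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        run_status(argv)
    return buffer.getvalue()


def test_json_flag_emits_parseable_json() -> None:
    assert json.loads(capture(["--json"])) == {"status": "ok", "agents": 0}


def test_default_output_is_unchanged() -> None:
    assert capture([]) == "Status: ok (0 agents)\n"
```

The second test pins the human-readable output, which is how a regression in the default (non-JSON) format would be caught.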

✅ Positive Aspects

  1. Improved Security:

    • The PR includes several security improvements, such as sanitized JSON error output and regex-based input validation.
  2. Enhanced Usability:

    • The addition of the --json flag makes the CLI tools more suitable for automation and integration with external systems.
  3. Documentation Updates:

    • The documentation has been updated to reflect the new features and security improvements.

Final Recommendation

  • Address the 🔴 CRITICAL issues related to error handling, regex validation, and cryptographic validation.
  • Document the 🟡 WARNING potential breaking changes and provide a migration guide for users.
  • Implement the 💡 SUGGESTIONS to improve code maintainability, test coverage, and thread safety.

Once these issues are resolved, the PR will be ready for approval.

Member

@imran-siddique imran-siddique left a comment

Thanks for the work on --json output! Two blocking issues before merge:

  1. Bare except: clauses in main.py: these catch SystemExit, KeyboardInterrupt, etc. Please change to except ImportError:.

  2. init_integration command body deleted: the entire _init_claude_integration() function was replaced with pass, making the command a no-op while still accepting flags. Please keep the existing implementation or document the removal.

Happy to re-review once fixed!

auto-merge was automatically disabled March 31, 2026 04:27

Pull request was closed

Labels

  • agent-mesh: agent-mesh package
  • documentation: Improvements or additions to documentation
  • size/XL: Extra large PR (500+ lines)

2 participants