Skip to content

fix(ci): fix ruff lint errors and policy provider test mocks#653

Merged
imran-siddique merged 1 commit intomicrosoft:mainfrom
imran-siddique:fix/ci-lint-tests
Mar 31, 2026
Merged

fix(ci): fix ruff lint errors and policy provider test mocks#653
imran-siddique merged 1 commit intomicrosoft:mainfrom
imran-siddique:fix/ci-lint-tests

Conversation

@imran-siddique
Copy link
Copy Markdown
Member

Fixes CI failures on main:

  • lint (agent-sre): Remove unused imports (time, os) and split multi-import line in persistence.py
  • test (agent-mesh): Add label()/__str__() to _StubDecision mock, fix assertion to match actual call signature

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@imran-siddique imran-siddique merged commit b9ac7a8 into microsoft:main Mar 31, 2026
24 of 25 checks passed
@github-actions
Copy link
Copy Markdown

🤖 AI Agent: test-generator — `packages/agent-sre/src/agent_sre/slo/persistence.py`

🧪 Test Coverage Analysis

packages/agent-sre/src/agent_sre/slo/persistence.py

  • Existing coverage:

    • The _validate_db_path function appears to have some test coverage, as it is a utility function that likely interacts with database initialization or configuration. However, the specific changes in this PR (removal of unused imports and splitting multi-import lines) do not directly affect functionality and are unlikely to introduce new testable behavior.
    • The rest of the file's functionality (e.g., database operations, threading, and context management) is assumed to have existing tests, but this PR does not modify those areas.
  • Missing coverage:

    • There is no indication in the PR or diff that _validate_db_path is explicitly tested for edge cases such as malformed paths, invalid characters, or unexpected input types.
    • The function's behavior when raw is not ":memory:" (e.g., handling invalid file paths or permissions) may not be fully covered.
  • 💡 Suggested test cases:

    1. test_validate_db_path_memory — Ensure _validate_db_path correctly handles the special ":memory:" input and returns it unchanged.
    2. test_validate_db_path_valid_path — Test _validate_db_path with a valid file path and verify it returns the normalized, resolved path.
    3. test_validate_db_path_invalid_path — Test _validate_db_path with an invalid file path (e.g., containing illegal characters or pointing to a non-writable directory) and ensure it raises an appropriate exception.
    4. test_validate_db_path_edge_cases — Test _validate_db_path with edge cases like an empty string, None, or excessively long file paths to ensure robust input validation.

packages/agent-mesh/tests/test_policy_provider.py

  • Existing coverage:

    • The changes to _StubDecision (adding label() and __str__() methods) are likely covered indirectly by existing tests that use _StubDecision in policy evaluation scenarios.
    • The fix to the assertion in test_passes_agent_id_to_engine ensures that the test now correctly validates the action parameter passed to the engine. This test is explicitly covered.
  • Missing coverage:

    • The new label() and __str__() methods in _StubDecision do not appear to have direct test coverage. While they may be indirectly tested through other tests, explicit tests for these methods would ensure their behavior is validated.
  • 💡 Suggested test cases:

    1. test_stub_decision_label — Verify that the label() method of _StubDecision correctly returns the action attribute.
    2. test_stub_decision_str_with_reason — Test the __str__() method of _StubDecision when the reason attribute is set, ensuring it returns the reason.
    3. test_stub_decision_str_without_reason — Test the __str__() method of _StubDecision when the reason attribute is None, ensuring it falls back to returning the action.

@github-actions github-actions bot added tests agent-mesh agent-mesh package agent-sre agent-sre package size/S Small PR (< 50 lines) labels Mar 31, 2026
@github-actions
Copy link
Copy Markdown

🤖 AI Agent: security-scanner — Security Review of Pull Request

Security Review of Pull Request

This pull request primarily addresses CI linting errors and test mock adjustments. While the changes seem minor and focused on code quality and test correctness, they still require a security review since the repository is a critical security layer.


Findings

1. Policy Engine Circumvention

  • Code Context:
    def label(self) -> str:
        return self.action
    
    def __str__(self) -> str:
        return self.reason if self.reason else self.action
  • Analysis: The addition of the label() and __str__() methods to the _StubDecision mock is intended to improve test readability and debugging. However, if _StubDecision is used in production code or test cases that simulate real-world scenarios, there is a risk that these methods could inadvertently bypass policy checks. For example, if a policy engine relies on string representations or labels for decision-making, this could introduce a subtle vulnerability where crafted inputs manipulate these methods.
  • Attack Vector: An attacker could craft inputs that exploit the __str__() or label() methods to bypass policy enforcement by returning unexpected or misleading values.
  • Rating: 🟠 HIGH
  • Recommendation: Ensure _StubDecision is strictly confined to test environments and cannot leak into production code. Add explicit checks in the policy engine to validate decisions against expected formats and values.

2. Prompt Injection Defense Bypass

  • Code Context: No direct changes in this PR relate to prompt injection defenses.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

3. Trust Chain Weaknesses

  • Code Context: No changes in this PR relate to SPIFFE/SVID validation or certificate pinning.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

4. Credential Exposure

  • Code Context: No changes in this PR involve logging, error messages, or debug output that could expose credentials.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

5. Sandbox Escape

  • Code Context: No changes in this PR involve container or process isolation mechanisms.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

6. Deserialization Attacks

  • Code Context: No changes in this PR involve deserialization of pickle, YAML, or JSON.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

7. Race Conditions

  • Code Context: No changes in this PR involve concurrent trust evaluation or time-of-check-to-time-of-use (TOCTOU) issues.
  • Analysis: No new vulnerabilities introduced in this area.
  • Rating: 🔵 LOW

8. Supply Chain

  • Code Context:
    import os, tempfile as _tempfile
  • Analysis: The change to remove os and split the import line does not introduce any new supply chain risks. However, the use of tempfile should be reviewed to ensure it is used securely (e.g., avoiding predictable temporary file names).
  • Rating: 🔵 LOW

Summary of Findings

Category Rating Details
Prompt Injection Defense Bypass 🔵 LOW No new vulnerabilities introduced.
Policy Engine Circumvention 🟠 HIGH Potential misuse of label() and __str__() in _StubDecision could bypass policy checks.
Trust Chain Weaknesses 🔵 LOW No new vulnerabilities introduced.
Credential Exposure 🔵 LOW No new vulnerabilities introduced.
Sandbox Escape 🔵 LOW No new vulnerabilities introduced.
Deserialization Attacks 🔵 LOW No new vulnerabilities introduced.
Race Conditions 🔵 LOW No new vulnerabilities introduced.
Supply Chain 🔵 LOW No new vulnerabilities introduced.

Recommendations

  1. Policy Engine Circumvention:

    • Ensure _StubDecision is used exclusively in test environments.
    • Add validation logic in the policy engine to ensure decisions conform to expected formats and values, regardless of string representations or labels.
  2. General:

    • Conduct a broader review of the policy engine to ensure no reliance on string representations or labels for critical decision-making.
    • Add comments to clarify that _StubDecision is for testing purposes only.

Final Rating: 🟠 HIGH

The primary concern is the potential for policy circumvention via the new methods in _StubDecision. While this is a test mock, the risk of misuse or leakage into production code warrants a high severity rating. Addressing this issue should be prioritized.

@github-actions
Copy link
Copy Markdown

🤖 AI Agent: docs-sync-checker — Issues Found

📝 Documentation Sync Report

Issues Found

  • _StubDecision.label() in packages/agent-mesh/tests/test_policy_provider.py — missing docstring
  • _StubDecision.__str__() in packages/agent-mesh/tests/test_policy_provider.py — missing docstring
  • ⚠️ CHANGELOG.md — no entry for this change

Suggestions

  • 💡 Add docstring for _StubDecision.label(self) -> str explaining its purpose and return value.
  • 💡 Add docstring for _StubDecision.__str__(self) -> str explaining its purpose and return value.
  • 💡 Add a CHANGELOG entry under a relevant section (e.g., "Fixes") to document the lint fixes and test mock updates.

Notes

  • The changes in packages/agent-sre/src/agent_sre/slo/persistence.py are minor refactoring (removal of unused imports and splitting multi-import lines) and do not require documentation updates.
  • No updates are needed for README.md or example code, as this PR does not introduce new features or modify API signatures.

Action Required

Please address the missing docstrings and update the CHANGELOG.md to ensure documentation is in sync.

@github-actions
Copy link
Copy Markdown

🤖 AI Agent: breaking-change-detector — Summary

🔍 API Compatibility Report

Summary

This pull request primarily addresses CI lint errors and test mock adjustments. No changes to public APIs were detected in the provided diff. ✅

Findings

Severity Package Change Impact
agent-mesh Added label() and __str__() methods to _StubDecision mock No impact on public API
agent-sre Removed unused imports (time, os) No impact on public API

Migration Guide

No migration steps are necessary as no breaking changes were identified.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Summary

This pull request addresses CI failures by fixing linting errors in the agent-sre package and updating test mocks in the agent-mesh package. The changes are relatively small, but they touch on critical areas like policy evaluation and thread safety. Below is a detailed review of the changes.


Detailed Feedback

1. Policy Engine Correctness

  • File: test_policy_provider.py
  • Change: Added label() and __str__() methods to _StubDecision mock.
  • Analysis:
    • Adding label() and __str__() to the _StubDecision mock improves test coverage and aligns the mock with the expected interface of the actual Decision class. This is a good practice for ensuring test reliability.
    • 🔴 CRITICAL: The assertion change in test_passes_agent_id_to_engine modifies the expected behavior from checking the agent_id to checking the action. This could indicate a misunderstanding of the policy engine's behavior or a potential regression in functionality. The agent_id is a critical input for policy evaluation, and bypassing its validation could lead to security issues.
      • Actionable Fix: Verify whether the policy engine is designed to evaluate based on agent_id. If so, revert the assertion to check agent_id and ensure the test mock passes the correct arguments.

2. Trust/Identity

  • No changes directly affect cryptographic operations, credential handling, or SPIFFE/SVID in this PR.

3. Sandbox Escape Vectors

  • No changes introduce or mitigate sandbox escape vectors.

4. Thread Safety

  • File: persistence.py
  • Change: Removed unused imports (time, os) and split multi-import line.
  • Analysis:
    • The changes are cosmetic and do not affect thread safety. However, the persistence.py file uses threading and sqlite3, which are often sources of concurrency issues.
    • 💡 SUGGESTION: Review the use of threading and sqlite3 in this file to ensure proper locking mechanisms are in place, as SQLite is not thread-safe by default.

5. OWASP Agentic Top 10 Compliance

  • No changes directly impact compliance with the OWASP Agentic Top 10. However, the potential issue with the agent_id assertion (see above) could lead to improper access control, violating principles like "Least Privilege" and "Fail-Safe Defaults."

6. Type Safety and Pydantic Model Validation

  • No changes involve Pydantic models or type safety. However, the _StubDecision mock could benefit from type annotations for its methods (label() and __str__()).
    • 💡 SUGGESTION: Add type annotations to _StubDecision methods to improve type safety and readability.

7. Backward Compatibility

  • File: test_policy_provider.py
  • Change: Modified test assertion.
  • 🟡 WARNING: If the assertion change reflects a change in the policy engine's behavior, this could be a breaking change for users relying on the previous behavior. Ensure that this change is intentional and documented.

Summary of Actionable Feedback

  1. 🔴 CRITICAL: Verify the correctness of the assertion change in test_passes_agent_id_to_engine. If the policy engine depends on agent_id, revert the change and update the test mock to pass the correct arguments.
  2. 💡 SUGGESTION: Add type annotations to _StubDecision methods (label() and __str__()).
  3. 💡 SUGGESTION: Review persistence.py for potential thread-safety issues with sqlite3 and threading.
  4. 🟡 WARNING: If the assertion change reflects a change in the policy engine's behavior, document it as a breaking change.

Suggested Next Steps

  • Address the critical issue with the agent_id assertion.
  • Add type annotations to _StubDecision.
  • Conduct a broader review of thread safety in persistence.py if not already done.
  • Confirm whether the assertion change introduces a breaking change and update documentation accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-mesh agent-mesh package agent-sre agent-sre package size/S Small PR (< 50 lines) tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant