fix(security): remaining findings - workflow_run, Docker tags, dep confusion script by imran-siddique · Pull Request #690 · microsoft/agent-governance-toolkit

imran-siddique · 2026-04-02T03:47:07Z

Security Audit - Batch 6: Remaining Medium Findings

Changes

workflow_run validation (CWE-918): Added conclusion==success check to ai-pr-summary.yml
NuGet API key (CWE-532): Moved key from inline script to env: block in ESRP pipeline
Docker pinning (CWE-1104): Replaced :latest tags with specific versions (Prometheus v3.4.0, Grafana 11.6.0, Qdrant v1.14.1)
Dep confusion script (coverage gap): Extended check_dependency_confusion.py to scan pyproject.toml, package.json, and Cargo.toml; flag local-only packages as HIGH RISK
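
The last item can be sketched roughly as follows for package.json. This is a minimal illustration, not the script's actual code: the function name find_version_pinned_locals, the LOCAL_ONLY set (names taken from the commit message), and the prefixes treated as path references are all assumptions.

```python
import json

# Assumption: names of packages that exist only inside this repository.
LOCAL_ONLY = {"agent-primitives", "emk"}
# Assumption: specifier prefixes that count as local path references.
PATH_PREFIXES = ("file:", "link:", "workspace:")


def find_version_pinned_locals(package_json_text):
    """Return HIGH RISK findings for local-only packages not referenced by path."""
    manifest = json.loads(package_json_text)
    findings = []
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            # A local-only name with a plain version spec could resolve to a public registry.
            if name in LOCAL_ONLY and not str(spec).startswith(PATH_PREFIXES):
                findings.append(f"HIGH RISK: {name} pinned as {spec!r} in {section}")
    return findings
```

A path reference such as "file:../agent-primitives" passes, while a version pin on the same name is flagged.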

Files Changed (5)

ai-pr-summary.yml, esrp-publish.yml, 2 docker-compose.yml, check_dependency_confusion.py

…nfusion script

- Add conclusion==success check to workflow_run in ai-pr-summary.yml to prevent posting from failed/cancelled runs (CWE-918)
- Pass NuGet API key via env: block in ESRP pipeline to prevent log exposure (CWE-532)
- Pin Docker images: prom/prometheus:v3.4.0, grafana/grafana:11.6.0, qdrant/qdrant:v1.14.1, replacing :latest tags (CWE-1104)
- Extend check_dependency_confusion.py to scan pyproject.toml, package.json, and Cargo.toml dependencies (coverage gap fix)
- Flag local-only packages (agent-primitives, emk) as HIGH RISK when found with version pins instead of path references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-04-02T03:47:34Z

🤖 AI Agent: security-scanner — Security Review of Pull Request

1. Prompt Injection Defense Bypass

No changes in this PR directly relate to prompt injection or user input sanitization. The changes primarily focus on workflow validation, Docker image pinning, and dependency confusion checks. No issues found.

Rating: 🔵 LOW

2. Policy Engine Circumvention

The addition of conclusion == 'success' in the workflow_run validation ensures that the job runs only when the triggering workflow completes successfully. This is a positive change that strengthens the policy enforcement mechanism by preventing execution after failed or cancelled runs.

Rating: 🔵 LOW

3. Trust Chain Weaknesses

The PR does not introduce or modify any trust chain mechanisms, such as SPIFFE/SVID validation or certificate pinning. No issues found.

Rating: 🔵 LOW

4. Credential Exposure

Moving the NuGet API key from an inline script to the env block in the ESRP pipeline is a positive change that reduces the risk of accidental credential exposure in logs or error messages. However, the PR does not explicitly confirm that the env block is configured to prevent logging of sensitive values. If the environment variable is not marked as secret, it could still be exposed in logs.

Attack Vector: If the NUGET_API_KEY environment variable is not marked as secret, it could be logged during pipeline execution, leading to credential exposure.

Suggested Fix: Ensure that the NUGET_API_KEY environment variable is marked as secret in the pipeline configuration. For example, in Azure DevOps, this can be done by storing the key in a secure variable group.

Rating: 🟡 MEDIUM
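
The general idea of passing a secret through the environment rather than inline can be sketched in Python. This is a generic illustration, not the ESRP pipeline's configuration: run_with_secret and demo are hypothetical helpers, and the key value is a placeholder.

```python
import os
import subprocess
import sys


def run_with_secret(cmd, secret_name, secret_value):
    """Run cmd with the secret injected into the child's environment only."""
    env = dict(os.environ)
    env[secret_name] = secret_value  # never appended to cmd, so it stays out of argv
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


def demo():
    # Stand-in child process: reports only the length of the secret it received.
    child = [sys.executable, "-c", "import os; print(len(os.environ['NUGET_API_KEY']))"]
    return run_with_secret(child, "NUGET_API_KEY", "example-not-a-real-key")
```

The child reads the value from its environment, so the secret never appears in the command line that process listings or command echoing would show.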

5. Sandbox Escape

The PR does not introduce or modify any code that could lead to container or process isolation breakouts. No issues found.

Rating: 🔵 LOW

6. Deserialization Attacks

The check_dependency_confusion.py script has been updated to parse pyproject.toml, package.json, and Cargo.toml files. While the script uses json.load for package.json files, which is safe for JSON parsing, the pyproject.toml and Cargo.toml files are parsed using custom logic. This could potentially introduce vulnerabilities if the parsing logic is not robust.

Attack Vector: Maliciously crafted pyproject.toml or Cargo.toml files could exploit weaknesses in the custom parsing logic, leading to code execution or denial of service.

Suggested Fix: Use a well-maintained library for parsing TOML files, such as tomli (for Python 3.10 and below) or the built-in tomllib (for Python 3.11 and above). This reduces the risk of introducing vulnerabilities through custom parsing.

Rating: 🟠 HIGH
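
A minimal sketch of the suggested fix, assuming dependencies live under [project].dependencies in pyproject.toml. The function name pyproject_dependency_names is hypothetical and the name-extraction regex is a simplification, not a full requirement-specifier parser.

```python
import re

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for older versions


def pyproject_dependency_names(toml_text):
    """Return the distribution names listed under [project].dependencies."""
    data = tomllib.loads(toml_text)
    names = []
    for requirement in data.get("project", {}).get("dependencies", []):
        # Keep only the leading name, dropping extras, version specifiers, and markers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.append(match.group(0).lower())
    return names
```

Malformed input raises tomllib.TOMLDecodeError, which a caller can catch and report rather than relying on hand-written parsing.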

7. Race Conditions

The changes to the ai-pr-summary.yml workflow ensure that the job only runs if the triggering workflow concludes successfully. This reduces the risk of race conditions related to policy checks. No issues found.

Rating: 🔵 LOW

8. Supply Chain

The PR addresses dependency confusion by extending the check_dependency_confusion.py script to scan additional manifest files (pyproject.toml, package.json, and Cargo.toml). This is a positive change that improves the detection of potential dependency confusion vulnerabilities.

Potential Issue: The script relies on hardcoded lists of registered package names for PyPI, npm, and Cargo. If these lists are not regularly updated, there is a risk of false positives or negatives.

Attack Vector: An attacker could exploit outdated or incomplete package lists to introduce malicious dependencies that bypass the checks.

Suggested Fix: Implement an automated mechanism to periodically update the lists of registered packages from their respective registries (e.g., using APIs provided by PyPI, npm, and crates.io). Additionally, consider logging and alerting for any flagged packages to ensure timely review.

Rating: 🟡 MEDIUM
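
One possible shape for such an automated lookup, shown for PyPI only. It assumes the public JSON endpoint https://pypi.org/pypi/<name>/json, which answers 404 for unknown projects; the helper names http_status and is_registered_on_pypi, and the injectable fetch_status parameter, are hypothetical.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def is_registered_on_pypi(name, fetch_status=http_status):
    """Return True if PyPI knows the project, False if it answers 404."""
    status = fetch_status(f"https://pypi.org/pypi/{name}/json")
    if status == 200:
        return True
    if status == 404:
        return False
    # Anything else (rate limiting, outage) is surfaced rather than guessed at.
    raise RuntimeError(f"unexpected status {status} for {name}")
```

Passing fetch_status as a parameter keeps the check testable offline; a scheduled job could call it with the default fetcher and refresh the hardcoded lists from the results.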

Summary of Findings

Prompt Injection Defense Bypass: 🔵 LOW
Policy Engine Circumvention: 🔵 LOW
Trust Chain Weaknesses: 🔵 LOW
Credential Exposure: 🟡 MEDIUM
- Fix: Ensure NUGET_API_KEY is marked as secret in the pipeline configuration.
Sandbox Escape: 🔵 LOW
Deserialization Attacks: 🟠 HIGH
- Fix: Use a robust library like tomli or tomllib for parsing TOML files.
Race Conditions: 🔵 LOW
Supply Chain: 🟡 MEDIUM
- Fix: Automate updates to registered package lists and implement logging/alerting for flagged packages.

Recommendations

Address the deserialization attack risk by replacing custom TOML parsing logic with a secure library.
Ensure the NUGET_API_KEY is securely stored and not logged in the pipeline.
Automate the update of registered package lists to reduce the risk of supply chain attacks.

These changes will further strengthen the security posture of the microsoft/agent-governance-toolkit repository.

github-actions

🤖 AI Agent: code-reviewer

Review Summary

This pull request addresses several medium-severity security findings, including improvements to GitHub Actions workflow validation, secure handling of secrets, Docker image pinning, and enhancements to a dependency confusion detection script. While the changes are generally well-implemented, there are a few areas that require attention to ensure correctness, security, and maintainability.

🔴 CRITICAL

Potential Regular Expression Denial of Service (ReDoS) in check_pyproject_toml and check_cargo_toml:
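- The regular expressions used in check_pyproject_toml and check_cargo_toml can backtrack excessively when processing large or malformed files. Because \s also matches newlines and re.MULTILINE retries the pattern at every line start, the worst case is likely quadratic rather than exponential, for instance on long runs of blank or whitespace-only lines. For example: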
```
dep_re = re.compile(r'^[\s"]*([a-zA-Z0-9_-]+)', re.MULTILINE)
```
  - This regex can cause performance issues when encountering long strings with many leading spaces or quotes.
  - Fix: Use a more specific regex or limit the number of repetitions for quantifiers. For example:
```
dep_re = re.compile(r'^[\s"]{0,10}([a-zA-Z0-9_-]+)', re.MULTILINE)
```
Insecure Handling of NUGET_API_KEY:
- While moving the NUGET_API_KEY to the env block in the ESRP pipeline is an improvement, the key is still being passed as a command-line argument to the dotnet nuget push command. Command-line arguments can be exposed in process lists or logs.
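- Fix: Use a secure method to pass the API key, such as a secure file or environment variable directly accessed by the tool. For example:
```
# Note: --api-key-stdin does not appear among the documented dotnet nuget push options.
# Treat this as a sketch of the intent and confirm a supported mechanism before use.
script: |
  echo "$NUGET_API_KEY" | dotnet nuget push "$(Pipeline.Workspace)/nuget-publish/**/*.nupkg" \
    --source https://api.nuget.org/v3/index.json \
    --api-key-stdin \
    --skip-duplicate
```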
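
For the regex finding above, an alternative to bounding the quantifier is to match one line at a time with a prefix that cannot cross line breaks. The sketch below is an assumption about how that might look, not the script's code: DEP_NAME_RE, MAX_LINE_LENGTH, and extract_dep_names are hypothetical names.

```python
import re

# Horizontal whitespace and quotes only, so the prefix can never run across lines.
DEP_NAME_RE = re.compile(r'[ \t"]*([A-Za-z0-9_-]+)')
# Assumption: manifest lines longer than this are not worth parsing.
MAX_LINE_LENGTH = 1000


def extract_dep_names(text):
    """Return the leading identifier of each line that starts with one."""
    names = []
    for line in text.splitlines():
        if len(line) > MAX_LINE_LENGTH:
            continue  # skip pathological lines instead of scanning them
        match = DEP_NAME_RE.match(line)  # anchored at the line start, one attempt per line
        if match:
            names.append(match.group(1))
    return names
```

Each line is scanned at most once, so total work stays proportional to the input size.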

🟡 WARNING

Breaking Change in Docker Image Tags:
- Replacing :latest with specific versions for Docker images (Prometheus, Grafana, Qdrant) is a good security practice. However, this is a breaking change if users rely on the :latest tag for automatic updates.
- Recommendation: Clearly document this change in the release notes and migration guide, and provide instructions for users to update their configurations if needed.
Dependency Confusion Script Behavior:
- The updated check_dependency_confusion.py script now flags local-only packages (e.g., agent-primitives, emk) as "HIGH RISK" if they are found with version pins instead of path references. This is a breaking change for users who may have been using version pins for local packages.
- Recommendation: Document this change in the release notes and provide guidance on how users should update their dependency files to avoid false positives.

💡 SUGGESTIONS

Improved Error Handling in check_dependency_confusion.py:
- The check_pyproject_toml, check_package_json, and check_cargo_toml functions silently skip files that cannot be opened or parsed. This could lead to missed findings.
- Suggestion: Log a warning when a file cannot be processed, e.g.:
```
except (OSError, UnicodeDecodeError) as e:
    findings.append(f"  Error reading {filepath}: {e}")
```
Unit Tests for Dependency Confusion Script:
- The new functionality in check_dependency_confusion.py (e.g., scanning pyproject.toml, package.json, and Cargo.toml) should be covered by unit tests to ensure correctness.
- Suggestion: Add test cases for each new file type, including edge cases like malformed files, missing sections, and valid/invalid dependencies.
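
The two suggestions above can be sketched together for the package.json case. The helper load_package_json is hypothetical and stands in for the script's real reader; the tests use only the standard unittest and tempfile modules.

```python
import json
import os
import tempfile
import unittest


def load_package_json(filepath, findings):
    """Parse a package.json, recording a finding instead of skipping silently."""
    try:
        with open(filepath, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, UnicodeDecodeError, json.JSONDecodeError) as err:
        findings.append(f"  Error reading {filepath}: {err}")
        return None


class LoadPackageJsonTests(unittest.TestCase):
    def _write(self, content):
        # Write content to a temporary file that is removed after the test.
        handle = tempfile.NamedTemporaryFile(
            "w", suffix=".json", delete=False, encoding="utf-8"
        )
        with handle:
            handle.write(content)
        self.addCleanup(os.remove, handle.name)
        return handle.name

    def test_valid_file_is_parsed(self):
        findings = []
        path = self._write('{"dependencies": {"emk": "1.0.0"}}')
        data = load_package_json(path, findings)
        self.assertEqual(data["dependencies"], {"emk": "1.0.0"})
        self.assertEqual(findings, [])

    def test_malformed_file_is_reported(self):
        findings = []
        self.assertIsNone(load_package_json(self._write("{not json"), findings))
        self.assertEqual(len(findings), 1)

    def test_missing_file_is_reported(self):
        findings = []
        self.assertIsNone(load_package_json("does-not-exist/package.json", findings))
        self.assertEqual(len(findings), 1)
```

Running python -m unittest against a module containing this class exercises the valid, malformed, and missing-file paths.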

Performance Optimization for Dependency Scanning:
- The script performs multiple recursive directory scans (glob.glob("**/*.ipynb", recursive=True) and similar). This could be optimized by performing a single recursive scan and filtering files by extension.
- Suggestion: Refactor to use a single directory traversal, e.g.:
```
for root, _, files in os.walk("."):
    for file in files:
        if file.endswith(".ipynb"):
            all_findings.extend(check_notebook(os.path.join(root, file)))
        elif file == "pyproject.toml":
            all_findings.extend(check_pyproject_toml(os.path.join(root, file)))
        # Add other file types here
```

Enhance Logging in check_dependency_confusion.py:
- The script currently prints findings directly to stdout. Consider using Python's logging module for better control over log levels and output formatting.
Backward Compatibility for Docker Tags:
- To minimize disruption, consider providing a fallback mechanism for users who still rely on :latest tags. For example:
```
image: ${QDRANT_IMAGE:-qdrant/qdrant:v1.14.1}
```
  This allows users to override the default version by setting an environment variable.
Validation for workflow_run Conclusion:
- The addition of conclusion == 'success' in ai-pr-summary.yml is a good improvement. However, consider adding a comment explaining why this check is necessary for future maintainability.
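
For the logging suggestion in this list, a minimal sketch using the standard logging module might look like the following. The logger name, the report_findings helper, and the rule that findings containing "HIGH RISK" go out at ERROR level are assumptions for illustration.

```python
import logging

logger = logging.getLogger("check_dependency_confusion")


def report_findings(findings):
    """Log each finding: HIGH RISK entries at ERROR level, the rest at WARNING."""
    for finding in findings:
        level = logging.ERROR if "HIGH RISK" in finding else logging.WARNING
        logger.log(level, finding)
    return len(findings)


if __name__ == "__main__":
    # Output format and threshold are set once by the entry point, not by the checks.
    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s: %(message)s")
    report_findings(["HIGH RISK: emk pinned by version", "unregistered name: example-pkg"])
```

Callers such as CI jobs can then raise or lower the threshold without touching the checking code.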

Final Assessment

The changes address important security issues and improve the overall robustness of the repository.
However, the identified critical issues (ReDoS vulnerability and insecure API key handling) should be addressed before merging.
The breaking changes should be clearly documented to ensure a smooth transition for users.
Implementing the suggested improvements will enhance the maintainability and performance of the codebase.

imran-siddique merged commit 47c3505 into microsoft:main Apr 2, 2026
18 of 21 checks passed

github-actions bot added the size/M Medium PR (< 200 lines) label Apr 2, 2026

github-actions bot reviewed Apr 2, 2026

imran-siddique merged 1 commit into microsoft:main from imran-siddique:fix/security-audit-remaining
