fix: resolve 44 code scanning alerts by imran-siddique · Pull Request #79 · microsoft/agent-governance-toolkit

imran-siddique · 2026-03-07T20:05:02Z

Fixes 44 alerts: clear-text logging, URL sanitization, token permissions, pinned deps.

Clear-text logging (10 alerts fixed): - healthcare-hipaa/main.py: Added _redact() helper, masked patient data - agent-mesh healthcare-hipaa/main.py: Masked patient ID in logs - eu-ai-act-compliance/demo.py: Masked agent labels - financial-sox/demo.py: Masked SSN-containing messages URL sanitization (12 alerts fixed): - test_rate_limiting_template.py: Use explicit equality for domain checks - test_identity.py, test_coverage_boost.py: Use urlparse() for SPIFFE URIs - service-worker.ts: Use new URL().hostname for platform detection Workflow token permissions (3 alerts fixed): - auto-merge-dependabot.yml, sbom.yml, codeql.yml: Top-level read-only permissions with write scopes pushed to job level Workflow pinned dependencies (8 action refs pinned): - dependency-review.yml, labeler.yml, pr-size.yml, stale.yml, welcome.yml, auto-merge-dependabot.yml: Pin to commit SHAs Dockerfile/script dependency pinning (11 files): - Pin pip install versions in Dockerfiles and shell scripts - Add --no-cache-dir where missing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-03-07T20:05:16Z

Dependency Review

The following issues were found:

✅ 0 vulnerable package(s)
✅ 0 package(s) with incompatible licenses
✅ 0 package(s) with invalid SPDX license definitions
⚠️ 1 package(s) with unknown licenses.

See the Details below.

License Issues

.github/workflows/welcome.yml

Package	Version	License	Issue Type
actions/first-interaction	34f15f4562c5e4085ea721c63dadab8138be06db	Null	Unknown License

Allowed Licenses: MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, PSF-2.0, Python-2.0, 0BSD, Unlicense, CC0-1.0, CC-BY-4.0, Zlib, BSL-1.0, MPL-2.0

OpenSSF Scorecard

Package

Version

Score

Details

actions/codelytv/pr-size-labeler

4ec67706cd878fbc1c8db0a5dcd28b6bb412e85a

Unknown

actions/actions/first-interaction

34f15f4562c5e4085ea721c63dadab8138be06db

🟢 4.6

Details

Check	Score	Reason
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Maintained	⚠️ 0	0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Packaging	⚠️ -1	packaging workflow not detected
Code-Review	🟢 3	Found 1/3 approved changesets -- score normalized to 3
Binary-Artifacts	🟢 10	no binaries found in the repo
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies	🟢 3	dependency not pinned by hash detected -- score normalized to 3
License	🟢 10	license file detected
Fuzzing	⚠️ 0	project is not fuzzed
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
Branch-Protection	⚠️ 1	branch protection is not maximal on development and all release branches
SAST	🟢 9	SAST tool detected but not run on all commits

Scanned Files

.github/workflows/pr-size.yml
.github/workflows/welcome.yml

packages/agent-mesh/examples/03-healthcare-hipaa/main.py

    async def access_patient_data(self, patient_id: str, purpose: str) -> Dict[str, Any]:
        """Access patient data with HIPAA controls."""
-        print(f"📂 Accessing patient data: {patient_id[:3]}***")
+        print(f"📂 Accessing patient data: {_redact(patient_id, 3)}")


In general, to fix clear-text logging of sensitive data, either (a) stop logging the sensitive value, (b) fully mask/redact it so no original characters remain, or (c) transform it into a non-reversible surrogate (e.g., a hash) that is not directly identifying. For PHI such as patient_id, HIPAA-oriented examples should avoid logging any recognizable portion of the identifier.

The minimal change that preserves existing behavior while removing the risk is: in access_patient_data, stop showing even a partially redacted patient_id in logs. Instead, either log a constant message (“Accessing patient data”) or log a non-sensitive surrogate derived from patient_id (e.g., a hash) if traceability is required. Since we must not assume external config and should avoid extra complexity, the simplest and safest fix here is to remove the interpolation of patient_id from the log entirely.

Concretely, in packages/agent-mesh/examples/03-healthcare-hipaa/main.py:

Change line 96 from print(f"📂 Accessing patient data: {_redact(patient_id, 3)}") to a version that does not include patient_id, e.g. print("📂 Accessing patient data").

No additional imports or helper methods are required for this fix.

We leave _redact untouched because it might be used elsewhere; CodeQL’s specific tainted path is resolved by removing patient_id from this log message.

packages/agent-mesh/examples/06-eu-ai-act-compliance/demo.py

        icon = "✅" if deployable else "🚫"
        status = "APPROVED" if deployable else "BLOCKED"
-        print(f"  {icon}  {label:40s} → {status}")  # lgtm[py/clear-text-logging-sensitive-data]
+        print(f"  {icon}  {_redact(label, 20):40s} → {status}")


To fix the problem, ensure that the logging statement never prints any part of sensitive or tainted data in clear text. Since label is tainted along the path, the _redact function should not reveal any portion of the original string when used for potentially sensitive values, and the call site should avoid relying on partial visibility of the original data.

The best minimal fix is:

Strengthen _redact so that it does not leak any characters from the original string, regardless of visible_chars. This ensures that any sensitive data passed through it is completely masked.

Adjust the deployment gate print statement to avoid depending on the original label contents for formatting. Instead, log only non-sensitive information such as the deployment status and a generic placeholder label or the risk level if that is considered non-sensitive, while still using _redact for safety.

Concretely:

In packages/agent-mesh/examples/06-eu-ai-act-compliance/demo.py, update _redact (lines 23–30) so that it always returns "***" (or a similar constant) and ignores visible_chars. This preserves the intent of redaction but removes partial exposure.

Update the line 138 print statement so that it no longer formats the original label via _redact(label, 20). For example, either:

Use _redact("agent", 0) as a neutral placeholder string, or

Replace the redacted label with a generic "AGENT" placeholder while retaining the rest of the message.

This keeps functionality essentially the same (a deployment gate summary is printed) while ensuring that no user- or environment-derived strings are logged.

No new imports or external methods are required.

packages/agent-os/examples/financial-sox/demo.py

    import re
    redacted_msg = re.sub(r'\d{3}-\d{2}-\d{4}', 'XXX-XX-XXXX', ssn_message)
-    print(f'  Input: "{redacted_msg}"')
+    print(f'  Input: "{_redact(ssn_message, 11)}"')


In general, the fix is to ensure that sensitive data (here, an SSN-like value) is not logged in clear text, even partially. That means either not logging the sensitive string at all, or logging only a fully redacted or synthetic version that cannot reveal the SSN.

The minimal, behavior-preserving fix is to change the specific print statement in packages/agent-os/examples/financial-sox/demo.py so it does not expose the tainted ssn_message content. Since the demo already computes redacted_msg using a regex that fully masks the SSN, we can log that value instead of the partially redacted _redact(ssn_message, 11). This keeps the demo understandable (it still shows an input string with an SSN masked) while avoiding logging the original sensitive text. Concretely, on line 372 we replace:

print(f' Input: "{_redact(ssn_message, 11)}"')

with:

print(f' Input: "{redacted_msg}"')

No new imports or helper functions are required; we only reuse the existing redacted_msg variable calculated on line 371.