feat(audit): defensive redaction pass on log + manifest writes (#148)
Closes hardening item #5 from "Secret Handling & Credential Surface
Hardening" in docs/developer/SDK-ROADMAP.md. Adds a recursive walker
that masks every string in the audit trail at serialization time,
plus fixes a pre-existing bug where masking record.msg corrupted %s
format strings whose placeholders matched the token: pattern.
argus/audit/secrets.py:
- New mask_secrets_in_obj(obj) walker. Recurses through dicts,
lists, tuples; applies mask_secrets to every string value; leaves
keys and non-string scalars untouched; returns a new structure
(does NOT mutate the input).
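
A minimal sketch of the walker described above, not the code from this commit. The `mask_secrets` here is a simplified stand-in with a single assumed regex; the real one in argus/audit/secrets.py covers more patterns.

```python
import re
from typing import Any

# Hypothetical, simplified stand-in for argus.audit.secrets.mask_secrets.
# Only handles token=/token: and password=/password: for illustration.
_TOKEN_RE = re.compile(r"(?i)\b(token|password)(\s*[=:]\s*)\S+")


def mask_secrets(text: str) -> str:
    return _TOKEN_RE.sub(r"\1\2<REDACTED>", text)


def mask_secrets_in_obj(obj: Any) -> Any:
    """Return a masked copy of obj; the input is never mutated."""
    if isinstance(obj, str):
        return mask_secrets(obj)
    if isinstance(obj, dict):
        # Keys are left untouched; only values are walked.
        return {key: mask_secrets_in_obj(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [mask_secrets_in_obj(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(mask_secrets_in_obj(item) for item in obj)
    # Non-string scalars and unknown types pass through unchanged.
    return obj


entry = {"token=key": "token=abc123", "count": 3, "nested": [("password: hunter2", None)]}
masked = mask_secrets_in_obj(entry)
```

Because every container is rebuilt rather than edited in place, `entry` still holds the original values after the call, which is the property the no-mutation test guards.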
argus/audit/logger.py:
- JsonLogFormatter.format: stop mutating record.msg. Mask the
rendered record.getMessage() instead — catches secrets passed as
printf-style args (logger.info("token: %s", real_token)) which
the prior approach missed because record.msg held the format
string, not the rendered output. Then walk the assembled JSON
entry through mask_secrets_in_obj so extra fields a contributor
might add to the formatter also get masked.
- ColoredConsoleFormatter.format: same fix — mask the rendered
message, not record.msg. Without this fix, "token: %s" matched
the token=/token: pattern and was rewritten to "token: <REDACTED>",
then record.getMessage() raised TypeError trying to substitute
args into a format string with no %s placeholder. The bug had gone
unnoticed because no test exercised the printf path.
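
The failure mode and the fix can be reproduced with the standard library alone. The sketch below is illustrative: `MaskedJsonFormatter`, `old_style_render`, `make_record`, and the one-pattern `mask_secrets` are hypothetical names, not the formatters in argus/audit/logger.py.

```python
import json
import logging
import re

# Hypothetical, simplified stand-in for argus.audit.secrets.mask_secrets.
_TOKEN_RE = re.compile(r"(?i)\b(token|password)(\s*[=:]\s*)\S+")


def mask_secrets(text: str) -> str:
    return _TOKEN_RE.sub(r"\1\2<REDACTED>", text)


def make_record(msg: str, *args: object) -> logging.LogRecord:
    return logging.LogRecord("argus.demo", logging.INFO, "demo.py", 0, msg, args, None)


def old_style_render(record: logging.LogRecord) -> str:
    # Prior approach: mask the format string itself. "token: %s" becomes
    # "token: <REDACTED>", so the later % substitution has no placeholder.
    record.msg = mask_secrets(record.msg)
    return record.getMessage()  # raises TypeError when args are present


class MaskedJsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Render first, then mask: secrets passed via args are caught and
        # record.msg is left alone for any other handler.
        message = mask_secrets(record.getMessage())
        entry = {"level": record.levelname, "logger": record.name, "message": message}
        return json.dumps(entry)


record = make_record("token: %s", "abc123")
line = MaskedJsonFormatter().format(record)
```

In this sketch the rendered line carries `token: <REDACTED>` while `record.msg` is still `"token: %s"`; calling `old_style_render` on an equivalent record raises `TypeError`.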
argus/audit/manifest.py:
- AuditManifest.save: walk asdict(self) through mask_secrets_in_obj
before json.dumps. Defense-in-depth: today's manifest schema
doesn't include credential fields, but if a future field captures
a docker_cmd, env dict, or credential-shaped argv it gets masked
before hitting argus-audit.json.
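
A sketch of the same pattern on the manifest side, assuming the manifest is a dataclass as the use of asdict suggests. `DemoManifest`, its fields, and `to_json` are hypothetical; the real AuditManifest schema is not shown in this commit message.

```python
import json
import re
from dataclasses import asdict, dataclass, field
from typing import Any

# Hypothetical, simplified stand-ins for the helpers in argus/audit/secrets.py.
_TOKEN_RE = re.compile(r"(?i)\b(token|password)(\s*[=:]\s*)\S+")


def mask_secrets(text: str) -> str:
    return _TOKEN_RE.sub(r"\1\2<REDACTED>", text)


def mask_secrets_in_obj(obj: Any) -> Any:
    if isinstance(obj, str):
        return mask_secrets(obj)
    if isinstance(obj, dict):
        return {key: mask_secrets_in_obj(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(mask_secrets_in_obj(item) for item in obj)
    return obj


@dataclass
class DemoManifest:  # hypothetical schema, for illustration only
    run_id: str
    phases: list = field(default_factory=list)

    def to_json(self) -> str:
        # Mask the plain-dict view just before serialization.
        return json.dumps(mask_secrets_in_obj(asdict(self)), indent=2)


manifest = DemoManifest("run-1", [{"name": "scan", "error": "docker run -e token=abc123 img"}])
payload = json.loads(manifest.to_json())
```

Masking at the serialization boundary means a field added later needs no redaction logic of its own; the in-memory manifest keeps its original values.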
Design note (vs. roadmap text):
The roadmap entry suggested reusing core/redact.redact_high_risk_patterns
(the vendor-prefix-only set used by Finding.__post_init__). The
existing audit/secrets.mask_secrets already covers that surface plus
broader patterns appropriate for log lines (token=, password=,
Bearer, URL creds, sk-keys). Extending audit/secrets keeps the
redactor co-located with its callers — easier to reason about and
no cross-module hop at hot-path serialization time.
Test coverage (19 new):
- argus/tests/audit/test_secrets.py::TestMaskSecretsInObj — 10 tests:
root scalar, dict value, nested dict, list, tuple, scalar passthrough,
no-mutation guard, deeply nested mix, dict-key preservation, unknown
type passthrough.
- argus/tests/audit/test_logger.py::TestJsonLogSecretLeakProtection —
4 tests: format-string secret, record.args secret (the regression
fix), extra-field secret, non-secret strings preserved unchanged.
- argus/tests/audit/test_manifest.py::TestManifestSecretLeakProtection —
5 tests: phase error, artifact path, nested dict at depth 4, input
not mutated after save, clean-manifest false-positive guard.
.ai/architecture.yaml: new audit/ entry in both SDK structure blocks
documenting the redaction posture, the walker, and the rendered-message
masking rationale.
Full suite: 3126 passed (+19 new), 2 skipped.
Co-authored-by: eFAILution <eFAILution@users.noreply.github.com>
.ai/architecture.yaml (1 addition, 0 deletions)
@@ -49,6 +49,7 @@ components:
"linters/": "Linter modules implementing Scanner protocol (LINTER_REGISTRY auto-merges into SCANNER_REGISTRY)"
"reporters/": "Output reporters (terminal, markdown, sarif, json, github, gitlab, junit). Discovered via the ``argus.reporters`` Python entry-point group (built-ins declared in pyproject.toml; third-party packages register additional formats without forking — see docs/contributing-reporters.md and ADR-023)."
"audit/": "Structured audit trail for every scan run. logger.py emits JSONL log records (one per line) into ``argus-results/.../argus.log``; manifest.py writes the per-run AuditManifest summary into ``argus-audit.json``. secrets.py provides mask_secrets (regex masking for token=, password=, Bearer, URL-creds, GitHub PATs, AWS access keys, sk-keys) and the mask_secrets_in_obj recursive walker. Both write paths run mask_secrets_in_obj at serialization time as defense-in-depth — if a future contributor accidentally captures a docker_cmd / env dict / credential-shaped argv into a manifest field or log entry, the redaction pass catches it before the file lands. Both logger formatters mask the rendered ``record.getMessage()`` rather than ``record.msg`` to avoid corrupting %s format strings whose placeholders match secret-shaped patterns."