feat(scan): emit canonical argus-results.json + persist raw scanner output (source + container)#116

Merged
eFAILution merged 2 commits into feat/argus-portability from feat/container-output-completion
May 5, 2026

Conversation

@eFAILution
Collaborator

@eFAILution eFAILution commented May 5, 2026

Summary

Three related fixes for the scan output contract:

  1. argus view doesn't display container vulnerabilities — the container-scan flow only wrote a domain-shaped container-scan.json (per-image counts, container_count, etc.) which the viewers don't know how to render. The viewers consume the canonical argus-results.json shape produced by source scans.
  2. The argus-results/ dir doesn't preserve the raw per-scanner output files — for container scans (trivy-results.json, grype-results.json, syft-sbom.json) and now for source scans too, the per-scanner outputs lived in tempdirs that got wiped at the end of each scan. Users who want forensics, audit trails, or manual triage had nowhere to look.
  3. Unify the raw-output config — initial scope only covered container scans. Rescoped this PR to cover source scans as well, so both flows share reporting.keep_raw and --no-keep-raw.

All three are rooted in the same architectural drift: the container flow diverged from the source-scan output contract, and source scans never preserved per-scanner output at all. This PR re-aligns them.

Canonical ScanSummary for container scans

  • _cmd_container_scan now builds a canonical ScanSummary alongside the existing ContainerScanSummary: each container target maps to ScanResult(scanner=f"container/<name>", findings=combined, metadata={image_ref, build_success, scanner_errors, scan_error}).
  • The JSON reporter writes that to argus-results.json unconditionally (matches the source-scan canonical-artifact contract from PR #111, "feat(view): config-aware remediation when argus-results.json is missing").
  • The SARIF reporter consumes the same canonical summary instead of building a one-off conversion locally.
  • The domain-shaped container-scan.json is preserved for backward compat with downstream tooling.
  • argus view opens container scan results without any new code on the viewer side — it just sees ScanResult rows named container/<image>.
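
For illustration, the mapping described above might look roughly like the following. This is a minimal sketch, not the PR's code: the dataclasses are hypothetical stand-ins for the SDK's ScanResult and ContainerScanResult, the trivy_findings / grype_findings field names are assumptions, and "combined" is approximated by plain concatenation (the real flow may deduplicate).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ScanResult:
    # Hypothetical stand-in for the canonical ScanResult; real fields may differ.
    scanner: str
    findings: list[dict[str, Any]] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ContainerScanResult:
    # Hypothetical per-image result; field names are assumptions.
    name: str
    image_ref: str
    build_success: bool
    trivy_findings: list[dict[str, Any]] = field(default_factory=list)
    grype_findings: list[dict[str, Any]] = field(default_factory=list)
    scanner_errors: dict[str, str] = field(default_factory=dict)
    scan_error: str | None = None


def to_canonical(results: list[ContainerScanResult]) -> list[ScanResult]:
    """Map each container target to one ScanResult named container/<name>."""
    canonical: list[ScanResult] = []
    for r in results:
        # Assumption: "combined" is a simple concatenation of both scanners.
        combined = r.trivy_findings + r.grype_findings
        canonical.append(
            ScanResult(
                scanner=f"container/{r.name}",
                findings=combined,
                metadata={
                    "image_ref": r.image_ref,
                    "build_success": r.build_success,
                    "scanner_errors": r.scanner_errors,
                    "scan_error": r.scan_error,
                },
            )
        )
    return canonical
```

Because each container target becomes an ordinary ScanResult row, a viewer that already renders source-scan results needs no container-specific branch.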

Raw scanner output persistence (both flows)

Container scans: scan_image gains a raw_output_dir: Path | None parameter. When set, it copies trivy-results.json, grype-results.json, and syft-sbom.json into that directory before the tempdir is cleaned up. ContainerEngine reads _raw_output_root from its config dict and threads a per-target subdir to scan_image as <root>/<target.name>/.
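
A rough sketch of what the copy step could look like, assuming a helper that runs just before the tempdir is removed. The function name persist_raw_outputs and its work_dir argument are hypothetical; the three file names, the 0-byte skip, and the best-effort behavior (warn, don't fail) come from this PR's description and commit message.

```python
from __future__ import annotations

import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)

# File names taken from the PR description.
RAW_ARTIFACTS = ("trivy-results.json", "grype-results.json", "syft-sbom.json")


def persist_raw_outputs(work_dir: Path, raw_output_dir: Path | None) -> list[Path]:
    """Copy non-empty scanner artifacts out of work_dir; return the copies made."""
    if raw_output_dir is None:
        # None means the caller did not ask for raw-output persistence.
        return []
    copied: list[Path] = []
    try:
        raw_output_dir.mkdir(parents=True, exist_ok=True)
    except OSError as exc:
        logger.warning("could not create %s: %s", raw_output_dir, exc)
        return copied
    for name in RAW_ARTIFACTS:
        src = work_dir / name
        # Skip scanners that did not run, and 0-byte files (failure signals).
        if not src.is_file() or src.stat().st_size == 0:
            continue
        try:
            copied.append(Path(shutil.copy2(src, raw_output_dir / name)))
        except OSError as exc:
            # Best-effort: a failed copy is logged but never fails the scan.
            logger.warning("could not persist %s: %s", src, exc)
    return copied
```

A caller would pass something like root / target.name as raw_output_dir so each image gets its own subdirectory.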

Source scans (rescoped addition): ArgusEngine.run() gains a raw_output_dir parameter. _run_in_container copies each scanner's result_files (results.json / *.sarif / stdout.txt) under <output_dir>/raw/<scanner>/ after the engine reads them. Same opt-in semantics and on-disk shape as the container flow; this is just the source-scan analogue.

Unified config: --no-keep-raw is now a top-level scan flag (not container-only). reporting.keep_raw (default true) replaces the container-scoped containers.keep_raw. Both flows honor the same knob; the CLI flag wins on conflict (explicit > implicit). 0-byte files are explicitly skipped in both flows: they're failure signals and would mislead readers if persisted.
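
The precedence rule can be sketched as below. This is an illustration only: resolve_keep_raw is a hypothetical helper, and the config is assumed to be a plain dict parsed from argus.yml rather than the real ReportingConfig object.

```python
from __future__ import annotations

from typing import Any


def resolve_keep_raw(no_keep_raw_flag: bool, config: dict[str, Any]) -> bool:
    """Return True when raw scanner output should be persisted."""
    # An explicit --no-keep-raw on the command line wins over the config file.
    if no_keep_raw_flag:
        return False
    # Otherwise fall back to reporting.keep_raw, which defaults to true.
    reporting = config.get("reporting") or {}
    return bool(reporting.get("keep_raw", True))
```

With no flag and no config key, the result is True, matching the default-on behavior described above.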

Documentation

  • argus.example.yml documents the unified reporting.keep_raw: true knob with a comment explaining it covers both source and container flows.
  • The container-only containers.keep_raw example was removed.
  • docs/cli-reference.md regenerated to reflect the moved --no-keep-raw flag.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Changes Made

  • Added raw_output_dir parameter to scan_image (container) and ArgusEngine.run() (source).
  • Added canonical ScanSummary build in _cmd_container_scan with container/<name> scanner-name convention.
  • Moved --no-keep-raw from container-only to top-level scan flag.
  • Added keep_raw: bool = True to ReportingConfig; deprecated containers.keep_raw.
  • Updated _load_container_config to surface reporting.keep_raw to the container handler.
  • Updated SARIF reporter to consume canonical ScanSummary for container output.
  • Updated argus.example.yml and docs/cli-reference.md accordingly.

Testing

  • 5 new container-side tests (TestScanImageRawOutputPersistence x4 + TestContainerCanonicalScanSummary x1).
  • 4 new source-side tests in TestRunInContainer: persists per-scanner files, opt-in on None, stdout fallback, 0-byte skip.
  • Full SDK suite: 1468 passed, 8 skipped (was 1464 before rescope).

Checklist

  • Code follows style guidelines
  • Self-review completed
  • Comments added for non-obvious decisions (config-key discovery, 0-byte skip rationale)
  • Documentation updated (argus.example.yml, CLI reference)
  • No new warnings
  • Tests added/updated
  • All tests passing locally
  • No merge conflicts

…ner output

Two related fixes for the container-scan flow, addressing user
reports that:

1. ``argus view`` doesn't display container vulnerabilities — the
   container-scan flow only wrote a domain-shaped
   ``container-scan.json`` (per-image counts, ``container_count``,
   etc.) which the viewers don't know how to render. The viewers
   consume the canonical ``argus-results.json`` shape produced by
   source scans.
2. The ``argus-results/`` dir doesn't preserve the raw per-scanner
   output files (``trivy-results.json``, ``grype-results.json``,
   ``syft-sbom.json``) — they live in a tempdir that gets wiped at
   the end of ``scan_image``. Users who want forensics, audit
   trails, or manual triage have nowhere to look.

Both rooted in the same architectural drift: the container flow
diverged from the source-scan output contract. This PR re-aligns it.

Canonical ScanSummary for container scans
- ``_cmd_container_scan`` now builds a canonical ``ScanSummary``
  alongside the existing ``ContainerScanSummary``: each container
  target maps to ``ScanResult(scanner=f"container/<name>",
  findings=combined, metadata={image_ref, build_success,
  scanner_errors, scan_error})``.
- The JSON reporter writes that to ``argus-results.json``
  unconditionally (matches the source-scan canonical-artifact
  contract from PR #111).
- The SARIF reporter now consumes the same canonical summary
  instead of building a one-off conversion locally.
- The domain-shaped ``container-scan.json`` (with ``container_count``,
  per-image stats) is preserved for backward compat with downstream
  tooling that consumes it; it just lives alongside the canonical
  artifact rather than instead of it.
- ``argus view`` opens container scan results without any new code
  on the viewer side — it just sees ``ScanResult`` rows named
  ``container/<image>`` and renders them like any other scanner.

Raw scanner output persistence
- ``scan_image`` gains a ``raw_output_dir: Path | None`` parameter.
  When set, copies ``trivy-results.json``, ``grype-results.json``,
  and ``syft-sbom.json`` into that directory before the tempdir is
  cleaned up. Best-effort — copy errors log a warning but don't
  fail the scan.
- ``ContainerEngine`` reads ``_raw_output_root`` from its config
  dict (the dispatcher sets this) and threads a per-target subdir
  to ``scan_image`` as ``<root>/<target.name>/``.
- ``_cmd_container_scan`` defaults to ON: raw outputs land at
  ``<output_dir>/raw/<image>/``. Opt out via
  ``--no-keep-raw`` flag or ``containers.keep_raw: false`` in
  argus.yml. CLI flag wins on conflict (explicit > implicit).
- 0-byte files are explicitly skipped (they're failure signals
  upstream; persisting them would make a known-bad output look
  authoritative on disk).

Documentation
- ``argus.example.yml`` documents ``containers.keep_raw: true`` in
  the commented schema block, alongside the existing ``images``,
  ``discover``, and ``scanners`` keys.

Tests (+5)
- ``TestScanImageRawOutputPersistence`` (4 cases): all artifacts
  copied when dir supplied, no copy when ``raw_output_dir=None``,
  partial coverage (only trivy ran) doesn't block the others, 0-byte
  files are explicitly skipped.
- ``TestContainerCanonicalScanSummary`` (1 case): each
  ContainerScanResult maps to a canonical ScanResult(scanner=
  "container/<name>") with combined findings; metadata lifts onto
  the ScanResult; round-trips through ``ScanSummary.to_dict()``
  unchanged so the viewer gets the same shape it expects.

Validation
- Full SDK suite: 1464 passed (+5 net), 8 skipped.
@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 91.47727% with 15 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| argus/cli.py | 35.29% | 11 Missing ⚠️ |
| argus/container/scanner.py | 80.00% | 2 Missing ⚠️ |
| argus/core/engine.py | 81.81% | 2 Missing ⚠️ |

@github-actions
Contributor

github-actions Bot commented May 5, 2026

🔒 Argus Container Security Scan

Branch: feat/container-output-completion
Commit: 4947fc1

📊 Combined Findings Summary

| 🚨 Critical | ⚠️ High | 🟡 Medium | 🔵 Low | 📦 Total | 🔢 Unique |
|---|---|---|---|---|---|
| 1 | 22 | 61 | 64 | 148 | 148 |

Scanned: 4 containers | Build Failures: 0

📦 Container Breakdown

| Container | Image | 🚨 Crit | ⚠️ High | 🟡 Med | 🔵 Low | Total | Unique | Status |
|---|---|---|---|---|---|---|---|---|
| cli | ghcr.io/huntridge-labs/argus/cli:4947fc12df83037a92d2e4eafe8b31d0ad0d5452 | 1 | 11 | 15 | 1 | 28 | 28 | |
| scanner-bandit | ghcr.io/huntridge-labs/argus/scanner-bandit:4947fc12df83037a92d2e4eafe8b31d0ad0d5452 | 0 | 0 | 1 | 0 | 1 | 1 | |
| scanner-opengrep | ghcr.io/huntridge-labs/argus/scanner-opengrep:4947fc12df83037a92d2e4eafe8b31d0ad0d5452 | 0 | 7 | 41 | 63 | 111 | 111 | |
| scanner-supply-chain | ghcr.io/huntridge-labs/argus/scanner-supply-chain:4947fc12df83037a92d2e4eafe8b31d0ad0d5452 | 0 | 4 | 4 | 0 | 8 | 8 | |

🔍 Detailed Findings by Container

🚨 cli - 28 vulnerabilities (22 unique)

Image: ghcr.io/huntridge-labs/argus/cli:4947fc12df83037a92d2e4eafe8b31d0ad0d5452

Combined (Deduplicated)

| 🚨 Critical | ⚠️ High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 1 | 11 | 15 | 1 | 28 | 22 |

🔷 Trivy Scanner (28 findings, 22 unique)
CVE Severity Package Version Fixed
CVE-2025-68121 🚨 CRITICAL stdlib v1.24.11 1.24.13, 1.25.7, 1.26.0-rc.3
CVE-2026-32280 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32281 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32283 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-33810 ⚠️ HIGH stdlib v1.26.1 1.26.2
CVE-2025-61726 ⚠️ HIGH stdlib v1.24.11 1.24.12, 1.25.6
CVE-2025-61728 ⚠️ HIGH stdlib v1.24.11 1.24.12, 1.25.6
CVE-2026-25679 ⚠️ HIGH stdlib v1.24.11 1.25.8, 1.26.1
CVE-2026-32280 ⚠️ HIGH stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-32281 ⚠️ HIGH stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-32283 ⚠️ HIGH stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-34040 ⚠️ HIGH github.com/docker/docker v28.5.2+incompatible 29.3.1
CVE-2026-3219 🟡 MEDIUM pip 26.0.1 N/A
CVE-2026-32282 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32288 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32289 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
CVE-2025-11579 🟡 MEDIUM github.com/nwaples/rardecode/v2 v2.1.0 2.2.0
CVE-2025-58058 🟡 MEDIUM github.com/ulikunitz/xz v0.5.12 0.5.15
CVE-2025-47914 🟡 MEDIUM golang.org/x/crypto v0.35.0 0.45.0
CVE-2025-58181 🟡 MEDIUM golang.org/x/crypto v0.35.0 0.45.0
CVE-2025-61730 🟡 MEDIUM stdlib v1.24.11 1.24.12, 1.25.6
CVE-2026-27142 🟡 MEDIUM stdlib v1.24.11 1.25.8, 1.26.1
CVE-2026-32282 🟡 MEDIUM stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-32288 🟡 MEDIUM stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-32289 🟡 MEDIUM stdlib v1.24.11 1.25.9, 1.26.2
CVE-2026-33997 🟡 MEDIUM github.com/docker/docker v28.5.2+incompatible 29.3.1
CVE-2026-41506 🟡 MEDIUM github.com/go-git/go-git/v5 v5.17.2 5.18.0
CVE-2026-27139 🔵 LOW stdlib v1.24.11 1.25.8, 1.26.1
⚓ Grype Scanner (0 findings, 0 unique)

✅ No vulnerabilities detected by Grype

🟡 scanner-bandit - 1 vulnerabilities (1 unique)

Image: ghcr.io/huntridge-labs/argus/scanner-bandit:4947fc12df83037a92d2e4eafe8b31d0ad0d5452

Combined (Deduplicated)

| 🚨 Critical | ⚠️ High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 1 |

🔷 Trivy Scanner (1 findings, 1 unique)
CVE Severity Package Version Fixed
CVE-2026-3219 🟡 MEDIUM pip 26.0.1 N/A
⚓ Grype Scanner (0 findings, 0 unique)

✅ No vulnerabilities detected by Grype

⚠️ scanner-opengrep - 113 vulnerabilities (49 unique)

Image: ghcr.io/huntridge-labs/argus/scanner-opengrep:4947fc12df83037a92d2e4eafe8b31d0ad0d5452

Combined (Deduplicated)

| 🚨 Critical | ⚠️ High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 0 | 7 | 41 | 63 | 113 | 49 |

🔷 Trivy Scanner (113 findings, 48 unique)
CVE Severity Package Version Fixed
CVE-2026-4878 ⚠️ HIGH libcap2 1:2.75-10+b8 N/A
CVE-2025-69720 ⚠️ HIGH libncursesw6 6.5+20250216-2 N/A
CVE-2026-29111 ⚠️ HIGH libsystemd0 257.9-1~deb13u1 N/A
CVE-2025-69720 ⚠️ HIGH libtinfo6 6.5+20250216-2 N/A
CVE-2026-29111 ⚠️ HIGH libudev1 257.9-1~deb13u1 N/A
CVE-2025-69720 ⚠️ HIGH ncurses-base 6.5+20250216-2 N/A
CVE-2025-69720 ⚠️ HIGH ncurses-bin 6.5+20250216-2 N/A
CVE-2026-27456 🟡 MEDIUM bsdutils 1:2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM bsdutils 1:2.41-5 N/A
CVE-2026-27456 🟡 MEDIUM libblkid1 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM libblkid1 2.41-5 N/A
CVE-2026-4046 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-4437 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-4438 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-5435 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-5450 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-5928 🟡 MEDIUM libc-bin 2.41-12+deb13u2 N/A
CVE-2026-4046 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-4437 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-4438 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-5435 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-5450 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-5928 🟡 MEDIUM libc6 2.41-12+deb13u2 N/A
CVE-2026-27456 🟡 MEDIUM liblastlog2-2 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM liblastlog2-2 2.41-5 N/A
CVE-2026-34743 🟡 MEDIUM liblzma5 5.8.1-1 N/A
CVE-2026-27456 🟡 MEDIUM libmount1 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM libmount1 2.41-5 N/A
CVE-2026-27456 🟡 MEDIUM libsmartcols1 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM libsmartcols1 2.41-5 N/A
CVE-2026-40225 🟡 MEDIUM libsystemd0 257.9-1~deb13u1 N/A
CVE-2026-40226 🟡 MEDIUM libsystemd0 257.9-1~deb13u1 N/A
CVE-2026-4105 🟡 MEDIUM libsystemd0 257.9-1~deb13u1 N/A
CVE-2026-40225 🟡 MEDIUM libudev1 257.9-1~deb13u1 N/A
CVE-2026-40226 🟡 MEDIUM libudev1 257.9-1~deb13u1 N/A
CVE-2026-4105 🟡 MEDIUM libudev1 257.9-1~deb13u1 N/A
CVE-2026-27456 🟡 MEDIUM libuuid1 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM libuuid1 2.41-5 N/A
CVE-2026-27456 🟡 MEDIUM login 1:4.16.0-2+really2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM login 1:4.16.0-2+really2.41-5 N/A
CVE-2026-27456 🟡 MEDIUM mount 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM mount 2.41-5 N/A
CVE-2026-5958 🟡 MEDIUM sed 4.9-2 N/A
CVE-2026-5704 🟡 MEDIUM tar 1.35+dfsg-3.1 N/A
CVE-2026-27456 🟡 MEDIUM util-linux 2.41-5 N/A
CVE-2026-3184 🟡 MEDIUM util-linux 2.41-5 N/A
CVE-2026-27171 🟡 MEDIUM zlib1g 1:1.3.dfsg+really1.3.1-1+b1 N/A
CVE-2026-3219 🟡 MEDIUM pip 26.0.1 N/A
CVE-2011-3374 🔵 LOW apt 3.0.3 N/A
TEMP-0841856-B18BAF 🔵 LOW bash 5.2.37-2+b8 N/A

...and 63 more

⚓ Grype Scanner (0 findings, 0 unique)

✅ No vulnerabilities detected by Grype

⚠️ scanner-supply-chain - 8 vulnerabilities (8 unique)

Image: ghcr.io/huntridge-labs/argus/scanner-supply-chain:4947fc12df83037a92d2e4eafe8b31d0ad0d5452

Combined (Deduplicated)

| 🚨 Critical | ⚠️ High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 0 | 4 | 4 | 0 | 8 | 8 |

🔷 Trivy Scanner (8 findings, 8 unique)
CVE Severity Package Version Fixed
CVE-2026-32280 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32281 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32283 ⚠️ HIGH stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-33810 ⚠️ HIGH stdlib v1.26.1 1.26.2
CVE-2026-3219 🟡 MEDIUM pip 26.0.1 N/A
CVE-2026-32282 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32288 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
CVE-2026-32289 🟡 MEDIUM stdlib v1.26.1 1.25.9, 1.26.2
⚓ Grype Scanner (0 findings, 0 unique)

✅ No vulnerabilities detected by Grype


Generated by Argus

Extend the raw-output preservation already in place for container
scans to cover source scans. ArgusEngine.run() now accepts
raw_output_dir and copies each scanner's results.json / *.sarif /
stdout.txt under <output_dir>/raw/<scanner>/ alongside the
canonical argus-results.json — the same posture as the container
flow, so forensics and manual triage have the same surface area
regardless of which scan path produced the findings.

The CLI gains a unified --no-keep-raw flag (moved out of the
container-only group) and reporting.keep_raw replaces the
container-scoped containers.keep_raw key. CLI flag wins on
conflict; default remains keep-raw=true.
@eFAILution eFAILution changed the title from "feat(container): emit canonical argus-results.json + persist raw scanner output" to "feat(scan): emit canonical argus-results.json + persist raw scanner output (source + container)" May 5, 2026
@eFAILution eFAILution merged commit 8d184d6 into feat/argus-portability May 5, 2026
21 checks passed
@eFAILution eFAILution deleted the feat/container-output-completion branch May 5, 2026 21:58
eFAILution added a commit that referenced this pull request May 5, 2026
Six rolled-up updates to keep .ai/ and the scanner docs accurate against the current SDK + CLI architecture:

1. workflows.yaml — rewrite add_new_scanner SDK-first. New steps:
   create argus/scanners/{name}.py implementing the Scanner protocol,
   register in SCANNER_REGISTRY, add Dockerfile only when no upstream
   image exists (ADR-014), add tests including the secret-leak audit
   regression test, verify with `argus scan {name}`. Composite-action
   wrapper is now an optional follow-on, mirroring the add_new_linter
   shape and CONTRIBUTING.md sequence. Refresh local_scanner_test to
   use `argus scan` instead of the old per-scanner parse-results.py
   pipe.

2. context.yaml — bump version 0.7.0 → 0.7.2. Lead with SDK in
   one_liner. Expand entrypoints with the full CLI surface (init,
   list, validate, view, report, completion, mcp, cache) and viewer
   extras. Add Scanner Protocol, SCANNER_REGISTRY, Reporter,
   ScanSummary, MCP glossary entries.

3. architecture.yaml — bump version. Replace the wrong "stdlib +
   pyyaml only" dependencies line with the actual dep set (pyyaml,
   click, jsonschema, rich + optional textual/fastapi/mcp extras).
   Refresh data_flow with the source-scan, container-scan, MCP, and
   thin-composite pipelines, and document the canonical
   argus-results.json contract + container/<name> ScanResult naming.
   Add the new CLI subcommands.

4. decisions.yaml — flip ADR-013 (SDK + CLI) and ADR-015 (agentic
   substrate) from proposed → accepted; both shipped months ago. Add
   ADR-018 capturing the canonical argus-results.json artifact
   contract, container/<name> scanner naming convention, and unified
   reporting.keep_raw / --no-keep-raw raw-output preservation from
   PRs #111 and #116.

5. errors.yaml — drop the hardcoded scanner enum (incomplete; missed
   supply-chain and lint-*). Point at `argus list` and tab completion
   as the live source of truth. Update the GHES alternative to
   `pip install argus-security && argus scan` (was the old
   pyyaml-only install).

6. docs/scanners.md — replace `pip install pyyaml` /
   `python -m argus scan` with `pip install argus-security` /
   `argus scan` to match current SDK install instructions.