refactor(scanners): unify tool_version + scan + build_args into a single SDK pattern #117
Each scanner previously rolled its own ~17-line tool_version() that ran <tool> --version, swallowed all exceptions, and parsed the output with ad-hoc string slicing. The parsers had drifted (some checked first line, some last; some stripped a leading v, some didn't), the exception lists had drifted (except (TimeoutExpired, FileNotFoundError, Exception), where Exception already covers the others), and the timeouts had drifted between 5s and 10s for no clear reason.

Extract argus.core.version.parse_tool_version(cmd, regex, *, group=1, timeout=5.0) which:
- Runs the subprocess and narrowly handles missing-binary, timeout, and OSError (returning None). ADR-016: bugs in subprocess.run itself surface, not silently translate to None.
- Applies the regex with re.MULTILINE so anchors like ^ behave predictably across multi-line tool banners.
- Falls back to stderr when stdout is empty (some Java tools).

Each scanner's tool_version() shrinks to a 1-line return:

    return parse_tool_version(["bandit", "--version"], r"^bandit (\S+)")

Refactored: bandit, clamav, gitleaks, opengrep, trivy, trivy_iac, osv, supply_chain, checkov, hadolint. Grype keeps custom JSON parsing (grype version -o json doesn't fit a single regex) but tightens its exception clause from `except (TimeoutExpired, JSONDecodeError, Exception)` to the narrowly-named exception types.

Documented in CONTRIBUTING.md (Adding a Scanner via the SDK) and .ai/workflows.yaml (add_new_scanner) so the next contributor, human or AI, sees the helper as the documented pattern, not just a thing they can grep for.

Drive-by simplifications discovered during the same complexity review:
- argus.core.models.Severity: replace 4 hand-rolled comparison methods with @functools.total_ordering + __lt__ (-16 lines).
- argus.init._check_local_readiness: narrow `except Exception` to `except ImportError`. The broad catch hid bugs in the readiness logic itself behind a generic "no readiness shown" display; per ADR-016 those should surface. Test updated to assert the new loud-on-real-bug posture.
- argus.init.run_init banner: drop the per-line time.sleep scroll effect (~1-2s of cosmetic delay on every interactive `argus init`).
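For readers who want the shape of the helper without opening the diff, here is a minimal sketch of what a `parse_tool_version` with the behaviour described above could look like. It is not the code in `argus.core.version`; the body is an assumption reconstructed from the commit message (narrow exception handling, `re.MULTILINE`, stderr fallback), and the usage line queries the running Python interpreter only because that binary is guaranteed to exist.

```python
import re
import subprocess
import sys


def parse_tool_version(cmd, regex, *, group=1, timeout=5.0):
    """Run ``cmd`` and return the regex capture, or None if the tool is unusable.

    Sketch only: signature taken from the commit message, body assumed.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # Missing binary (FileNotFoundError is an OSError), hang, or OS-level
        # failure: the expected categories. Anything else propagates.
        return None

    # Some tools print their banner on stderr; use it when stdout is empty.
    text = proc.stdout if proc.stdout.strip() else proc.stderr

    # Accept either a pattern string or an already-compiled pattern.
    pattern = regex if isinstance(regex, re.Pattern) else re.compile(regex, re.MULTILINE)
    match = pattern.search(text)
    return match.group(group) if match else None


# Usage: ask the running interpreter for its own version.
python_version = parse_tool_version([sys.executable, "--version"], r"^Python (\S+)")
missing = parse_tool_version(["argus-sketch-no-such-binary", "--version"], r"(\S+)")
```

Keeping the `except` clause to two named types is the point of the change: a typo inside the helper raises immediately instead of turning into a silent `None`.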
🔒 Argus Container Security Scan
Branch:
📊 Combined Findings Summary
Scanned: 4 containers | Build Failures: 0
📦 Container Breakdown
🔍 Detailed Findings by Container
🚨 cli - 28 vulnerabilities (22 unique)
Image:
Combined (Deduplicated)
🔷 Trivy Scanner (28 findings, 22 unique)
⚓ Grype Scanner (0 findings, 0 unique)
✅ No vulnerabilities detected by Grype
🟡 scanner-bandit - 1 vulnerabilities (1 unique)
Image:
Combined (Deduplicated)
🔷 Trivy Scanner (1 findings, 1 unique)
⚓ Grype Scanner (0 findings, 0 unique)
✅ No vulnerabilities detected by Grype
| 🚨 Critical | High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 0 | 7 | 41 | 63 | 113 | 49 |
🔷 Trivy Scanner (113 findings, 48 unique)
| CVE | Severity | Package | Version | Fixed |
|---|---|---|---|---|
| CVE-2026-4878 | HIGH | libcap2 | 1:2.75-10+b8 | N/A |
| CVE-2025-69720 | HIGH | libncursesw6 | 6.5+20250216-2 | N/A |
| CVE-2026-29111 | HIGH | libsystemd0 | 257.9-1~deb13u1 | N/A |
| CVE-2025-69720 | HIGH | libtinfo6 | 6.5+20250216-2 | N/A |
| CVE-2026-29111 | HIGH | libudev1 | 257.9-1~deb13u1 | N/A |
| CVE-2025-69720 | HIGH | ncurses-base | 6.5+20250216-2 | N/A |
| CVE-2025-69720 | HIGH | ncurses-bin | 6.5+20250216-2 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | bsdutils | 1:2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | bsdutils | 1:2.41-5 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | libblkid1 | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | libblkid1 | 2.41-5 | N/A |
| CVE-2026-4046 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-4437 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-4438 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-5435 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-5450 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-5928 | 🟡 MEDIUM | libc-bin | 2.41-12+deb13u2 | N/A |
| CVE-2026-4046 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-4437 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-4438 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-5435 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-5450 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-5928 | 🟡 MEDIUM | libc6 | 2.41-12+deb13u2 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | liblastlog2-2 | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | liblastlog2-2 | 2.41-5 | N/A |
| CVE-2026-34743 | 🟡 MEDIUM | liblzma5 | 5.8.1-1 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | libmount1 | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | libmount1 | 2.41-5 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | libsmartcols1 | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | libsmartcols1 | 2.41-5 | N/A |
| CVE-2026-40225 | 🟡 MEDIUM | libsystemd0 | 257.9-1~deb13u1 | N/A |
| CVE-2026-40226 | 🟡 MEDIUM | libsystemd0 | 257.9-1~deb13u1 | N/A |
| CVE-2026-4105 | 🟡 MEDIUM | libsystemd0 | 257.9-1~deb13u1 | N/A |
| CVE-2026-40225 | 🟡 MEDIUM | libudev1 | 257.9-1~deb13u1 | N/A |
| CVE-2026-40226 | 🟡 MEDIUM | libudev1 | 257.9-1~deb13u1 | N/A |
| CVE-2026-4105 | 🟡 MEDIUM | libudev1 | 257.9-1~deb13u1 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | libuuid1 | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | libuuid1 | 2.41-5 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | login | 1:4.16.0-2+really2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | login | 1:4.16.0-2+really2.41-5 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | mount | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | mount | 2.41-5 | N/A |
| CVE-2026-5958 | 🟡 MEDIUM | sed | 4.9-2 | N/A |
| CVE-2026-5704 | 🟡 MEDIUM | tar | 1.35+dfsg-3.1 | N/A |
| CVE-2026-27456 | 🟡 MEDIUM | util-linux | 2.41-5 | N/A |
| CVE-2026-3184 | 🟡 MEDIUM | util-linux | 2.41-5 | N/A |
| CVE-2026-27171 | 🟡 MEDIUM | zlib1g | 1:1.3.dfsg+really1.3.1-1+b1 | N/A |
| CVE-2026-3219 | 🟡 MEDIUM | pip | 26.0.1 | N/A |
| CVE-2011-3374 | 🔵 LOW | apt | 3.0.3 | N/A |
| TEMP-0841856-B18BAF | 🔵 LOW | bash | 5.2.37-2+b8 | N/A |
...and 63 more
⚓ Grype Scanner (0 findings, 0 unique)
✅ No vulnerabilities detected by Grype
⚠️ scanner-supply-chain - 8 vulnerabilities (8 unique)
Image: ghcr.io/huntridge-labs/argus/scanner-supply-chain:3836b25471d0ebfe92f71965214c9efa72e67297
Combined (Deduplicated)
| 🚨 Critical | High | 🟡 Medium | 🔵 Low | Total | Unique |
|---|---|---|---|---|---|
| 0 | 4 | 4 | 0 | 8 | 8 |
🔷 Trivy Scanner (8 findings, 8 unique)
| CVE | Severity | Package | Version | Fixed |
|---|---|---|---|---|
| CVE-2026-32280 | HIGH | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
| CVE-2026-32281 | HIGH | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
| CVE-2026-32283 | HIGH | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
| CVE-2026-33810 | HIGH | stdlib | v1.26.1 | 1.26.2 |
| CVE-2026-3219 | 🟡 MEDIUM | pip | 26.0.1 | N/A |
| CVE-2026-32282 | 🟡 MEDIUM | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
| CVE-2026-32288 | 🟡 MEDIUM | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
| CVE-2026-32289 | 🟡 MEDIUM | stdlib | v1.26.1 | 1.25.9, 1.26.2 |
⚓ Grype Scanner (0 findings, 0 unique)
✅ No vulnerabilities detected by Grype
Generated by Argus
Establish the second half of the SDK scanner pattern. Each scanner
previously rolled three near-identical, drift-prone shapes:
1. ``scan()`` — 30-line tempdir + subprocess + returncode-check +
output-file-check + parse + ScanResult-build boilerplate.
2. ``_build_command(path, output, config)`` — local-execution argv.
3. ``container_args(config)`` — container-execution argv, hand-mirrored
from _build_command.
The two argv methods had drifted measurably — OSV's _build_command was
on the v1 ``--sbom`` flag while container_args used v2 ``-L``;
opengrep's local path used ``--output`` while the container path
emitted to ``--output-file``; bandit's exit-code semantics differed.
The dual structure made every drift invisible in code review.
This commit adds two pieces of scaffolding:
* ``argus.core.scanner_template.ScanPaths`` — frozen dataclass with
``workspace`` and ``output``. Local execution sets host paths;
container execution sets ``/workspace`` and ``/output/<file>``. The
scanner doesn't care which.
* ``argus.core.scanner_template.run_subprocess_scan(scanner, path,
config)`` — runs ``scanner.build_args(paths, config)`` in a tempdir,
reads the output file, builds a ScanResult. Falls back to stdout
capture when no output file is produced. Narrowly handles missing
binaries / timeouts / OSError per ADR-016 — bugs in
parse_results propagate.
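A rough sketch of the two pieces of scaffolding, to make the control flow concrete. This is an assumption-laden approximation, not the module in the diff: the real helper builds a `ScanResult`, whereas this version returns a plain dict, and `EchoScanner` is a hypothetical stand-in that uses the Python interpreter as its "tool" so the example runs anywhere.

```python
import json
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ScanPaths:
    workspace: str  # directory to scan (host path locally, /workspace in a container)
    output: str     # file the tool should write its report to


def run_subprocess_scan(scanner, path, config, *, timeout=60.0):
    """Run scanner.build_args() in a tempdir and parse whatever it produced."""
    with tempfile.TemporaryDirectory() as tmp:
        out_file = Path(tmp) / "results.json"
        paths = ScanPaths(workspace=str(path), output=str(out_file))
        try:
            proc = subprocess.run(scanner.build_args(paths, config),
                                  capture_output=True, text=True, timeout=timeout)
        except (subprocess.TimeoutExpired, OSError) as exc:
            # Expected failure categories become error metadata.
            return {"findings": [], "error": f"{type(exc).__name__}: {exc}"}
        # Prefer the output file; fall back to captured stdout.
        raw = out_file.read_text() if out_file.exists() else proc.stdout
    if not raw.strip():
        return {"findings": [], "error": "scanner produced no output"}
    # parse_results sits outside the try block on purpose: its bugs propagate.
    return {"findings": scanner.parse_results(raw), "error": None}


class EchoScanner:
    """Hypothetical scanner: a Python one-liner that writes one JSON finding."""

    def build_args(self, paths, config):
        code = ("import json, sys; "
                "json.dump([{'id': 'DEMO', 'path': sys.argv[1]}], open(sys.argv[2], 'w'))")
        return [sys.executable, "-c", code, paths.workspace, paths.output]

    def parse_results(self, raw):
        return json.loads(raw)


result = run_subprocess_scan(EchoScanner(), ".", {})
```

Because the scanner only ever sees a `ScanPaths`, the same `build_args` works whether the paths point at a host tempdir or at container mount points.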
The scanner protocol grows one method:
    def build_args(self, paths: ScanPaths, config: dict) -> list[str]:
        return ["my-tool", "scan", paths.workspace,
                "--format", "json",
                "--output", paths.output]
…which replaces the parallel _build_command + container_args pair.
Single source of truth for argv. The engine drops argv[0] when the
class declares ``container_entrypoint = "<bin>"`` (the image has
``ENTRYPOINT``), so the same method works for both execution paths.
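To illustrate the single-source-of-truth claim, the sketch below shows one `build_args` feeding both execution paths. `container_argv` is a hypothetical name for the engine-side step; the `/workspace` and `/output/results.json` mount points come from the commit message, and the host paths are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanPaths:
    workspace: str
    output: str


class MyToolScanner:
    # The image's ENTRYPOINT already names the binary, so the engine must
    # not repeat it in the container command.
    container_entrypoint = "my-tool"

    def build_args(self, paths, config):
        return ["my-tool", "scan", paths.workspace,
                "--format", "json",
                "--output", paths.output]


def container_argv(scanner, config):
    """Hypothetical engine step: the argv to pass after the image name."""
    paths = ScanPaths(workspace="/workspace", output="/output/results.json")
    argv = scanner.build_args(paths, config)
    if getattr(scanner, "container_entrypoint", None):
        return argv[1:]  # drop argv[0]; ENTRYPOINT supplies it
    return argv


scanner = MyToolScanner()
local_argv = scanner.build_args(ScanPaths("/home/me/repo", "/tmp/scan/results.json"), {})
in_container = container_argv(scanner, {})
```

The two argv lists differ only in their paths and the leading binary name, which is what removes the room for flag drift between local and container runs.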
Migrated to the new pattern: bandit, gitleaks, opengrep, trivy_iac,
checkov, osv. Each scanner's body shrinks by ~100 lines.
Engine ``_run_in_container`` builds a ScanPaths and calls
``scanner.build_args(paths, config)`` when present, falling back to
the legacy ``container_args(config)`` method only for scanners not
yet migrated (clamav, container, zap, trivy, grype, supply_chain).
The legacy branch will be dropped in a follow-up once the remaining
scanners — which all have structurally-different flows (text output,
multi-tool orchestration, Docker-only) — are dealt with on a
case-by-case basis.
Documentation updated:
* CONTRIBUTING.md "Adding a Scanner via the SDK" rewritten with the
new ~50-line example scanner module and an updated protocol table.
Explicitly calls out when to skip the template (grype, clamav,
supply_chain shapes).
* .ai/workflows.yaml::add_new_scanner gains a ``scan_template`` and
``build_args`` entry under the helpers block so the next
contributor — human or AI — sees both helpers as the documented
pattern.
Side fix: OSV's local path now uses osv-scanner v2 syntax (``-L``
plus ``--output-file``) consistently with the container path, instead
of the deprecated v1 ``--sbom`` and ``--output`` flags it had drifted
to.
Tests: 1497 passed (was 1495 before this commit; +14 template-helper
tests, -2 from removing dead-method tests, +0 from net OSV migration).
Total net: -193 lines across 12 files.
Add a ``containers:`` block listing argus's own four images
(scanner-bandit, scanner-opengrep, scanner-supply-chain, cli) with
their Dockerfile paths so ``argus scan container --config argus.yml``
runs the full trivy + grype + syft sweep without needing CLI flags.
Validated end-to-end against the local checkout:
- argus scan --config argus.yml: 6 scanners, 163 findings (mostly LOW
B404/B603 subprocess warnings, by design for a security tool that
wraps subprocess CLIs).
- argus scan container --config argus.yml: 4 images built locally,
230s wall, 141 total findings (98 unique).
Sets up the .github/workflows/build-containers.yml simplification
follow-up where the scan job's hand-rolled trivy-action + grype-action
+ inline Python glue (~80 lines) collapses to a single
``argus scan container --config argus.yml`` invocation, with argus.yml
as the single source of truth for the image list.
Scanners with custom ``scan()`` flows that don't fit the standard ``build_args(ScanPaths) -> list[str]`` contract (linters that walk the workspace and invoke their tool per file) used to AttributeError inside ``_run_in_container`` when the engine routed them through the container path. Combined with PR #117's silent-drop loophole, that made ``lint-dockerfile`` disappear from canonical results entirely when hadolint was not installed locally.

Engine change: in ``_run_scanner``, the auto/docker branch now checks for ``build_args`` or ``container_args`` before entering ``_run_in_container``. When neither is present:
* backend=auto: log a debug message and fall through to the local path (which calls ``scanner.scan(path, config)`` directly).
* backend=docker: raise a clear RuntimeError naming the constraint ("scanner has container_image but no build_args/container_args method") so users know to either implement build_args or relax the backend.

HadolintLinter cleanup: collapse the per-Dockerfile subprocess loop into a single ``hadolint --format json file1 file2 ...`` invocation. Hadolint accepts multiple paths and emits one combined JSON array with each finding's source ``file`` field intact, so the parser still produces correct ``location`` strings without threading the path back through the caller. Drops one process spawn per Dockerfile on every scan.

Roadmap: docs/developer/SDK-ROADMAP.md adds a FileDiscoveryScanner template entry under Known Issues. The engine fallback gives every linter a working escape hatch today, but the real abstraction would be a base class that handles workspace walks, file globbing, container vs local routing, and batched invocation centrally so the six existing linters stop reimplementing it. Deferred until a second linter contributor copy-pastes the boilerplate; trigger to revisit documented inline.

Two new regression tests in ``TestDockerExecutionBackend``: test_auto_backend_defers_to_scan_when_no_build_args (the lint-dockerfile fix path) and test_docker_backend_rejects_scanner_without_build_args (the loud error when the user opted into container-only). 1519 SDK tests pass.
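The engine change described in this commit boils down to a small decision table. The sketch below is a simplified stand-in for the check in ``_run_scanner``: ``choose_execution`` is a hypothetical name, the scanner is assumed to declare a ``container_image``, and the ``"local"`` backend value is an assumption (the commit only names ``auto`` and ``docker``).

```python
import logging

log = logging.getLogger("argus.sketch")


def choose_execution(scanner, backend):
    """Return "container" or "local" for a scanner that declares an image.

    Simplified sketch of the routing rule; not the engine's actual code.
    """
    if backend == "local":  # assumed third backend value
        return "local"
    has_argv = hasattr(scanner, "build_args") or hasattr(scanner, "container_args")
    if has_argv:
        return "container"
    if backend == "docker":
        # The user asked for container-only, so fail loudly.
        raise RuntimeError(
            "scanner has container_image but no build_args/container_args method")
    # backend == "auto": the scanner owns its own dispatch via scan().
    log.debug("no build_args/container_args; deferring to scanner.scan()")
    return "local"


class WalkingLinter:
    """Hypothetical linter with a custom scan() and no argv builder."""
    container_image = "hadolint/hadolint:v2.14.0"

    def scan(self, path, config):
        return []


class TemplateScanner:
    container_image = "example.invalid/my-tool:1.0"  # hypothetical image

    def build_args(self, paths, config):
        return ["my-tool"]


auto_choice = choose_execution(WalkingLinter(), "auto")
template_choice = choose_execution(TemplateScanner(), "auto")
```

The two regression tests named above correspond to the ``auto`` fall-through and the ``docker`` error branch.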
* fix(engine): defer to scanner.scan when build_args is missing

Scanners with custom ``scan()`` flows that don't fit the standard ``build_args(ScanPaths) -> list[str]`` contract (linters that walk the workspace and invoke their tool per file) used to AttributeError inside ``_run_in_container`` when the engine routed them through the container path. Combined with PR #117's silent-drop loophole, that made ``lint-dockerfile`` disappear from canonical results entirely when hadolint was not installed locally.

Engine change: in ``_run_scanner``, the auto/docker branch now checks for ``build_args`` or ``container_args`` before entering ``_run_in_container``. When neither is present:
* backend=auto: log a debug message and fall through to the local path (which calls ``scanner.scan(path, config)`` directly).
* backend=docker: raise a clear RuntimeError naming the constraint ("scanner has container_image but no build_args/container_args method") so users know to either implement build_args or relax the backend.

HadolintLinter cleanup: collapse the per-Dockerfile subprocess loop into a single ``hadolint --format json file1 file2 ...`` invocation. Hadolint accepts multiple paths and emits one combined JSON array with each finding's source ``file`` field intact, so the parser still produces correct ``location`` strings without threading the path back through the caller. Drops one process spawn per Dockerfile on every scan.

Roadmap: docs/developer/SDK-ROADMAP.md adds a FileDiscoveryScanner template entry under Known Issues. The engine fallback gives every linter a working escape hatch today, but the real abstraction would be a base class that handles workspace walks, file globbing, container vs local routing, and batched invocation centrally so the six existing linters stop reimplementing it. Deferred until a second linter contributor copy-pastes the boilerplate; trigger to revisit documented inline.

Two new regression tests in ``TestDockerExecutionBackend``: test_auto_backend_defers_to_scan_when_no_build_args (the lint-dockerfile fix path) and test_docker_backend_rejects_scanner_without_build_args (the loud error when the user opted into container-only). 1519 SDK tests pass.

* feat(linter): make HadolintLinter docker-aware so lint-dockerfile works without local hadolint

The original framing in PR #120 was that ``lint-dockerfile`` requires the user to install hadolint locally OR wait for the FileDiscoveryScanner template. The user pushed back: argus has the official ``hadolint/hadolint:v2.14.0`` Docker image declared on the linter, so the engine should be using it instead of complaining the local binary is missing.

Two changes make that work:
1. Engine: in the auto/no-build_args defer path, hand off to ``scanner.scan(path, config)`` unconditionally instead of falling through to the is_available() gate. Scanners without build_args are signaling that they own dispatch internally, including the choice between local execution and the docker fallback. The is_available() gate was preventing scan() from ever being called when the local binary was absent, even though scan() could have handled it.
2. HadolintLinter.scan(): when ``self.is_available()`` returns False, construct a ``docker run`` command against ``self.container_image`` instead of trying to invoke ``hadolint`` directly. Workspace mounts read-only at /workspace; discovered Dockerfile paths get translated to their /workspace/... equivalents. Hadolint accepts multiple file paths in one invocation, so the batched-call shape from the prior commit carries through cleanly.

Bug along the way: the hadolint image has empty ENTRYPOINT and ``CMD = ["/bin/hadolint", "-"]``. Passing args at the end of ``docker run`` replaces CMD entirely, so the first arg becomes the command. Include the binary name explicitly as the first arg.

Verified end-to-end against this repo's checkout:

    $ argus scan lint-dockerfile --severity-threshold none
    INFO DL3018 /workspace/docker/Dockerfile.cli:19 - Pin versions...
    INFO DL4006 /workspace/docker/Dockerfile.cli:29 - Set the SHELL option...
    ...
    11 findings across 3 Dockerfiles
    Status: PASS

Real lint findings flowing through, no local install required.

Doesn't ship a unit test for the docker subprocess path because mocking ``shutil.which("docker")`` plus the ``docker run`` invocation reliably across pytest runs requires more plumbing than the value justifies for a 25-line method that's verified end-to-end above. The test_auto_backend_defers_to_scan_when_no_build_args test from this PR's prior commit covers the engine handoff.

* feat(linters): bundle Python tools, add docker fallback for terraform, swap jshint for eslint

Three coupled changes that close the "linter requires local install" gap surfaced during the lint-dockerfile validation.
1. Bundle yamllint and flake8 as core dependencies. They are pure Python packages, argus is itself Python, so a clean ``pip install argus-security`` now leaves both linters runnable with no extra setup. Linting is core scan capability rather than an optional extra. Native-binary linters (hadolint, terraform, tflint) keep their container-fallback path instead.
2. Add docker fallback to TerraformLinter, mirroring the hadolint shape from earlier in this PR. terraform fmt + validate run via the official ``hashicorp/terraform:1.9.8`` image; tflint via ``ghcr.io/terraform-linters/tflint:v0.55.1`` (its official image). Workspace mounts read-write because terraform init writes ``.terraform/`` plugin state that validate then reads. Drive-by: tool_version() switches to ``parse_tool_version`` for consistency with the rest of the SDK.
3. Replace jshint with eslint. jshint is niche, last meaningful release was 2022, and there is no trusted official container; eslint is the de-facto JavaScript / TypeScript linter, has active maintenance, and the ``pipelinecomponents/eslint`` image keeps argus out of the business of maintaining a Node container. The new EslintLinter handles the no-config case gracefully (info row, not a failure) since a project without an eslint.config.js isn't an error worth crashing on. Also extends the language list to ``javascript, typescript`` and adds ESLint config detection to argus init's project-signal walk.

Container manifest grew three entries (``terraform``, ``tflint``, ``eslint``) pinned to specific versions so Renovate keeps them current the same way it tracks the existing tool images. Removed ``argus/linters/jshint.py`` and its registry slot; ``lint-javascript`` now resolves to ``EslintLinter``.

Tests: 1519 passing. End-to-end verified locally: argus scan lint-yaml runs cleanly via the bundled yamllint package without any "produced no output" warning.

* fix(containers): pin pipelinecomponents/eslint to :latest, no semver tags exist

CI's manifest check caught that pipelinecomponents/eslint:0.20.0 doesn't exist; the upstream image tags are commit-SHA based (amd64-6df2a47, arm64-2b4c228, etc.) plus ``:latest`` and ``:edge``, no semver. Pin ``:latest`` and rely on the renovate.yaml ``pinDigests: true`` rule to append an immutable ``@sha256:...`` digest on the first Renovate run. Renovate will then bump the digest on the same 7-day stability lag it applies to every other tracked image.

Verified locally: ``python -m scripts.ci.check_container_images`` now reports "All images resolve" with the new entry.

* test(eslint): cover EslintLinter scan() and helpers

Adds 36 unit tests for the new EslintLinter, covering metadata, config detection across every recognized eslint config filename plus package.json eslintConfig, local + docker command construction, JSON message parsing across severity mappings, and the scan() flow under each branching condition (no config, local eslint, docker fallback, neither available, JSON parse error, tool-error exit code, clean empty-output case). Brings codecov/patch coverage above the threshold for PR #120.

---------

Co-authored-by: eFAILution <eFAILution@users.noreply.github.com>
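The hadolint docker fallback from the second commit in this squash is mostly path translation plus one batched command. A minimal sketch follows, under stated assumptions: ``hadolint_docker_cmd`` and ``finding_locations`` are hypothetical names rather than the methods in the PR, ``/bin/hadolint`` is taken from the image ``CMD`` quoted above, and the ``file`` and ``line`` keys are assumed to match hadolint's JSON output.

```python
import json
from pathlib import Path, PurePosixPath


def hadolint_docker_cmd(workspace, dockerfiles, image="hadolint/hadolint:v2.14.0"):
    """Build one batched ``docker run`` for every discovered Dockerfile."""
    ws = Path(workspace).resolve()
    # Translate each host path to its location under the /workspace mount.
    mapped = [
        str(PurePosixPath("/workspace") / Path(f).resolve().relative_to(ws).as_posix())
        for f in dockerfiles
    ]
    return [
        "docker", "run", "--rm",
        "-v", f"{ws}:/workspace:ro",  # read-only workspace mount
        image,
        # Args after the image replace CMD, so the binary must be named here.
        "/bin/hadolint", "--format", "json",
        *mapped,
    ]


def finding_locations(raw_json):
    """Turn the combined JSON array into ``file:line`` location strings."""
    return [f"{item['file']}:{item['line']}" for item in json.loads(raw_json)]


cmd = hadolint_docker_cmd("repo", ["repo/docker/Dockerfile.cli", "repo/Dockerfile"])
sample = '[{"file": "/workspace/docker/Dockerfile.cli", "line": 19, "code": "DL3018"}]'
locations = finding_locations(sample)
```

Because each finding carries its own ``file`` field, the parser does not need to know which Dockerfile a given invocation was for, which is what makes the single batched call possible.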
Description
Establish — and validate end-to-end — a single SDK pattern for adding/removing/swapping security tools. Both humans and AI contributors land on the same shape.
The complexity audit run before this PR identified scanner duplication as the highest-leverage simplification target: every scanner rolled its own ~17-line `tool_version()`, ~30-line `scan()`, and a parallel `_build_command()` (local) + `container_args()` (container) pair. Those parallel methods had measurably drifted: OSV's `_build_command` was on the v1 `--sbom` flag while `container_args` used v2 `-L`; opengrep's local path used `--output` while the container path emitted to `--output-file`; bandit's exit-code semantics differed. The dual structure made every drift invisible in code review.

This PR collapses all three shapes into one pattern, applied across every scanner that fits, with documentation updates so the next contributor sees the pattern as the documented shape, not a thing they have to grep for.
Changes Made
Details
Three new SDK helpers (replacing per-scanner duplication):
- `argus.core.version.parse_tool_version(cmd, regex)`: runs `<tool> --version`, captures via regex, narrowly handles missing-binary / timeout / OSError
- `argus.core.scanner_template.run_subprocess_scan(scanner, path, config)`: replaces the per-scanner `scan()`; the `scan()` body becomes one line: `return run_subprocess_scan(self, path, config)`
- `argus.core.scanner_template.ScanPaths(workspace, output)`: replaces the `_build_command(path, output, config)` + `container_args(config)` methods with a single `build_args(paths, config)` method that returns the FULL argv. Engine drops `argv[0]` automatically when the class declares `container_entrypoint = "<bin>"`

Each scanner now has the same shape:
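As a hedged illustration of that shape (not the example from CONTRIBUTING.md), a scanner module could look roughly like the sketch below. The three helpers are replaced by tiny stand-ins so the block runs on its own; `MyToolScanner`, its image name, and its flags are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanPaths:  # stand-in for argus.core.scanner_template.ScanPaths
    workspace: str
    output: str


def parse_tool_version(cmd, regex):  # stand-in; the real helper runs cmd
    return None


def run_subprocess_scan(scanner, path, config):  # stand-in; the real helper runs the tool
    return scanner.build_args(ScanPaths(str(path), "results.json"), config)


class MyToolScanner:
    name = "my-tool"
    container_image = "example.invalid/my-tool:1.0"  # hypothetical image
    container_entrypoint = "my-tool"

    def tool_version(self):
        # One line: command plus a regex with a single capture group.
        return parse_tool_version(["my-tool", "--version"], r"^my-tool (\S+)")

    def scan(self, path, config):
        # One line: the template owns tempdir, subprocess, and error handling.
        return run_subprocess_scan(self, path, config)

    def build_args(self, paths, config):
        # The only place argv is defined, for local and container runs alike.
        return ["my-tool", "scan", paths.workspace,
                "--format", "json", "--output", paths.output]

    def parse_results(self, raw):
        # Tool-specific parsing is the part that stays per-scanner.
        return json.loads(raw)


argv_seen_by_template = MyToolScanner().scan("src", {})
```

What remains per scanner is the argv and the parser; everything else is shared.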
Migrated scanners: bandit, gitleaks, opengrep, osv, trivy_iac, checkov, clamav (`tool_version` only), grype (`tool_version` only), supply_chain (`tool_version` only), hadolint (`tool_version` only). The six that fit the full template each lost ~100 lines of boilerplate.
Engine bridge: `_run_in_container` calls `scanner.build_args(ScanPaths(workspace="/workspace", output="/output/results.json"), config)` when present, falling back to the legacy `container_args(config)` for the four scanners that don't fit the template (clamav text output, container orchestrator, zap Docker-only, trivy/grype standalone SBOM modes, supply_chain multi-tool). The fallback branch will be removed in a follow-up once those scanners are dealt with case-by-case; they all have structurally-different flows that don't fit a single subprocess template.

Drive-by simplifications discovered during the same complexity review:
- `argus.core.models.Severity`: replace 4 hand-rolled comparison methods with `@functools.total_ordering` + `__lt__` (-16 lines)
- `argus.init._check_local_readiness`: narrow `except Exception` to `except ImportError`. The broad catch hid bugs in the readiness logic itself; per ADR-016 those should surface. Test updated to assert the new loud-on-real-bug posture
- `argus.init.run_init` banner: drop the per-line `time.sleep` scroll effect (~1-2s of cosmetic delay on every interactive `argus init`)
- OSV's local path now uses `osv-scanner v2` syntax (`-L` plus `--output-file`) consistently with the container path, instead of the deprecated v1 `--sbom` and `--output` flags it had drifted to. Exactly the kind of bug the unification fixes.

Net diff: 32 files changed, +1060 / -759 (the +1060 is mostly the two new helper modules + their 27 unit tests; net code reduction across the migrated scanners is ~300 lines).
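For the `Severity` drive-by, the standard-library pattern looks roughly like this. The member names and values are assumptions for illustration; the real enum in `argus.core.models` may define different members.

```python
import functools
from enum import Enum


@functools.total_ordering
class Severity(Enum):
    # Member names and values are illustrative assumptions.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

    def __lt__(self, other):
        if not isinstance(other, Severity):
            return NotImplemented
        return self.value < other.value


# total_ordering derives <=, >, >= from __lt__ and the enum's own ==.
worst = max([Severity.MEDIUM, Severity.CRITICAL, Severity.LOW])
meets_threshold = Severity.HIGH >= Severity.MEDIUM
```

Defining only `__lt__` keeps the ordering rule in one place, so the derived comparisons cannot disagree with it.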
Testing
Test Results
- `parse_tool_version`: 13 unit tests covering happy paths (capture group, multiline, stderr fallback, optional v-prefix, custom group index, compiled-regex input) + failure paths (no match, missing binary, timeout, OSError, empty output, unexpected exception propagates)
- `run_subprocess_scan` + `ScanPaths`: 14 unit tests covering happy paths (write findings, pass workspace through, propagate config, unpack passed_count tuple, unpack metadata-dict tuple, fall back to stdout, override output filename) + failure paths (missing binary, timeout, no output + no stdout, unexpected exception propagates)

Security Considerations
Security Details
The exception-narrowing changes (`init._check_local_readiness`, `tool_version()` clauses, `run_subprocess_scan` failure handling) move the codebase toward ADR-016's loud-on-failure posture by no longer swallowing arbitrary exceptions. Only narrowly-named expected categories (missing binary, timeout, OS error) get translated to `None` / error-metadata; everything else propagates. No scanner output processing is touched.

AI Context Updates (.ai/)
- `.ai/architecture.yaml` updated (if components/structure changed)
- `.ai/workflows.yaml` updated (if commands/tasks changed)
- `.ai/decisions.yaml` updated (if design decision made)
- `.ai/errors.yaml` updated (if common error addressed)

`.ai/workflows.yaml::add_new_scanner` now lists `scan_template`, `build_args`, `tool_version`, and `secret_redaction` under a `helpers:` block so an AI agent reading the workflow sees all three pattern entry points at once.

Checklist
For New Scanners/Actions (if applicable)
Follow-ups (not blocking)
The 4 scanners that don't fit the subprocess template all have structurally-different flows; they need bespoke treatment, not a forced abstraction:
- `-o` flag (parses stdout)
- `container` orchestrator

Once those are addressed (which may mean: keep custom `scan()`, but still adopt `build_args(ScanPaths)` for argv consistency), the engine's legacy `container_args` fallback branch can be deleted.