Commit 0bf1a26

fix(validate): catch typos in containers config and close self-scan UX gaps (#118)
* fix(validate): catch typos and missing fields in containers config

Three fixes plus a workflow refactor, all surfaced while running argus against argus's own source and container images.

1. argus validate now validates the top-level containers block. Until now the schema validator recognized 'containers' as a known top-level key but never inspected its contents: typos (e.g. image_path instead of dockerfile), an empty images list, missing both image and dockerfile on an entry, and sub-scanner names outside trivy/grype/syft all sailed past argus validate and only surfaced (or failed silently) at scan time. New _validate_containers helper in argus/core/schema.py walks the block and produces the same ConfigError objects the rest of the validator already emits, so argus validate and the pre-scan validation in ArgusConfig._load_file both pick it up. 15 unit tests in argus/tests/core/test_schema_containers.py.

2. argus scan container auto-loads argus.yml when no --config is given. The source-scan dispatcher has always done this; the container subcommand used to require an explicit --config FILE even when an argus.yml sat right at the project root with a containers block. Now both flows search the canonical _DEFAULT_CONFIG_NAMES list. CLI flags still take precedence over config-file values. Test in argus/tests/test_cli.py.

3. .ai/ doc cleanup: the recent context refresh claimed argus list and argus version are subcommands. They are not: argus --version is a top-level flag, and the registered subcommand list is now correctly spelled out in .ai/context.yaml and .ai/architecture.yaml.

4. .github/workflows/build-containers.yml: extract the four hard-coded image entries into a preflight matrix job that reads argus.yml containers.images and emits a JSON matrix the build, scan, and test-cli jobs consume. argus.yml becomes the single source of truth for the dogfood image list: adding a fifth scanner image is now one entry, not three matrix edits across the workflow plus the dogfood config. The actual scanning still runs aquasecurity/trivy-action and anchore/scan-action (authored actions, not argus-scanning-argus); argus.yml drives the matrix, the trust boundary stays outside our own codebase.

Tests: 1513 passed (was 1497 before this PR; 16 new from schema tests and the auto-config-load regression test).

* feat(validate): surface containers block in success summary

The validate-success summary listed scanners, formats, and backend but ignored the top-level containers block entirely. With containers silently absent from the summary the user had no signal that argus validate had inspected it: a typo at the block name (e.g. ``containerz:``) would still show up as a top-level-key warning, but a correctly-named block with valid contents looked identical to no block at all.

Add a Containers line that shows:

    Containers: 4 image(s)
      - ghcr.io/myorg/scanner-bandit:dev
      - ghcr.io/myorg/scanner-opengrep:dev
      ...
    Containers: discover from docker/, .
    Containers: 2 image(s) + discover from docker/

The line is only printed when the block is structurally a mapping; if no containers block exists, the validator stays silent (matching the optional nature of the block). Three new regression tests in TestCmdValidate cover the present, absent, and discover variants.

---------

Co-authored-by: eFAILution <eFAILution@users.noreply.github.com>
1 parent 4f03e7b commit 0bf1a26

7 files changed

Lines changed: 515 additions & 22 deletions

.ai/architecture.yaml

Lines changed: 3 additions & 3 deletions
@@ -494,15 +494,15 @@ docsite:
   config_file: "argus.yml"
   cli_commands:
     - "argus scan [scanner] --path --config --severity-threshold --format [--interface=terminal|browser]"
-    - "argus scan container [--image REF | --discover [PATH]] [--scanners trivy,grype,syft] [--no-keep-raw]"
+    - "argus scan container [--config argus.yml | --image REF | --discover [PATH]] [--scanners trivy,grype,syft] [--no-keep-raw]"
     - "argus init [--force] — generate tailored argus.yml from auto-detection"
-    - "argus list — show registered scanners (SCANNER_REGISTRY)"
-    - "argus validate [argus.yml] — JSON Schema validation"
+    - "argus validate — JSON Schema validation of the auto-detected argus.yml"
     - "argus report <format> --results-dir --output-dir — re-emit canonical results"
     - "argus view [terminal|browser] [PATH] [--port N] [--no-open]"
     - "argus mcp — start MCP server over stdio"
     - "argus completion {bash,zsh,fish} — shell completion (dynamic from registry)"
     - "argus cache info|clean — manage scanner DB cache volumes"
+    - "argus --version — top-level flag (no `argus version` subcommand exists)"
 
   directory_structure:
     "argus/": "Standalone Python SDK (primary interface, ADR-013)"

.ai/context.yaml

Lines changed: 4 additions & 2 deletions
@@ -41,14 +41,16 @@ entrypoints:
   cli_subcommands:
     scan: "argus scan [scanner ...] — primary entry; supports source, container, lint flows"
     init: "argus init — generate a tailored argus.yml from project auto-detection"
-    list: "argus list — enumerate registered scanners (SCANNER_REGISTRY + LINTER_REGISTRY)"
+    classify: "argus classify — classify findings by category"
+    collect: "argus collect — collect scanner outputs"
     validate: "argus validate — typecheck argus.yml against the JSON Schema"
     view: "argus view [terminal|browser] — interactive triage of argus-results.json"
     report: "argus report <format> — re-emit results in another format without re-scanning"
     completion: "argus completion {bash,zsh,fish} — shell completion (dynamic from registry)"
     mcp: "argus mcp — start the MCP server over stdio for AI-assistant integration"
     cache: "argus cache info|clean — manage per-scanner DB cache volumes"
-    version: "argus version"
+  cli_flags:
+    version: "argus --version (top-level flag, NOT a subcommand)"
   mcp_server: "argus mcp"
   mcp_install: "pip install argus-security[mcp]"
   viewer_extras:

.github/workflows/build-containers.yml

Lines changed: 50 additions & 17 deletions
@@ -34,23 +34,53 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
+  # ── Step 0: Build matrix from argus.yml ──────────────────────────────
+  # Single source of truth for the image list — argus.yml ``containers:``
+  # block. Removes the duplication where build/, scan/ and test-cli/
+  # each had a separate hardcoded matrix that could drift out of sync
+  # with the dogfood scan target list.
+  matrix:
+    name: Resolve image matrix
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    outputs:
+      matrix: ${{ steps.build.outputs.matrix }}
+    steps:
+      - name: Checkout
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+
+      - name: Build matrix from argus.yml
+        id: build
+        # yq is preinstalled on ubuntu-latest. Reads only the local
+        # checked-out argus.yml (no untrusted input) so the run-step
+        # is safe to compose inline.
+        run: |
+          set -euo pipefail
+          matrix=$(yq -o=json '.containers.images | map({
+            "image": (.image | split(":") | .[0] | split("/") | .[-1]),
+            "image_ref_template": .image,
+            "dockerfile": .dockerfile,
+            "context": (.context // ".")
+          })' argus.yml)
+          count=$(echo "$matrix" | yq 'length')
+          if [ "$count" -eq 0 ]; then
+            echo "::error::argus.yml containers.images is empty — no images to build/scan/test."
+            exit 1
+          fi
+          echo "Resolved $count image(s) from argus.yml"
+          echo "$matrix" | yq -P
+          echo "matrix=$(echo "$matrix" | jq -c .)" >> "$GITHUB_OUTPUT"
+
   # ── Step 1: Build all custom images ──────────────────────────────────
   build:
     name: Build Images
+    needs: matrix
     runs-on: ubuntu-latest
     timeout-minutes: 15
     strategy:
       fail-fast: true
       matrix:
-        include:
-          - image: scanner-bandit
-            dockerfile: docker/Dockerfile.bandit
-          - image: scanner-opengrep
-            dockerfile: docker/Dockerfile.opengrep
-          - image: scanner-supply-chain
-            dockerfile: docker/Dockerfile.supply-chain
-          - image: cli
-            dockerfile: docker/Dockerfile.cli
+        include: ${{ fromJson(needs.matrix.outputs.matrix) }}
 
     steps:
       - name: Checkout
@@ -89,17 +119,13 @@ jobs:
   # ── Step 2: Scan each image with Trivy + Grype ──────────────────────
   scan:
     name: Scan ${{ matrix.image }}
-    needs: [build]
+    needs: [matrix, build]
     runs-on: ubuntu-latest
     timeout-minutes: 15
     strategy:
       fail-fast: false
       matrix:
-        include:
-          - image: scanner-bandit
-          - image: scanner-opengrep
-          - image: scanner-supply-chain
-          - image: cli
+        include: ${{ fromJson(needs.matrix.outputs.matrix) }}
 
     steps:
       - name: Download image artifact
@@ -220,7 +246,7 @@ jobs:
   # ── Step 3: Test argus CLI using built images ───────────────────────
   test-cli:
     name: Test Argus CLI
-    needs: [build]
+    needs: [matrix, build]
     runs-on: ubuntu-latest
     timeout-minutes: 15
 
@@ -251,12 +277,18 @@ jobs:
           path: /tmp/images
 
       - name: Load and retag images
+        # Image names come from the matrix output (resolved from
+        # argus.yml) — no hardcoded list to drift from the dogfood scan
+        # configuration. ``IMAGE_NAMES_JSON`` is a JSON array; jq extracts
+        # the short names. Reads only env vars and JSON we constructed
+        # ourselves; no untrusted input.
         run: |
+          set -euo pipefail
           for tarball in /tmp/images/*.tar.gz; do
             gunzip -c "$tarball" | docker load
           done
           # Retag from SHA to version tag that containers.py expects
-          for image in scanner-bandit scanner-opengrep scanner-supply-chain cli; do
+          for image in $(echo "$IMAGE_NAMES_JSON" | jq -r '.[].image'); do
             SHA_TAG="ghcr.io/huntridge-labs/argus/${image}:${GITHUB_SHA}"
             VERSION_TAG="ghcr.io/huntridge-labs/argus/${image}:0.7.0"
             if docker image inspect "$SHA_TAG" > /dev/null 2>&1; then
@@ -265,6 +297,7 @@ jobs:
           done
         env:
           GITHUB_SHA: ${{ github.sha }}
+          IMAGE_NAMES_JSON: ${{ needs.matrix.outputs.matrix }}
 
       - name: Package safety check
         run: python -m scripts.ci.check_package

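The yq expression in the matrix job derives each short image name by cutting the reference at the first colon and keeping the last path segment. The following is a minimal Python sketch of that same projection, not the workflow's actual code; the sample entries are hypothetical, since the real argus.yml is not part of this diff. It also illustrates one edge case that may be worth keeping in mind: a reference whose registry host carries a port is cut at the port's colon, so the short name comes out as the host.

```python
import json


def short_name(image_ref: str) -> str:
    """Mirror the yq step: split(":") | .[0] | split("/") | .[-1]."""
    return image_ref.split(":")[0].split("/")[-1]


def build_matrix(images: list) -> list:
    """Mirror the map({...}) projection from the workflow's matrix job."""
    matrix = []
    for entry in images:
        context = entry.get("context")
        matrix.append({
            "image": short_name(entry["image"]),
            "image_ref_template": entry["image"],
            # A missing key becomes null in yq, None here.
            "dockerfile": entry.get("dockerfile"),
            # yq's `//` falls back when the left side is null or false.
            "context": "." if context in (None, False) else context,
        })
    return matrix


# Hypothetical entries, for illustration only.
sample = [
    {"image": "ghcr.io/myorg/scanner-bandit:dev",
     "dockerfile": "docker/Dockerfile.bandit"},
    {"image": "localhost:5000/myorg/cli:dev",
     "dockerfile": "docker/Dockerfile.cli", "context": "docker"},
]

matrix = build_matrix(sample)
# Compact JSON, comparable to what `jq -c .` writes to $GITHUB_OUTPUT.
print(json.dumps(matrix, separators=(",", ":")))
```

With the second sample entry the short name resolves to `localhost` rather than `cli`. Whether that matters depends on the references actually listed in argus.yml.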
argus/cli.py

Lines changed: 40 additions & 0 deletions
@@ -1122,6 +1122,21 @@ def _load_container_config(args: argparse.Namespace) -> dict:
     """
     config: dict = {}
     config_path = getattr(args, "config", None)
+
+    # When --config wasn't supplied, auto-detect argus.yml the same way
+    # ``argus scan`` (source) does. Source scans have always done this;
+    # the container subcommand used to require an explicit --config,
+    # which made config-driven container scans feel inconsistent with
+    # the rest of the CLI. Search the project root for the canonical
+    # filenames; if none exist, fall through with no config (CLI flags
+    # alone may still supply targets).
+    if not config_path:
+        from argus.core.config import _DEFAULT_CONFIG_NAMES
+        for candidate in _DEFAULT_CONFIG_NAMES:
+            if Path(candidate).is_file():
+                config_path = candidate
+                break
+
     if config_path:
         try:
             import yaml
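The auto-detection added in this hunk is a first-match search over a fixed list of filenames. Here is a small standalone sketch of that lookup, written so it can be exercised against a temporary directory. The function name `resolve_config_path`, the `root` parameter, and the contents of `ASSUMED_CONFIG_NAMES` are assumptions made for illustration; the real list is `argus.core.config._DEFAULT_CONFIG_NAMES`, which this diff does not show, and the real code checks paths relative to the current working directory.

```python
import tempfile
from pathlib import Path
from typing import Optional, Sequence

# Assumption: illustrative filenames only, not the real
# _DEFAULT_CONFIG_NAMES from argus.core.config.
ASSUMED_CONFIG_NAMES = ("argus.yml", "argus.yaml")


def resolve_config_path(
    explicit: Optional[str],
    root: Path,
    names: Sequence[str] = ASSUMED_CONFIG_NAMES,
) -> Optional[Path]:
    """Return the explicit path if given, else the first candidate that exists."""
    if explicit:
        # An explicit --config always wins over auto-detection.
        return Path(explicit)
    for name in names:
        candidate = root / name
        if candidate.is_file():
            return candidate
    # No config found: the caller falls through to CLI flags alone.
    return None


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    print(resolve_config_path(None, project))       # nothing to detect yet
    (project / "argus.yaml").write_text("containers:\n  discover: true\n")
    print(resolve_config_path(None, project).name)  # picks up the file
```

Returning `None` instead of raising keeps the "flags alone may still supply targets" behaviour described in the code comment.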
@@ -2594,6 +2609,31 @@ def cmd_validate(args: argparse.Namespace) -> int:
     backend = data.get("execution", {}).get("backend", "auto")
     print(f" Backend: {backend}")
 
+    # Containers: only printed when the block exists and is structurally
+    # sound (validate already surfaced any errors above). The line gives
+    # the user a "yes, your containers config was inspected" signal that
+    # was missing — without it, a typo'd top-level key like ``containerz``
+    # used to fail silently here too.
+    containers = data.get("containers")
+    if isinstance(containers, dict):
+        images = containers.get("images") or []
+        discover = containers.get("discover", False)
+        search_paths = containers.get("search_paths") or []
+        parts = []
+        if isinstance(images, list) and images:
+            parts.append(f"{len(images)} image(s)")
+        if discover:
+            paths_str = ", ".join(search_paths) if search_paths else "."
+            parts.append(f"discover from {paths_str}")
+        summary = " + ".join(parts) if parts else "no targets"
+        print(f" Containers: {summary}")
+        if isinstance(images, list) and images:
+            for entry in images:
+                if not isinstance(entry, dict):
+                    continue
+                ref = entry.get("image") or entry.get("dockerfile") or "<unknown>"
+                print(f" - {ref}")
+
     # Tool readiness check
     unavailable = []
     tool_statuses = []

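The summary logic above can be checked against the three variants named in the commit message (images only, discover only, both) by lifting it into a pure function. This is a sketch under that assumption: `containers_summary` is a hypothetical helper that returns the lines instead of printing them, and its leading whitespace is illustrative, since the exact padding of the real output is not recoverable from this page.

```python
def containers_summary(containers) -> list:
    """Return the summary lines for a containers block, or [] if absent."""
    if not isinstance(containers, dict):
        # No block (or a malformed one): stay silent, as the commit describes.
        return []
    images = containers.get("images") or []
    discover = containers.get("discover", False)
    search_paths = containers.get("search_paths") or []

    has_images = isinstance(images, list) and bool(images)
    parts = []
    if has_images:
        parts.append(f"{len(images)} image(s)")
    if discover:
        paths_str = ", ".join(search_paths) if search_paths else "."
        parts.append(f"discover from {paths_str}")

    lines = ["Containers: " + (" + ".join(parts) if parts else "no targets")]
    if has_images:
        for entry in images:
            if not isinstance(entry, dict):
                continue
            ref = entry.get("image") or entry.get("dockerfile") or "<unknown>"
            lines.append(f"  - {ref}")
    return lines


for line in containers_summary({
    "images": [{"image": "ghcr.io/myorg/scanner-bandit:dev"},
               {"dockerfile": "docker/Dockerfile.cli"}],
    "discover": True,
    "search_paths": ["docker/"],
}):
    print(line)
```

Separating the formatting from `print` makes each variant a one-line assertion in a test.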
argus/core/schema.py

Lines changed: 141 additions & 0 deletions
@@ -39,6 +39,15 @@
 # Known execution keys
 _EXECUTION_KEYS = {"backend", "registry", "pull_policy"}
 
+# Top-level containers block keys
+_CONTAINERS_KEYS = {"images", "discover", "search_paths", "scanners"}
+
+# Per-image entry keys (under containers.images[*])
+_CONTAINER_IMAGE_KEYS = {"image", "dockerfile", "context", "name"}
+
+# Sub-scanners argus scan container can dispatch to
+_CONTAINER_SUB_SCANNERS = {"trivy", "grype", "syft"}
+
 
 class ConfigError:
     """A single configuration issue."""
@@ -103,6 +112,11 @@ def validate_config(data: dict) -> list[ConfigError]:
     if execution is not None:
         errors.extend(_validate_execution("execution", execution))
 
+    # Containers (top-level lifecycle targets for ``argus scan container``)
+    containers = data.get("containers")
+    if containers is not None:
+        errors.extend(_validate_containers("containers", containers))
+
     return errors
 
 
@@ -240,6 +254,133 @@ def _validate_execution(path: str, data: Any) -> list[ConfigError]:
     return errors
 
 
+def _validate_containers(path: str, data: Any) -> list[ConfigError]:
+    """Validate the top-level ``containers:`` block.
+
+    Catches the common authoring mistakes that previously only surfaced
+    at scan time (or got silently ignored): typo'd image-entry keys,
+    discover without search_paths, an empty images list, sub-scanner
+    names that aren't trivy/grype/syft, and image entries that name
+    neither a registry ref nor a Dockerfile.
+    """
+    errors: list[ConfigError] = []
+
+    if not isinstance(data, dict):
+        errors.append(ConfigError(
+            path, f"Must be a mapping, got {type(data).__name__}",
+        ))
+        return errors
+
+    # Unknown keys
+    for key in data:
+        if key not in _CONTAINERS_KEYS:
+            errors.append(ConfigError(
+                f"{path}.{key}",
+                f"Unknown containers key '{key}'. "
+                f"Valid keys: {', '.join(sorted(_CONTAINERS_KEYS))}",
+                level="warning",
+            ))
+
+    images = data.get("images")
+    discover = data.get("discover", False)
+
+    # At least one source of targets must be configured.
+    if not images and not discover:
+        errors.append(ConfigError(
+            path,
+            "containers: must declare at least one of `images:` (a list) "
+            "or `discover: true` — otherwise `argus scan container --config` "
+            "has no targets to scan.",
+        ))
+
+    # images: list of mappings
+    if images is not None:
+        if not isinstance(images, list):
+            errors.append(ConfigError(
+                f"{path}.images", "Must be a list of image entries",
+            ))
+        elif len(images) == 0:
+            errors.append(ConfigError(
+                f"{path}.images",
+                "Empty images list — drop the key entirely or add at least one entry.",
+                level="warning",
+            ))
+        else:
+            for i, entry in enumerate(images):
+                errors.extend(
+                    _validate_container_image_entry(f"{path}.images[{i}]", entry)
+                )
+
+    # discover requires search_paths (or defaults to ["."])
+    if "search_paths" in data:
+        sp = data["search_paths"]
+        if not isinstance(sp, list) or not all(isinstance(p, str) for p in sp):
+            errors.append(ConfigError(
+                f"{path}.search_paths",
+                "Must be a list of path strings",
+            ))
+
+    # scanners: must be a list of valid sub-scanner names
+    if "scanners" in data:
+        sc = data["scanners"]
+        if not isinstance(sc, list):
+            errors.append(ConfigError(
+                f"{path}.scanners",
+                f"Must be a list. Valid values: "
+                f"{', '.join(sorted(_CONTAINER_SUB_SCANNERS))}",
+            ))
+        else:
+            for i, s in enumerate(sc):
+                if s not in _CONTAINER_SUB_SCANNERS:
+                    errors.append(ConfigError(
+                        f"{path}.scanners[{i}]",
+                        f"Unknown container sub-scanner '{s}'. "
+                        f"Valid values: {', '.join(sorted(_CONTAINER_SUB_SCANNERS))}",
+                    ))
+
+    return errors
+
+
+def _validate_container_image_entry(path: str, entry: Any) -> list[ConfigError]:
+    """Validate a single ``containers.images[*]`` entry."""
+    errors: list[ConfigError] = []
+
+    if not isinstance(entry, dict):
+        errors.append(ConfigError(
+            path,
+            f"Must be a mapping with at least an 'image:' field, "
+            f"got {type(entry).__name__}",
+        ))
+        return errors
+
+    # Unknown keys
+    for key in entry:
+        if key not in _CONTAINER_IMAGE_KEYS:
+            errors.append(ConfigError(
+                f"{path}.{key}",
+                f"Unknown image-entry key '{key}'. "
+                f"Valid keys: {', '.join(sorted(_CONTAINER_IMAGE_KEYS))}",
+                level="warning",
+            ))
+
+    # An entry must declare either an image ref or a dockerfile to build.
+    if "image" not in entry and "dockerfile" not in entry:
+        errors.append(ConfigError(
+            path,
+            "Image entry must have either 'image:' (registry reference) "
+            "or 'dockerfile:' (build-then-scan) set.",
+        ))
+
+    # Type checks for present fields
+    for field in ("image", "dockerfile", "context", "name"):
+        if field in entry and not isinstance(entry[field], str):
+            errors.append(ConfigError(
+                f"{path}.{field}", "Must be a string",
+            ))
+
+    return errors
+
+
 def report_validation(errors: list[ConfigError]) -> bool:
     """Log validation errors/warnings and return True if config is valid.
 
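To see which issues the new checks raise for a given block, it can help to run a condensed stand-in against a config containing the typo from the commit message (`image_path` instead of `dockerfile`). The sketch below is not the argus implementation: it reports `(path, level)` tuples instead of `ConfigError` objects, and it assumes that `ConfigError` defaults to level "error" when no level is passed, which this diff does not show. The `clair` scanner name is a hypothetical invalid value. The stand-in also rejects non-string scanner entries before the set lookup; that guard is a choice of this sketch, not something taken from the diff.

```python
from typing import Any

CONTAINERS_KEYS = {"images", "discover", "search_paths", "scanners"}
IMAGE_KEYS = {"image", "dockerfile", "context", "name"}
SUB_SCANNERS = {"trivy", "grype", "syft"}


def check_entry(path: str, entry: Any) -> list:
    """Condensed stand-in for the per-image checks; returns (path, level)."""
    if not isinstance(entry, dict):
        return [(path, "error")]
    issues = [(f"{path}.{k}", "warning") for k in entry if k not in IMAGE_KEYS]
    if "image" not in entry and "dockerfile" not in entry:
        issues.append((path, "error"))
    for field in ("image", "dockerfile", "context", "name"):
        if field in entry and not isinstance(entry[field], str):
            issues.append((f"{path}.{field}", "error"))
    return issues


def check_containers(data: Any, path: str = "containers") -> list:
    """Condensed stand-in for the block-level checks; returns (path, level)."""
    if not isinstance(data, dict):
        return [(path, "error")]
    issues = [(f"{path}.{k}", "warning") for k in data if k not in CONTAINERS_KEYS]
    images = data.get("images")
    if not images and not data.get("discover", False):
        issues.append((path, "error"))  # no targets configured at all
    if images is not None:
        if not isinstance(images, list):
            issues.append((f"{path}.images", "error"))
        elif not images:
            issues.append((f"{path}.images", "warning"))
        else:
            for i, entry in enumerate(images):
                issues += check_entry(f"{path}.images[{i}]", entry)
    if "search_paths" in data:
        sp = data["search_paths"]
        if not isinstance(sp, list) or not all(isinstance(p, str) for p in sp):
            issues.append((f"{path}.search_paths", "error"))
    if "scanners" in data:
        sc = data["scanners"]
        if not isinstance(sc, list):
            issues.append((f"{path}.scanners", "error"))
        else:
            for i, s in enumerate(sc):
                # isinstance guard: an unhashable item would break `in` on a set.
                if not isinstance(s, str) or s not in SUB_SCANNERS:
                    issues.append((f"{path}.scanners[{i}]", "error"))
    return issues


config = {
    "images": [
        {"image": "ghcr.io/myorg/scanner-bandit:dev"},
        {"image_path": "docker/Dockerfile.cli"},  # typo for "dockerfile"
    ],
    "scanners": ["trivy", "clair"],  # "clair" is a hypothetical invalid name
}
for issue_path, level in check_containers(config):
    print(f"{level}: {issue_path}")
```

The typo'd entry produces two issues: a warning for the unknown key and an error because the entry then names neither an image nor a Dockerfile.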