Commit c9e4ca9

feat(reporters): always emit canonical argus-results.json regardless of config
Continues the work in this PR by addressing the *root* of the missing-results pitfall: the canonical scan artifact, not just the error message.

Before this commit, ``reporting.formats`` controlled whether ``argus-results.json`` got written. A config like ``formats: [terminal, sarif]`` would silently break ``argus view``, the audit manifest's lossless dump, and ``argus report`` because the JSON file simply wasn't there.

This commit shifts the contract:

- ``argus-results.json`` is always emitted by the source-scan flow. It's the canonical artifact every other Argus surface depends on.
- ``reporting.formats`` now means "which *additional* human-readable reports to emit alongside the canonical JSON," not "which artifacts exist at all."

Implementation

- New ``argus.reporters.ensure_canonical_json(formats)`` helper: idempotent, preserves user-configured ordering, appends ``json`` when absent. Lives next to ``REPORTER_REGISTRY`` and ``CANONICAL_FORMAT`` so the canonical artifact's identity is one module-level constant away.
- ``argus/cli.py`` source-scan dispatch loop iterates the helper-augmented format list. The diagnoser remains as a defensive belt-and-suspenders for legacy result dirs produced before this contract was in place.
- ``argus.example.yml`` comment block updated so users see the new contract: "argus-results.json is always written; this list is for additional reports." Available formats listed inline.

Scope notes

- Container and DAST flows have their own JSON helpers (``_write_container_json`` / ``_write_dast_json``) that produce domain-shaped summaries, not ``ScanSummary.to_dict()``. They're consumed by their own viewers/handling, separate from ``argus view``. Adding "always emit argus-results.json" to those flows would conflate two different artifacts; left for follow-up if a clear use case arises.

Tests (+8)

- ``argus/tests/reporters/test_registry.py`` (7 cases): idempotent on input that already lists json; preserves user order; defensive-no-mutation; empty-formats edge case; constant-name sanity. Decoupled from the cli.py dispatch loop so the helper's invariants are testable without spinning up the engine.
- ``argus/tests/test_cli.py`` (1 case): integration regression for the source-scan flow. Captures the format names the dispatch loop requests from ``get_reporter`` when configured with ``formats=[terminal]``, and asserts ``json`` is in the captured list.

Validation

- Full SDK suite green: 1428 passed (+8 from this commit), 8 skipped.
- The diagnoser PR's failure-path tests still pass (the failure mode is now extremely rare in practice but the messages remain in case a user loads an older results dir or hits an unrelated path issue).
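The contract shift described in the commit message can be illustrated with a small standalone sketch. The helper body mirrors the one added to ``argus/reporters/__init__.py`` in this commit; the sample configurations are illustrative only and are not taken from the test suite.

```python
# Standalone sketch: the helper body mirrors this commit's diff, the sample
# configurations below are illustrative assumptions.
CANONICAL_FORMAT = "json"


def ensure_canonical_json(formats: list[str]) -> list[str]:
    """Return a new list that is guaranteed to contain the canonical format."""
    if CANONICAL_FORMAT in formats:
        return list(formats)  # copy, so the caller's list is never aliased
    return [*formats, CANONICAL_FORMAT]


# Under the old contract, the first and last configs produced no
# argus-results.json at all. Under the new one, json is appended and the
# user's ordering is kept.
samples = (["terminal", "sarif"], ["json", "terminal"], [])
effective = [ensure_canonical_json(configured) for configured in samples]

for configured, result in zip(samples, effective):
    print(configured, "->", result)
```

Read this way, ``reporting.formats`` is purely additive: no value the user can put in the list removes the canonical artifact.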
1 parent dea921f commit c9e4ca9

5 files changed

Lines changed: 158 additions & 3 deletions

argus.example.yml

Lines changed: 5 additions & 0 deletions
@@ -44,6 +44,11 @@ scanners:
     # target_url: "http://localhost:3000"
 
 reporting:
+  # ``argus-results.json`` is the canonical scan artifact and is
+  # always written regardless of this list — the viewers, the audit
+  # manifest, and the ``argus report`` subcommand all consume it.
+  # This list selects which *additional* human-readable reports to
+  # emit alongside the JSON. Available: terminal, markdown, sarif.
   formats:
     - terminal
     - sarif

argus/cli.py

Lines changed: 12 additions & 3 deletions
@@ -1297,10 +1297,19 @@ def _cmd_source_scan(args: argparse.Namespace) -> int:
         finalize_manifest(manifest, exit_code=EXIT_ERROR, output_dir=output_dir)
         return EXIT_ERROR
 
-    # Generate reports
+    # Generate reports.
+    #
+    # ``ensure_canonical_json`` guarantees the source-of-truth artifact
+    # (``argus-results.json``) is always written, regardless of what
+    # the user listed in ``reporting.formats``. The viewers, the audit
+    # manifest, and the ``argus report`` subcommand all consume that
+    # file — keeping it implicitly mandatory means a config like
+    # ``formats: [terminal, sarif]`` no longer silently breaks
+    # ``argus view`` (the diagnoser still helps for legacy result dirs
+    # produced before this contract was in place).
     try:
-        from argus.reporters import get_reporter
-        for fmt in config.reporting.formats:
+        from argus.reporters import ensure_canonical_json, get_reporter
+        for fmt in ensure_canonical_json(config.reporting.formats):
             reporter = get_reporter(fmt)
             reporter.report(summary, output_dir)
             log.debug("Generated %s report", fmt)
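To make the effect of the changed loop concrete, here is a minimal sketch of the same dispatch pattern run against stub reporters. Everything except ``ensure_canonical_json``, ``CANONICAL_FORMAT``, and the ``argus-results.json`` file name is an assumption for illustration: the stub classes, ``STUB_REGISTRY``, ``run_reports``, and the plain dict standing in for ``ScanSummary`` are hypothetical and are not the real Argus reporters.

```python
import json
import tempfile
from pathlib import Path

CANONICAL_FORMAT = "json"


def ensure_canonical_json(formats):
    # Same logic as the helper added in this commit.
    if CANONICAL_FORMAT in formats:
        return list(formats)
    return [*formats, CANONICAL_FORMAT]


class StubJsonReporter:
    """Hypothetical stand-in for the real JSON reporter."""

    def report(self, summary, output_dir):
        path = Path(output_dir) / "argus-results.json"
        path.write_text(json.dumps(summary))


class StubTerminalReporter:
    """Hypothetical stand-in that only prints a one-line summary."""

    def report(self, summary, output_dir):
        print(f"{len(summary['results'])} finding(s)")


# Hypothetical registry; the real one is REPORTER_REGISTRY in argus.reporters.
STUB_REGISTRY = {"json": StubJsonReporter, "terminal": StubTerminalReporter}


def run_reports(formats, summary, output_dir):
    """Iterate the helper-augmented list, as the patched loop does."""
    ran = []
    for fmt in ensure_canonical_json(formats):
        STUB_REGISTRY[fmt]().report(summary, output_dir)
        ran.append(fmt)
    return ran


with tempfile.TemporaryDirectory() as out:
    # The config lists only "terminal", yet the JSON file is still written.
    order = run_reports(["terminal"], {"results": []}, out)
    wrote_json = (Path(out) / "argus-results.json").exists()

print(order, wrote_json)
```

Because the helper appends rather than prepends, the canonical reporter runs after every user-selected reporter in this sketch.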

argus/reporters/__init__.py

Lines changed: 27 additions & 0 deletions
@@ -27,3 +27,30 @@ def get_reporter(name: str):
 def available_reporters() -> list[str]:
     """Return list of registered reporter names."""
     return list(REPORTER_REGISTRY.keys())
+
+
+# The single canonical scan artifact. ``argus-results.json`` is consumed
+# by the audit manifest, both viewers (terminal + browser), the
+# ``argus report`` subcommand, and any downstream tooling built on the
+# SDK. Treating it as always-emitted decouples its existence from
+# user-configured ``reporting.formats``: that list now means "which
+# *additional* human-readable reports to emit alongside the canonical
+# JSON," not "which artifacts exist at all." Eliminates the failure
+# mode where a config like ``formats: [terminal, sarif]`` silently
+# breaks ``argus view``.
+CANONICAL_FORMAT = "json"
+
+
+def ensure_canonical_json(formats: list[str]) -> list[str]:
+    """Return the format list with the canonical JSON output guaranteed.
+
+    Idempotent — if the user already lists ``json`` we don't add a
+    duplicate (which would write the file twice). Order is preserved
+    so the user's terminal/markdown/sarif reports still print in the
+    sequence they configured; the canonical JSON is appended at the
+    end so it's always the last reporter to run (its dict-dump output
+    isn't influenced by side-effects of earlier reporters).
+    """
+    if CANONICAL_FORMAT in formats:
+        return list(formats)
+    return [*formats, CANONICAL_FORMAT]
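A few properties follow directly from the helper's two branches but are easy to miss when reading the docstring. The sketch below copies the helper from the diff above and checks them with plain assertions; the sample inputs are illustrative and are not part of the commit's test suite.

```python
CANONICAL_FORMAT = "json"


def ensure_canonical_json(formats):
    # Copied from the diff above.
    if CANONICAL_FORMAT in formats:
        return list(formats)
    return [*formats, CANONICAL_FORMAT]


configured = ["terminal", "sarif"]
once = ensure_canonical_json(configured)
twice = ensure_canonical_json(once)

# Applying the helper a second time changes nothing (idempotence).
assert once == twice == ["terminal", "sarif", "json"]

# Each call hands back a fresh list, even when json was already present,
# so callers can mutate the result without touching the config.
assert twice is not once

# Entries the user repeated are passed through untouched: the helper only
# avoids adding a second "json" itself, it does not de-duplicate the input.
repeated = ensure_canonical_json(["json", "terminal", "json"])
assert repeated == ["json", "terminal", "json"]

# When the user lists json first, it stays first. "Always the last reporter
# to run" therefore holds only for the appended case.
user_first = ensure_canonical_json(["json", "terminal"])
assert user_first == ["json", "terminal"]
```

The last two checks describe current behaviour as written in the diff, not a recommendation either way.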

argus/tests/reporters/test_registry.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+"""Tests for argus.reporters registry helpers — ``ensure_canonical_json``.
+
+The helper guarantees ``argus-results.json`` is always emitted by the
+source-scan flow, regardless of how the user configures
+``reporting.formats``. That decouples the canonical scan artifact (the
+audit manifest, both viewers, and ``argus report`` all consume it) from
+user choice of additional human-readable reports.
+"""
+
+from __future__ import annotations
+
+from argus.reporters import CANONICAL_FORMAT, ensure_canonical_json
+
+
+class TestEnsureCanonicalJson:
+    def test_appends_json_when_absent(self):
+        assert ensure_canonical_json(["terminal"]) == ["terminal", "json"]
+
+    def test_idempotent_when_json_already_present(self):
+        # User explicitly listed json — don't double-write the file.
+        assert ensure_canonical_json(["json"]) == ["json"]
+
+    def test_preserves_user_order_when_json_already_present(self):
+        # The user's preferred ordering of human reports stays intact.
+        assert ensure_canonical_json(["json", "terminal", "sarif"]) == [
+            "json", "terminal", "sarif",
+        ]
+
+    def test_appends_json_to_multi_format_list(self):
+        # Common production case: user wants terminal + SARIF + audit JSON.
+        assert ensure_canonical_json(["terminal", "sarif"]) == [
+            "terminal", "sarif", "json",
+        ]
+
+    def test_handles_empty_input(self):
+        # Edge case: a config with ``formats: []`` still produces the
+        # canonical artifact. Without this, the viewers would silently
+        # fail downstream.
+        assert ensure_canonical_json([]) == ["json"]
+
+    def test_does_not_mutate_input(self):
+        # Defensive: the helper must not mutate the caller's list, since
+        # the same list lives on the user's ArgusConfig.
+        formats = ["terminal"]
+        ensure_canonical_json(formats)
+        assert formats == ["terminal"]
+
+    def test_canonical_format_constant_is_json(self):
+        # Sanity-check the constant name. If we ever rename the
+        # canonical artifact, every consumer downstream needs to know.
+        assert CANONICAL_FORMAT == "json"

argus/tests/test_cli.py

Lines changed: 63 additions & 0 deletions
@@ -290,6 +290,69 @@ def test_scan_unknown_scanner_returns_error(self, monkeypatch, capsys):
         captured = capsys.readouterr()
         assert "unknown scanner 'nonexistent'" in captured.err
 
+    def test_scan_source_always_emits_canonical_json(self, monkeypatch, tmp_path):
+        """Regression for Option C: argus-results.json must be written
+        regardless of the user's ``reporting.formats``. Captures the
+        format names the cli.py loop asks ``get_reporter`` for, and
+        asserts 'json' is in the list even though the user configured
+        formats=[terminal] only."""
+        from argus.core.config import ArgusConfig, ReportingConfig, ExecutionConfig
+        from argus.core.models import ScanSummary
+
+        config = ArgusConfig(
+            reporting=ReportingConfig(
+                output_dir=str(tmp_path),
+                formats=["terminal"],  # deliberately omits json
+                severity_threshold=None,
+            ),
+            execution=ExecutionConfig(),
+        )
+        monkeypatch.setattr(
+            "argus.core.config.ArgusConfig.load",
+            lambda _path: config,
+        )
+
+        summary = ScanSummary(results=[], severity_threshold=None)
+        monkeypatch.setattr(
+            "argus.core.engine.ArgusEngine.__init__",
+            lambda self, _cfg: setattr(self, "config", config)
+            or setattr(self, "_scanners", {}),
+        )
+        monkeypatch.setattr(
+            "argus.core.engine.ArgusEngine.run",
+            lambda self, **kwargs: summary,
+        )
+        monkeypatch.setattr(
+            "argus.core.engine.ArgusEngine.register_scanner",
+            lambda self, s: None,
+        )
+        monkeypatch.setattr("argus.scanners.get_available_scanners", lambda: [])
+
+        # Capture every format name the dispatch loop requests so we
+        # can assert canonical JSON was demanded alongside the user's
+        # configured formats.
+        requested: list[str] = []
+
+        def capture_reporter(fmt):
+            requested.append(fmt)
+            return MagicMock()
+
+        monkeypatch.setattr("argus.reporters.get_reporter", capture_reporter)
+        monkeypatch.setattr("argus.audit.get_logger", lambda *a, **kw: MagicMock())
+        monkeypatch.setattr(
+            "argus.audit.create_manifest",
+            lambda **kw: MagicMock(execution_backend=None),
+        )
+        monkeypatch.setattr("argus.audit.finalize_manifest", lambda *a, **kw: None)
+
+        args = _make_scan_args(output_dir=str(tmp_path))
+        cmd_scan(args)
+
+        # Canonical JSON is requested even though config didn't list it.
+        # User's terminal report is still emitted.
+        assert "json" in requested
+        assert "terminal" in requested
+
     def test_scan_source_runs_engine(self, monkeypatch, tmp_path):
         """A valid scan with no findings should call engine.run and return EXIT_SUCCESS."""
         from argus.core.config import ArgusConfig, ReportingConfig, ExecutionConfig
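The regression test above asserts only that ``json`` and ``terminal`` both appear in the captured list. If run order is meant to be part of the contract (the helper's docstring says the canonical JSON is appended last), a stricter variant could compare the whole list. The sketch below shows that capture pattern in isolation. ``dispatch`` is a hypothetical stand-in for the report loop in ``_cmd_source_scan``, and the string arguments are placeholders for the real summary and output directory.

```python
from unittest.mock import MagicMock

CANONICAL_FORMAT = "json"


def ensure_canonical_json(formats):
    # Same logic as the helper added in this commit.
    if CANONICAL_FORMAT in formats:
        return list(formats)
    return [*formats, CANONICAL_FORMAT]


def dispatch(formats, get_reporter, summary, output_dir):
    """Hypothetical stand-in for the report loop in _cmd_source_scan."""
    for fmt in ensure_canonical_json(formats):
        get_reporter(fmt).report(summary, output_dir)


requested = []   # format names, in the order the loop asked for them
reporters = {}   # the mock handed out for each format name


def capture_reporter(fmt):
    requested.append(fmt)
    reporters[fmt] = MagicMock()
    return reporters[fmt]


dispatch(["terminal"], capture_reporter, summary="summary", output_dir="out")

# Stricter than a membership check: pins both presence and ordering.
assert requested == ["terminal", "json"]

# Each reporter was invoked exactly once with the loop's arguments.
reporters["terminal"].report.assert_called_once_with("summary", "out")
reporters["json"].report.assert_called_once_with("summary", "out")
```

Whether to pin ordering is a design choice: the membership assertions in the committed test tolerate a future change to where ``json`` is inserted, while the list comparison would flag it.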
