Commit 7837c6e

feat(scanner-container): exposed-port surface as new sub-scanner (#149)
* feat(scanner-container): exposed-port surface as new sub-scanner

Implements the "Container image exposed ports" item from docs/developer/SDK-ROADMAP.md → "Attack Surface Visibility — Port & Service Exposure".

Reports what network endpoints a container image declares via Dockerfile EXPOSE — separate from whether those endpoints have known CVEs. "Image exposes 6379/tcp" is a different question from "image has a vulnerable Redis package" and most security reviewers want both.

No new scanner module; extends the existing container scanner's sub-scanner orchestration alongside trivy/grype/syft. Default sub-scanner list becomes "trivy,grype,syft,exposure" — opt out by dropping the name from scanners.container.scanners.

How it works:

- _scan_exposed_ports(image_ref, config) ensures the image is present locally (via container_runtime.pull_image with if-not-present, which is a fast cache hit when trivy/grype/syft already pulled it), runs <runtime> image inspect <ref>, parses Config.ExposedPorts, and emits one Finding per port.
- Severity defaults to INFO for ordinary application ports and MEDIUM for ports on the built-in RISKY_PORTS dict: 21/tcp (FTP), 22/tcp (SSH), 23/tcp (Telnet), 25/tcp (SMTP), 110/tcp (POP3), 143/tcp (IMAP), 161/udp (SNMP), 389/tcp (LDAP), 445/tcp (SMB), 3306/tcp (MySQL), 3389/tcp (RDP), 5432/tcp (PostgreSQL), 6379/tcp (Redis), 9200/tcp (Elasticsearch), 11211/tcp (Memcached), 27017/tcp (MongoDB). Each entry cites a "why" in the scanner module docstring (CIS Docker Benchmark §5.8, Shodan unauthorized-database-access reports, CVE-1999-0517 for SNMPv1/v2 community strings, etc.) so future contributors don't tune the list blindly.

Config knobs (argus.yml):

- scanners.container.expose_warn_ports: list[str] — replaces the built-in WARN list. Empty list demotes every declared port to INFO.
- scanners.container.expose_ignore_ports: list[str] — suppress findings entirely. Use for ports the team has explicitly accepted (the app's known 8080/tcp, etc.).

Both lists accept "PORT/PROTO" strings; bare "PORT" defaults to tcp; protocol is case-insensitive. Validator rejects malformed entries at config-load time so authoring mistakes surface during argus validate, not at scan time.

Finding shape — flows through the existing reporter pipeline (terminal, markdown, sarif, json, github, gitlab, junit), --severity-threshold filtering, audit trail, and the view-terminal / view-browser UIs without per-reporter custom code:

  id: EXPOSE-<port>-<proto>
  severity: INFO or MEDIUM
  metadata: {port, protocol, common_service, risky, image_ref}

Test coverage (29 new tests):

- TestParsePortProto: canonical form, bare-port-defaults-tcp, case+whitespace tolerated, 9-case invalid parametrize.
- TestScanExposedPorts: single non-risky → INFO, single risky → MEDIUM with service name, multi-port sort+classification, ignore-list suppresses, warn-override replaces defaults, empty-warn-override demotes all to INFO, no exposed ports, no Config block, empty inspect array, no runtime → skipped metadata, pull failure → error metadata, unparseable port logged + skipped (subprocess + container_runtime mocked).
- TestExposureSchemaValidation: valid lists accepted, non-list errors, malformed entries error, non-string entries error, "exposure" valid in container sub-scanner list.

Out of scope (deferred):

- Runtime port enumeration (actually start the container, probe with nmap/ss). Static EXPOSE data is the bulk of the value at a fraction of the operational cost. A runtime variant becomes a separate roadmap item if demand surfaces.

Docs + .ai/:

- docs/config-reference.md: container scanner description, scanner-specific properties table (new rows for expose_warn_ports and expose_ignore_ports), worked example for attack-surface tuning.
- argus.example.yml: commented example showing default sub-scanner set + the two new knobs.
- .ai/architecture.yaml: scanners/ description in both SDK blocks updated to mention the four sub-scanners and the RISKY_PORTS watchlist.
- docs/developer/SDK-ROADMAP.md: roadmap entry flipped from actionable to shipped with implementation summary.

Full suite: 3155 passed (+29 new), 2 skipped.

* style: codespell unparseable -> unparsable

---------

Co-authored-by: eFAILution <eFAILution@users.noreply.github.com>
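The override semantics described in the commit message (a warn list that replaces the defaults, an empty warn list that demotes everything to INFO, an ignore list that suppresses findings) can be illustrated with a small standalone sketch. The names `classify` and `DEFAULT_WARN` and the plain-string severities are hypothetical simplifications, not the scanner's actual API, and the default set is deliberately abbreviated.

```python
# Illustrative sketch only: `classify` and DEFAULT_WARN are hypothetical
# names, and DEFAULT_WARN is an abbreviated stand-in for RISKY_PORTS.
DEFAULT_WARN = {(22, "tcp"), (3306, "tcp"), (6379, "tcp")}


def classify(declared, warn_ports=None, ignore_ports=()):
    """Map declared (port, proto) pairs to a severity string.

    warn_ports=None keeps the defaults; any list, even an empty one,
    replaces them. Ports in ignore_ports produce no entry at all.
    """
    warn = DEFAULT_WARN if warn_ports is None else set(warn_ports)
    ignore = set(ignore_ports)
    return {
        pp: ("MEDIUM" if pp in warn else "INFO")
        for pp in sorted(declared)
        if pp not in ignore
    }


declared = [(8080, "tcp"), (6379, "tcp")]
defaults = classify(declared)                                  # built-in list applies
demoted = classify(declared, warn_ports=[])                    # empty override
suppressed = classify(declared, ignore_ports=[(8080, "tcp")])  # accepted port
```

Distinguishing `None` from an empty list is what lets an operator say "warn on nothing" without that being mistaken for "no override configured".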
1 parent 6dc1ba3 commit 7837c6e

7 files changed

Lines changed: 638 additions & 88 deletions

File tree

.ai/architecture.yaml

Lines changed: 2 additions & 2 deletions
@@ -45,7 +45,7 @@ components:
     "viewers/__init__.py": "ViewerUnavailable shared exception"
     "viewers/terminal/": "`argus view --interface=terminal` — Textual TUI ([terminal] extra). Includes DiffPickerScreen + DiffScreen for scan-over-scan diff (``D`` keybind), reusing argus.core.findings_view.diff_scans."
     "viewers/browser/": "`argus view --interface=browser` — FastAPI + Jinja2 web UI, 127.0.0.1 only ([browser] extra). Routes include ``/diff?a=<path>&b=<path>`` powered by argus.core.findings_view.diff_scans, sharing the bucketing logic with the TUI's ``D`` keybind."
-    "scanners/": "Scanner modules implementing Scanner protocol (SCANNER_REGISTRY includes linters via auto-merge)"
+    "scanners/": "Scanner modules implementing Scanner protocol (SCANNER_REGISTRY includes linters via auto-merge). The container scanner orchestrates four sub-scanners on every run: trivy (CVE scan), grype (CVE scan, deduplicated against trivy), syft (SBOM), and exposure (declared-port surface from docker inspect Config.ExposedPorts). The exposure sub-scanner emits one Finding per declared port; INFO severity for ordinary application ports and MEDIUM for ports on the built-in RISKY_PORTS list (SSH 22/tcp, MySQL 3306/tcp, Redis 6379/tcp, PostgreSQL 5432/tcp, MongoDB 27017/tcp, etc.). Config knobs scanners.container.expose_warn_ports / expose_ignore_ports override or suppress."
     "linters/": "Linter modules implementing Scanner protocol (LINTER_REGISTRY auto-merges into SCANNER_REGISTRY)"
     "reporters/": "Output reporters (terminal, markdown, sarif, json, github, gitlab, junit). Discovered via the ``argus.reporters`` Python entry-point group (built-ins declared in pyproject.toml; third-party packages register additional formats without forking — see docs/contributing-reporters.md and ADR-023)."
     "preflight/": "CI preflight: provider detection, living issue reporting (GitHub/GitLab), network deps, scanner tool-readiness checks (tool_check.py)"
@@ -480,7 +480,7 @@ docsite:
     "viewers/__init__.py": "ViewerUnavailable shared exception"
     "viewers/terminal/": "`argus view --interface=terminal` — Textual TUI ([terminal] extra). Includes DiffPickerScreen + DiffScreen for scan-over-scan diff (``D`` keybind), reusing argus.core.findings_view.diff_scans."
     "viewers/browser/": "`argus view --interface=browser` — FastAPI + Jinja2 web UI, 127.0.0.1 only ([browser] extra). Routes include ``/diff?a=<path>&b=<path>`` powered by argus.core.findings_view.diff_scans, sharing the bucketing logic with the TUI's ``D`` keybind."
-    "scanners/": "Scanner modules implementing Scanner protocol (SCANNER_REGISTRY includes linters via auto-merge)"
+    "scanners/": "Scanner modules implementing Scanner protocol (SCANNER_REGISTRY includes linters via auto-merge). The container scanner orchestrates four sub-scanners on every run: trivy (CVE scan), grype (CVE scan, deduplicated against trivy), syft (SBOM), and exposure (declared-port surface from docker inspect Config.ExposedPorts). The exposure sub-scanner emits one Finding per declared port; INFO severity for ordinary application ports and MEDIUM for ports on the built-in RISKY_PORTS list (SSH 22/tcp, MySQL 3306/tcp, Redis 6379/tcp, PostgreSQL 5432/tcp, MongoDB 27017/tcp, etc.). Config knobs scanners.container.expose_warn_ports / expose_ignore_ports override or suppress."
     "linters/": "Linter modules implementing Scanner protocol (LINTER_REGISTRY auto-merges into SCANNER_REGISTRY)"
     "reporters/": "Output reporters (terminal, markdown, sarif, json, github, gitlab, junit). Discovered via the ``argus.reporters`` Python entry-point group (built-ins declared in pyproject.toml; third-party packages register additional formats without forking — see docs/contributing-reporters.md and ADR-023)."
     "tests/": "Co-located pytest tests (180 tests, 83%+ coverage)"

argus.example.yml

Lines changed: 9 additions & 1 deletion
@@ -38,7 +38,15 @@ scanners:
   # container:
   #   enabled: false
   #   image_ref: "myapp:latest"
-  #   scanners: "trivy,grype,syft"
+  #   # Default set: trivy + grype CVEs, syft SBOM, exposure declared-ports.
+  #   scanners: "trivy,grype,syft,exposure"
+  #   # Attack-surface knobs for the exposure sub-scanner:
+  #   #   expose_warn_ports: override the built-in WARN list
+  #   #     (defaults: 22 SSH, 3306 MySQL, 6379 Redis,
+  #   #     5432 PostgreSQL, 27017 MongoDB, etc.)
+  #   #   expose_ignore_ports: suppress findings entirely
+  #   # expose_warn_ports: ["22/tcp", "3306/tcp"]
+  #   # expose_ignore_ports: ["443/tcp", "8080/tcp"]
   #   # Private-registry auth: name an env var, never paste a literal.
   #   # The runner / shell exports the value; argus reads it at scan time.
   #   registry_username_env: REGISTRY_USER

argus/core/schema.py

Lines changed: 37 additions & 1 deletion
@@ -38,6 +38,8 @@
     # Credential fields (either form: literal or <field>_env)
     "registry_username", "registry_password",
     "registry_username_env", "registry_password_env",
+    # Container exposure sub-scanner tuning
+    "expose_warn_ports", "expose_ignore_ports",
     # ZAP-specific tuning (decided in ADR-024)
     "api_spec", "rules_file", "cmd_options",
     "max_duration_minutes", "healthcheck_url",
@@ -81,7 +83,7 @@
 _CONTAINER_IMAGE_KEYS = {"image", "dockerfile", "context", "name", "cleanup"}

 # Sub-scanners argus scan container can dispatch to
-_CONTAINER_SUB_SCANNERS = {"trivy", "grype", "syft"}
+_CONTAINER_SUB_SCANNERS = {"trivy", "grype", "syft", "exposure"}


 class ConfigError:
@@ -215,6 +217,40 @@ def _validate_scanner(path: str, data: Any) -> list[ConfigError]:
                 f"Must be a positive integer, got {v!r}",
             ))

+    # Container exposure sub-scanner tuning — both lists must be
+    # lists of ``"PORT/PROTO"`` strings (protocol defaults to tcp
+    # when omitted; case-insensitive).
+    if scanner_name == "container":
+        for key in ("expose_warn_ports", "expose_ignore_ports"):
+            if key not in data:
+                continue
+            value = data[key]
+            if not isinstance(value, list):
+                errors.append(ConfigError(
+                    f"{path}.{key}",
+                    f"Must be a list of \"PORT/PROTO\" strings, "
+                    f"got {type(value).__name__}",
+                ))
+                continue
+            for entry in value:
+                if not isinstance(entry, str):
+                    errors.append(ConfigError(
+                        f"{path}.{key}",
+                        f"Entry must be a string \"PORT/PROTO\", "
+                        f"got {type(entry).__name__} ({entry!r})",
+                    ))
+                    continue
+                # Validate via the scanner's parser so the schema and
+                # the runtime agree on what's well-formed.
+                from argus.scanners.container import _parse_port_proto
+                if _parse_port_proto(entry) is None:
+                    errors.append(ConfigError(
+                        f"{path}.{key}",
+                        f"'{entry}' is not a valid PORT/PROTO entry. "
+                        "Expected '<port>/<tcp|udp|sctp>' (e.g. '22/tcp') "
+                        "or bare '<port>' which defaults to tcp.",
+                    ))
+
     # Warn on unknown keys (after credential / nested-block handling so
     # we don't double-warn on the keys we already validated).
     for key in data:
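The validation flow in this hunk (reject non-lists, then non-string entries, then malformed strings, collecting every error rather than stopping at the first) can be tried in isolation with the sketch below. It is an approximation under stated assumptions: `validate_port_list` and `is_valid_entry` are hypothetical names, errors are plain `(path, message)` tuples instead of `ConfigError` objects, and a regular expression stands in for the scanner's `_parse_port_proto`, so edge cases may differ from the real parser.

```python
import re

# Hypothetical stand-in for the scanner's parser: digits, optional
# "/tcp|udp|sctp", surrounding whitespace tolerated, case-insensitive.
_PORT_RE = re.compile(
    r"^\s*(\d{1,5})\s*(?:/\s*(tcp|udp|sctp)\s*)?$", re.IGNORECASE
)


def is_valid_entry(entry: str) -> bool:
    """Return True for strings like '22/tcp' or bare '22' within 1..65535."""
    match = _PORT_RE.match(entry)
    return bool(match) and 1 <= int(match.group(1)) <= 65535


def validate_port_list(path: str, key: str, value) -> list[tuple[str, str]]:
    """Collect (config path, message) pairs for every problem found."""
    where = f"{path}.{key}"
    if not isinstance(value, list):
        return [(where, f"Must be a list, got {type(value).__name__}")]
    errors = []
    for entry in value:
        if not isinstance(entry, str):
            errors.append(
                (where, f"Entry must be a string, got {type(entry).__name__} ({entry!r})")
            )
        elif not is_valid_entry(entry):
            errors.append((where, f"'{entry}' is not a valid PORT/PROTO entry"))
    return errors


ok = validate_port_list("scanners.container", "expose_warn_ports", ["22/tcp", "3306"])
not_a_list = validate_port_list("scanners.container", "expose_warn_ports", "22/tcp")
bad_entries = validate_port_list("scanners.container", "expose_ignore_ports", [22, "ssh"])
```

Reporting all bad entries in one pass means a single `argus validate` run surfaces every authoring mistake in the list.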

argus/scanners/container.py

Lines changed: 231 additions & 3 deletions
@@ -1,4 +1,4 @@
-"""Container scanner orchestrating Trivy, Grype, and Syft."""
+"""Container scanner orchestrating Trivy, Grype, Syft, and exposed-port surface."""

 import json
 import logging
@@ -14,6 +14,90 @@
 logger = logging.getLogger("argus")


+# ── Risky-default ports for the ``exposure`` sub-scanner ─────────────
+#
+# Services on this list are not vulnerabilities per se — the port being
+# *declared* via Dockerfile EXPOSE is itself harmless. The risk is that
+# these services historically ship with weak defaults (no-auth Redis,
+# unauthenticated PostgreSQL trust mode, SMB anonymous binding, etc.)
+# *and* are surprisingly often inherited by application images that
+# never actually intend to expose them — e.g. a base image that
+# EXPOSEs port 22 because openssh-server got pulled in as a transitive
+# dependency. The MEDIUM severity prompts a "did you mean to expose
+# this?" review without falsely implying a known CVE.
+#
+# Keys are ``(port, protocol)`` tuples; values are the service name
+# used in the finding title. Operators can override via
+# ``scanners.container.expose_warn_ports`` in argus.yml or suppress
+# any finding entirely via ``scanners.container.expose_ignore_ports``.
+#
+# Sources for each entry:
+# - 21/tcp, 23/tcp: cleartext protocols (FTP, Telnet) — categorically
+#   unsafe on any public network; CIS Docker Benchmark §5.8.
+# - 22/tcp: SSH in a container is a recurring image-inheritance leak
+#   (k8s.io/community#kubectl-exec-vs-ssh-in-pod discussion thread).
+# - 25/tcp, 110/tcp, 143/tcp: legacy mail protocols with cleartext
+#   auth in default configs.
+# - 161/udp: SNMPv1/v2 default community strings (``public``); CVE-1999-0517.
+# - 389/tcp: LDAP cleartext bind; 636/tcp (LDAPS) is the encrypted
+#   alternative and is not warned.
+# - 445/tcp: SMB; never appropriate from a containerized workload
+#   without an explicit reason.
+# - 3306/tcp (MySQL), 5432/tcp (PostgreSQL), 6379/tcp (Redis),
+#   9200/tcp (Elasticsearch), 11211/tcp (Memcached), 27017/tcp
+#   (MongoDB): default no-auth configurations. The Shodan
+#   "Unauthorized Database Access" reports cite these by name.
+# - 3389/tcp: RDP — same rationale as SSH plus auth-bypass CVE history.
+#
+# Adding a new entry requires citing a "why" in this comment block; the
+# rule exists to keep contributors from tuning the list blindly.
+RISKY_PORTS: dict[tuple[int, str], str] = {
+    (21, "tcp"): "FTP",
+    (22, "tcp"): "SSH",
+    (23, "tcp"): "Telnet",
+    (25, "tcp"): "SMTP",
+    (110, "tcp"): "POP3",
+    (143, "tcp"): "IMAP",
+    (161, "udp"): "SNMP",
+    (389, "tcp"): "LDAP",
+    (445, "tcp"): "SMB",
+    (3306, "tcp"): "MySQL",
+    (3389, "tcp"): "RDP",
+    (5432, "tcp"): "PostgreSQL",
+    (6379, "tcp"): "Redis",
+    (9200, "tcp"): "Elasticsearch",
+    (11211, "tcp"): "Memcached",
+    (27017, "tcp"): "MongoDB",
+}
+
+
+def _parse_port_proto(raw: str) -> tuple[int, str] | None:
+    """Parse a ``PORT/PROTO`` string into ``(port, protocol)``.
+
+    Accepts ``"22/tcp"`` (canonical), ``"22"`` (defaults to tcp),
+    ``" 22 / TCP "`` (whitespace + case tolerated). Returns ``None``
+    if the input doesn't parse — callers log + skip.
+    """
+    if not isinstance(raw, str):
+        return None
+    cleaned = raw.strip().lower().replace(" ", "")
+    if not cleaned:
+        return None
+    if "/" in cleaned:
+        port_str, proto = cleaned.split("/", 1)
+    else:
+        port_str, proto = cleaned, "tcp"
+    try:
+        port = int(port_str)
+    except ValueError:
+        return None
+    if port < 1 or port > 65535:
+        return None
+    if proto not in ("tcp", "udp", "sctp"):
+        return None
+    return (port, proto)
+
+
 class ContainerScanner:
     """Wraps Trivy, Grype, and Syft for container image scanning."""
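To see how the parsing rules in this hunk behave on concrete inputs, the snippet below runs a condensed standalone copy of the same logic. It is renamed `parse_port_proto` (without the leading underscore) purely so it runs without the argus package; the inputs are illustrative.

```python
def parse_port_proto(raw):
    """Condensed standalone copy of the parsing rules shown in the diff."""
    if not isinstance(raw, str):
        return None
    # Trim, lowercase, and drop interior spaces: " 22 / TCP " -> "22/tcp".
    cleaned = raw.strip().lower().replace(" ", "")
    if not cleaned:
        return None
    if "/" in cleaned:
        port_str, proto = cleaned.split("/", 1)
    else:
        port_str, proto = cleaned, "tcp"  # bare port defaults to tcp
    try:
        port = int(port_str)
    except ValueError:
        return None
    if not 1 <= port <= 65535 or proto not in ("tcp", "udp", "sctp"):
        return None
    return (port, proto)


canonical = parse_port_proto("22/tcp")
bare = parse_port_proto("6379")
messy = parse_port_proto(" 161 / UDP ")
rejected = [parse_port_proto(s) for s in ("0/tcp", "70000", "22/icmp", "http/tcp", "22/")]
```

Because both the schema validator and the scanner call the same parser, an entry that passes `argus validate` cannot later be silently dropped at scan time for being malformed.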

@@ -98,6 +182,13 @@ def scan(self, path: str, config: dict | None = None) -> ScanResult:
             )
             metadata["syft"] = syft_meta

+        if "exposure" in enabled:
+            exposure_findings, exposure_meta = self._scan_exposed_ports(
+                image_ref, config,
+            )
+            all_findings.extend(exposure_findings)
+            metadata["exposure"] = exposure_meta
+
         if not metadata:
             metadata["error"] = (
                 "None of the enabled scanners "
@@ -157,10 +248,147 @@ def parse_grype_results(self, raw_output_path: Path) -> list[Finding]:
     # ------------------------------------------------------------------

     def _enabled_scanners(self, config: dict) -> list[str]:
-        """Return list of enabled sub-scanner names from config."""
-        raw = config.get("scanners", "trivy,grype,syft")
+        """Return list of enabled sub-scanner names from config.
+
+        Default set covers vulnerability scanning (trivy, grype),
+        SBOM generation (syft), and attack-surface visibility
+        (exposure — declared Dockerfile EXPOSE ports). Disable any
+        of them explicitly via the ``scanners`` config key.
+        """
+        raw = config.get("scanners", "trivy,grype,syft,exposure")
         return [s.strip().lower() for s in raw.split(",") if s.strip()]

+    def _scan_exposed_ports(
+        self, image_ref: str, config: dict,
+    ) -> tuple[list[Finding], dict]:
+        """Read ``Config.ExposedPorts`` from the image manifest.
+
+        One ``Finding`` per declared port:
+        - severity INFO for ordinary application ports;
+        - severity MEDIUM for ports on the built-in ``RISKY_PORTS``
+          list (or the operator's override).
+        Config knobs:
+        ``scanners.container.expose_warn_ports`` – override the
+            built-in WARN list. Replaces the default; pass an empty
+            list to demote every declared port to INFO.
+        ``scanners.container.expose_ignore_ports`` – suppress findings
+            entirely for these ports (intended for ports the team
+            has explicitly accepted, e.g. their app's known 8080/tcp).
+        Both lists take ``"PORT/PROTO"`` strings.
+        """
+        from argus import container_runtime
+
+        rt = container_runtime.runtime_cmd()
+        if not container_runtime.is_available():
+            return [], {
+                "skipped": "no container runtime available — install Docker, "
+                "Podman, or nerdctl to enable exposed-port discovery",
+            }
+
+        # Ensure the image is present locally before inspecting.
+        # ``if-not-present`` is a fast cache hit when trivy/grype/syft
+        # already pulled the image in this scan run.
+        if not container_runtime.pull_image(image_ref, policy="if-not-present"):
+            return [], {
+                "error": f"could not pull or locate image {image_ref} for inspection",
+            }
+
+        result = subprocess.run(
+            [rt, "image", "inspect", image_ref],
+            capture_output=True, text=True,
+        )
+        if result.returncode != 0:
+            return [], {
+                "error": (
+                    f"docker inspect failed (rc={result.returncode}): "
+                    f"{result.stderr.strip()[:300]}"
+                ),
+            }
+
+        try:
+            inspected = json.loads(result.stdout)
+        except json.JSONDecodeError as exc:
+            return [], {"error": f"could not parse docker inspect output: {exc}"}
+
+        if not isinstance(inspected, list) or not inspected:
+            return [], {"error": "docker inspect returned no image entries"}
+
+        config_block = inspected[0].get("Config") or {}
+        exposed = config_block.get("ExposedPorts") or {}
+
+        # Resolve config-driven WARN-list override and ignore-list.
+        warn_override = config.get("expose_warn_ports")
+        if warn_override is not None:
+            # Operator-provided list REPLACES the built-in defaults.
+            warn_set = {
+                pp for raw in warn_override
+                if (pp := _parse_port_proto(raw)) is not None
+            }
+        else:
+            warn_set = set(RISKY_PORTS.keys())
+
+        ignore_set = {
+            pp for raw in (config.get("expose_ignore_ports") or [])
+            if (pp := _parse_port_proto(raw)) is not None
+        }
+
+        findings: list[Finding] = []
+        ignored_count = 0
+        for raw_port in sorted(exposed.keys()):
+            parsed = _parse_port_proto(raw_port)
+            if parsed is None:
+                logger.warning(
+                    "Skipping unparsable port reference '%s' in %s ExposedPorts",
+                    raw_port, image_ref,
+                )
+                continue
+            port, proto = parsed
+            if (port, proto) in ignore_set:
+                ignored_count += 1
+                continue
+
+            is_risky = (port, proto) in warn_set
+            service = RISKY_PORTS.get((port, proto))
+            severity = Severity.MEDIUM if is_risky else Severity.INFO
+            title_service = f" ({service})" if service else ""
+            description = (
+                f"Image declares EXPOSE for port {port}/{proto}{title_service}. "
+                + (
+                    "This is on the risky-defaults watchlist — services on "
+                    "this port have a history of weak default configurations. "
+                    "Confirm the container actually intends to listen here "
+                    "and that authentication/TLS is in front of it."
+                    if is_risky else
+                    "Declared exposed port — informational. No action required "
+                    "unless the port is unexpected for this image."
+                )
+            )
+            findings.append(
+                Finding(
+                    id=f"EXPOSE-{port}-{proto}",
+                    severity=severity,
+                    title=(
+                        f"Port {port}/{proto}{title_service} declared exposed"
+                    ),
+                    description=description,
+                    scanner=self.name,
+                    metadata={
+                        "port": port,
+                        "protocol": proto,
+                        "common_service": service or "",
+                        "risky": is_risky,
+                        "image_ref": image_ref,
+                    },
+                ),
+            )
+
+        return findings, {
+            "execution": "local-inspect",
+            "ports_declared": len(exposed),
+            "ports_reported": len(findings),
+            "ports_ignored": ignored_count,
+        }
+
     def _build_env(self, config: dict) -> dict[str, str]:
         """Build environment dict with optional registry credentials.
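The defensive JSON handling in `_scan_exposed_ports` (empty array, missing `Config` block, `ExposedPorts` set to null) can be exercised without a container runtime. In the sketch below, `SAMPLE` is a trimmed, hand-written approximation of `docker image inspect` output, and `exposed_ports` is a hypothetical helper that mirrors only the extraction step, not the full method.

```python
import json

# Hand-written, heavily trimmed sample; real inspect output carries many
# more fields. ExposedPorts maps "PORT/PROTO" keys to empty objects.
SAMPLE = """
[{"Id": "sha256:abc",
  "Config": {"ExposedPorts": {"8080/tcp": {}, "6379/tcp": {}}}}]
"""


def exposed_ports(inspect_stdout: str) -> list[str]:
    """Return the sorted ExposedPorts keys, or [] when none are declared."""
    inspected = json.loads(inspect_stdout)
    if not isinstance(inspected, list) or not inspected:
        return []
    # `or {}` covers both a missing key and an explicit JSON null.
    config_block = inspected[0].get("Config") or {}
    exposed = config_block.get("ExposedPorts") or {}
    return sorted(exposed)


ports = exposed_ports(SAMPLE)
null_ports = exposed_ports('[{"Config": {"ExposedPorts": null}}]')
no_config = exposed_ports("[{}]")
empty_array = exposed_ports("[]")
```

One detail worth noting: sorting the raw keys is a string sort, so `"10000/tcp"` orders before `"22/tcp"`. If numeric ordering of findings matters, sorting the parsed `(port, proto)` tuples instead would be one option.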
