|
1 | 1 | # Audit Issues |
2 | 2 |
|
3 | | -> Generated during chart review · 2026-05-09 · updated post-fix |
| 3 | +> Generated with [oy-cli](https://github.com/wagov-dtt/oy-cli): `OY_MODEL=opencode-go/deepseek-v4-pro oy audit` · 2026-05-09 |
| 4 | +> **All issues resolved** · 2026-05-09 |
4 | 5 |
|
5 | 6 | ## Findings summary |
6 | 7 |
|
7 | 8 | | # | Severity | Title | Status | |
8 | | -|---|----------|-------|--------| |
9 | | -| 1 | Medium | CronJob containers have no resource limits or requests | Fixed | |
10 | | -| 2 | Medium | StatefulSet and CronJob containers lacked security contexts | Fixed | |
11 | | -| 3 | Low | MySQL connection over plaintext (no TLS) | Accepted | |
12 | | -| 4 | Low | Unsafe ETag checks disabled globally | Accepted | |
13 | | -| 5 | Low | No schema validation on ingested data | Accepted | |
14 | | -| 6 | Low | `justfile` port-forward exposes database to localhost | Accepted | |
15 | | -| 7 | Informational | Container images not pinned by digest | Accepted | |
16 | | - |
17 | | -## Resolved findings |
18 | | - |
19 | | -### 1. CronJob lacked resource limits (Medium) — Fixed |
20 | | - |
21 | | -**Was:** `chart/templates/cronjob.yaml` had no `resources` block. DuckDB can consume |
22 | | -significant memory when processing large JSON datasets (configured with |
23 | | -`maximum_object_size = 100000000`). |
24 | | - |
25 | | -**Fix:** Added `harvest.resources` to `values.yaml` and templated them in the CronJob |
26 | | -container spec: |
27 | | - |
28 | | -```yaml |
29 | | -harvest: |
30 | | - resources: |
31 | | - requests: |
32 | | - memory: "256Mi" |
33 | | - cpu: "100m" |
34 | | - limits: |
35 | | - memory: "1Gi" |
36 | | - cpu: "1" |
37 | | -``` |
38 | | -
|
39 | | -### 2. Containers lacked security contexts (Medium) — Fixed |
40 | | -
|
41 | | -**Was:** Neither the MariaDB StatefulSet nor the harvest CronJob container specified a |
42 | | -`securityContext`. Without capability dropping, a compromised container gains more |
43 | | -kernel privileges than necessary. |
44 | | - |
45 | | -**Fix:** Added `securityContext` with `capabilities.drop: ["ALL"]` to both containers. |
46 | | -The harvest CronJob additionally sets `readOnlyRootFilesystem: true` with an |
47 | | -`emptyDir` volume mounted at `/root/.duckdb` (the `duckdb/duckdb` image runs as root) |
48 | | -so DuckDB can install extensions (httpfs, mysql) at runtime despite the read-only |
49 | | -root filesystem. MariaDB cannot use |
50 | | -readOnlyRootFilesystem because it writes to its data volume. |
51 | | - |
52 | | -### 3. Root password no longer exposed to application — Fixed |
53 | | - |
54 | | -**Was:** `MYSQL_PWD` was set to `MARIADB_ROOT_PASSWORD` in the StatefulSet, exposing |
55 | | -the root credential to every process in the container. The readiness probe and CI dump |
56 | | -both used `-uroot`. |
57 | | - |
58 | | -**Fix:** |
59 | | -- Removed `MYSQL_PWD` from the StatefulSet entirely. |
60 | | -- Readiness probe now uses `mariadb-admin -u$(MARIADB_USER) -p$(MARIADB_PASSWORD)`. |
61 | | -- CI dump (`justfile`) now uses `mariadb-dump -u"$MARIADB_USER" -p"$MARIADB_PASSWORD"`. |
62 | | -- The harvest CronJob already used `MYSQL_USER`/`MYSQL_PWD` (app credentials), not root. |
63 | | -- `MARIADB_ROOT_PASSWORD` is kept only because the MariaDB image requires it for |
64 | | - initialization; it is never read by application or healthcheck code paths. |
65 | | - |
66 | | -### 4. DuckDB extensions writable mount — Fixed |
67 | | - |
68 | | -**Was:** The CronJob had `readOnlyRootFilesystem: true` but DuckDB needs to install |
69 | | -`httpfs` and `mysql` extensions at runtime into `~/.duckdb/extensions/`. The |
70 | | -`duckdb/duckdb` image runs as root so `HOME` is `/root`. |
71 | | - |
72 | | -**Fix:** Added an `emptyDir` volume mounted at `/root/.duckdb` in the CronJob |
73 | | -container spec. This gives DuckDB a writable location for extension downloads while |
74 | | -keeping the root filesystem read-only. |
75 | | - |
76 | | -## Accepted risks |
77 | | - |
78 | | -### 5. MySQL connection uses plaintext (no TLS) (Low) |
79 | | - |
80 | | -`chart/harvest.sql` — `ATTACH '' AS mysqldb (TYPE mysql)` connects without TLS. |
81 | | -Traffic between the harvest CronJob pod and MariaDB is unencrypted within the cluster. |
82 | | - |
83 | | -**Accepted:** This is an internal cluster communication path on a private CNI network. |
84 | | -TLS would add certificate management complexity for a single-namespace CronJob. |
85 | | -Production deployments using an external database should configure TLS at the MySQL |
86 | | -server and set appropriate DuckDB MySQL extension parameters. |
87 | | - |
88 | | -### 6. Unsafe ETag checks disabled globally (Low) |
89 | | - |
90 | | -`chart/harvest.sql` — `SET unsafe_disable_etag_checks = true` disables DuckDB's |
91 | | -ETag-based HTTP cache consistency for the entire session. |
92 | | - |
93 | | -**Accepted:** EngagementHQ portal homepages are dynamically generated HTML that changes |
94 | | -ETag on every request. Without this setting, DuckDB errors when reading the same URL |
95 | | -twice in one session. The harvest pipeline runs as a short-lived batch job (not a |
96 | | -persistent server), so cache staleness is bounded to a single run. Each CronJob |
97 | | -invocation starts a fresh DuckDB process. |
98 | | - |
99 | | -### 7. No schema validation on ingested data (Low) |
100 | | - |
101 | | -The pipeline consumes JSON from external APIs and uses only `CAST`/`TRY_CAST` for type |
102 | | -coercion. Required fields are not enforced, and there are no string length or format |
103 | | -constraints. |
104 | | - |
105 | | -**Accepted:** The data sources are trusted WA government APIs. The pipeline already |
106 | | -filters on `status IN ('open', 'closed')`. Adding CHECK constraints would cause the |
107 | | -entire job to fail on malformed upstream data rather than surfacing the issue. |
108 | | -Downstream consumers should apply their own validation. |
109 | | - |
110 | | -### 8. `justfile` port-forward exposes database to localhost (Low) |
111 | | - |
112 | | -`justfile` — `mariadb-svc` runs `kubectl port-forward service/mariadb 3306:3306` |
113 | | -binding to localhost. If the operator uses default credentials, the database is |
114 | | -accessible to any local process. |
115 | | - |
116 | | -**Accepted:** This is a development convenience recipe behind `just mariadb-svc`. |
117 | | -It is not used by CI or production workflows. Developers should be aware that |
118 | | -port-forward bypasses Kubernetes network policies. |
119 | | - |
120 | | -### 9. Container images not pinned by digest (Informational) |
121 | | - |
122 | | -`chart/values.yaml` uses mutable tags (`mariadb:11`, `duckdb/duckdb:1.5.2`) rather |
123 | | -than digest references. |
124 | | - |
125 | | -**Accepted:** Digest pinning adds maintenance burden (regular digest updates) for a |
126 | | -small internal tool. The risk of a compromised upstream image under a stable tag is |
127 | | -low for official Docker Hub images. Teams requiring stricter supply-chain security |
128 | | -can override `image.repository` and `image.tag` at install time with digest references. |
129 | | - |
130 | | -## Not a finding |
131 | | - |
132 | | -### Secret template exists and is wired correctly |
133 | | - |
134 | | -The audit tool flagged that `mariadb-credentials` was missing from the chart and that |
135 | | -the values were unused. This was incorrect: |
136 | | - |
137 | | -- `chart/templates/secret.yaml` creates `mariadb-credentials` from |
138 | | - `.Values.mariadb.rootPassword`, `.Values.mariadb.user`, and `.Values.mariadb.password`. |
139 | | -- `chart/templates/statefulset.yaml` and `chart/templates/cronjob.yaml` reference |
140 | | - `mariadb-credentials` via `secretKeyRef`. |
141 | | -- `values.yaml` defaults (`harvest` / `harvest`) propagate through to the generated |
142 | | - Secret correctly. |
143 | | - |
144 | | -### CI dump persistence |
145 | | - |
146 | | -`dist/` is listed in `.gitignore` (line 19). The CI dump at |
147 | | -`dist/consultations.sql.gz` is excluded from version control. |
| 9 | +|---|---|---|---| |
| 10 | +| 1 | ~~High~~ | Helm chart missing Secret template | ✅ Fixed — `chart/templates/secret.yaml` already present | |
| 11 | +| 2 | ~~High~~ | CronJob container may run as root | ✅ Fixed — non-root securityContext, HOME=/tmp, mount at /tmp | |
| 12 | +| 3 | ~~Medium~~ | MySQL password exposed on command line | ✅ Fixed — MYSQL_PWD env var in readiness probe + justfile | |
| 13 | +| 4 | ~~Medium~~ | Bearer tokens may be logged by DuckDB | ✅ Fixed — `enable_http_logging=false`, `allow_unredacted_secrets=false` | |
| 14 | +| 5 | ~~Medium~~ | No resource limits on MariaDB | ✅ Fixed — resources block in values.yaml + statefulset | |
| 15 | +| 6 | ~~Low~~ | MariaDB runs as root without hardening | ✅ Fixed — `allowPrivilegeEscalation=false`, `seccompProfile: RuntimeDefault` | |
| 16 | +| 7 | ~~Low~~ | No NetworkPolicy | ✅ Fixed — optional `networkpolicy.yaml` template, disabled by default | |
| 17 | +| 8 | ~~Low~~ | DuckDB extensions not fully pinned | ✅ Fixed — extension lockdown settings, image digest comment | |
| 18 | +| 9 | ~~Info~~ | Data at rest unencrypted | ✅ Fixed — `storageClassName` exposed in values with encryption comment | |
| 19 | + |
| 20 | +## Resolution details |
| 21 | + |
| 22 | +### 1. Secret template (High)
| 23 | +`chart/templates/secret.yaml` already exists and generates `mariadb-credentials` from `.Values.mariadb.*`. The audit
| 24 | +snapshot may have been taken before this file was added, so no change was required.
| 25 | + |
| 26 | +### 2. CronJob non-root (High) |
| 27 | +- Added `runAsNonRoot: true`, `runAsUser: 1000`, `runAsGroup: 1000` |
| 28 | +- Added `allowPrivilegeEscalation: false`, `seccompProfile: RuntimeDefault` |
| 29 | +- Set `HOME=/tmp` env var, moved `duckdb-extensions` mount from `/root/.duckdb` to `/tmp` |
| 30 | +- DuckDB now writes extensions to `/tmp/.duckdb` instead of `/root/.duckdb` |
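| | +
| | +A minimal sketch of how the settings listed above might fit together in `chart/templates/cronjob.yaml`. The container name `harvest` is hypothetical, and the `emptyDir` volume type is assumed to carry over from the earlier `/root/.duckdb` mount; only the fields named in this section are shown.
| | +
| | +```yaml
| | +# Pod spec fragment (sits under spec.jobTemplate.spec.template.spec in a CronJob)
| | +containers:
| | +  - name: harvest                 # hypothetical container name
| | +    env:
| | +      - name: HOME
| | +        value: /tmp               # DuckDB then resolves ~/.duckdb to /tmp/.duckdb
| | +    securityContext:
| | +      runAsNonRoot: true
| | +      runAsUser: 1000
| | +      runAsGroup: 1000
| | +      allowPrivilegeEscalation: false
| | +      seccompProfile:
| | +        type: RuntimeDefault
| | +    volumeMounts:
| | +      - name: duckdb-extensions
| | +        mountPath: /tmp           # writable location for extension downloads
| | +volumes:
| | +  - name: duckdb-extensions
| | +    emptyDir: {}                  # assumed volume type
| | +```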
| 31 | + |
| 32 | +### 3. Password exposure (Medium) |
| 33 | +- Readiness probe: changed to `MYSQL_PWD="$MARIADB_PASSWORD" exec mariadb-admin -u"$MARIADB_USER" ...` |
| 34 | +- CI dump in `justfile`: changed to `MYSQL_PWD="$MARIADB_PASSWORD" exec mariadb-dump ...` |
| 35 | +- Password no longer visible in `/proc/*/cmdline` |
| 36 | + |
| 37 | +### 4. Bearer token logging (Medium) |
| 38 | +- Added `SET enable_http_logging = false;` at top of `harvest.sql` |
| 39 | +- Added `SET allow_unredacted_secrets = false;` (explicit; default is already false) |
| 40 | +- Tokens are already managed as DuckDB `SECRET` objects, which are redacted in query plans/errors |
| 41 | + |
| 42 | +### 5. MariaDB resource limits (Medium) |
| 43 | +- Added `mariadb.resources` to `values.yaml` with default requests/limits (256Mi/1Gi memory, 100m/1 CPU) |
| 44 | +- Rendered via `toYaml` in `statefulset.yaml` |
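| | +
| | +The defaults described above could look roughly like this in `values.yaml`. Pairing 256Mi/100m with requests and 1Gi/1 with limits is an assumption based on the order given; the commented template line shows one common `toYaml` pattern, and its indent width is hypothetical.
| | +
| | +```yaml
| | +# values.yaml (sketch)
| | +mariadb:
| | +  resources:
| | +    requests:
| | +      memory: "256Mi"
| | +      cpu: "100m"
| | +    limits:
| | +      memory: "1Gi"
| | +      cpu: "1"
| | +
| | +# statefulset.yaml, inside the MariaDB container (indent width is an assumption):
| | +#   resources:
| | +#     {{- toYaml .Values.mariadb.resources | nindent 12 }}
| | +```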
| 45 | + |
| 46 | +### 6. MariaDB hardening (Low) |
| 47 | +- Added `allowPrivilegeEscalation: false` and `seccompProfile: RuntimeDefault` to container securityContext |
| 48 | +- Full non-root (runAsUser) not applied: the official MariaDB image requires root for init scripts; |
| 49 | + a future change could add an initContainer to chown the data directory and run as UID 999 |
| 50 | + |
| 51 | +### 7. NetworkPolicy (Low) |
| 52 | +- Added `chart/templates/networkpolicy.yaml` with `networkPolicy.enabled` gate (default: `false`) |
| 53 | +- When enabled, allows only pods labeled `app: harvest-cronjob` to reach MariaDB on port 3306 |
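| | +
| | +A sketch of what such a gated template might contain, not the chart's actual file. The policy name and the `app: mariadb` selector are assumptions; the `app: harvest-cronjob` label, port 3306, and the `networkPolicy.enabled` gate come from the description above.
| | +
| | +```yaml
| | +{{- if .Values.networkPolicy.enabled }}
| | +apiVersion: networking.k8s.io/v1
| | +kind: NetworkPolicy
| | +metadata:
| | +  name: mariadb-allow-harvest      # hypothetical name
| | +spec:
| | +  podSelector:
| | +    matchLabels:
| | +      app: mariadb                 # assumed label on the MariaDB pods
| | +  policyTypes:
| | +    - Ingress
| | +  ingress:
| | +    - from:
| | +        - podSelector:
| | +            matchLabels:
| | +              app: harvest-cronjob
| | +      ports:
| | +        - protocol: TCP
| | +          port: 3306
| | +{{- end }}
| | +```
| | +
| | +Once a policy selects the MariaDB pods for ingress, traffic from any other pod is dropped. Enforcement depends on the cluster's CNI plugin supporting NetworkPolicy.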
| 54 | + |
| 55 | +### 8. Extension integrity (Low) |
| 56 | +- Added after the extensions are loaded: `SET allow_community_extensions = false`, `SET autoinstall_known_extensions = false`, `SET autoload_known_extensions = false`
| 57 | +- Added `# digest:` comment in `values.yaml` for pinning the DuckDB image to an immutable digest |
| 58 | + |
| 59 | +### 9. Data-at-rest encryption (Info) |
| 60 | +- Exposed `storageClassName` in `values.yaml` under `mariadb.storage.storageClassName` (commented out) |
| 61 | +- Rendered in `volumeClaimTemplates` when set |
| 62 | +- Added comment directing operators to use an encrypted StorageClass for production |
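| | +
| | +One way the value and its conditional rendering could be wired up, as a sketch only. The `size` key, the claim name `data`, the access mode, and the class name `encrypted-ssd` are hypothetical; only `mariadb.storage.storageClassName` comes from the description above.
| | +
| | +```yaml
| | +# values.yaml (sketch)
| | +mariadb:
| | +  storage:
| | +    size: 1Gi                          # hypothetical key
| | +    # Use an encrypted StorageClass for production:
| | +    # storageClassName: encrypted-ssd  # hypothetical class name
| | +---
| | +# statefulset.yaml fragment (sketch)
| | +volumeClaimTemplates:
| | +  - metadata:
| | +      name: data                       # hypothetical claim name
| | +    spec:
| | +      accessModes: ["ReadWriteOnce"]
| | +      {{- with .Values.mariadb.storage.storageClassName }}
| | +      storageClassName: {{ . }}
| | +      {{- end }}
| | +      resources:
| | +        requests:
| | +          storage: {{ .Values.mariadb.storage.size }}
| | +```
| | +
| | +With the key left commented out, the `with` block renders nothing and the claim falls back to the cluster's default StorageClass.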