helm: Add helm charts for deployment and deploy previews by jon-bell · Pull Request #741 · pawtograder/platform

jon-bell · 2026-04-29T10:37:42Z

Summary by CodeRabbit

New Features
- Added Kubernetes/Helm deployment support for self-hosted deployments with configurable components and persistent storage
- Added automated per-PR preview environments with automatic provisioning and teardown
Improvements
- Improved Docker build resilience with retry logic for dependency installation
- Enhanced CI/CD automation for image building, publishing, and chart release workflows
Tests
- Added comprehensive performance testing suite with realtime messaging and submission load scenarios
Chores
- Added utilities for secure key generation and environment configuration management
- Updated build configuration for Kubernetes chart and performance test packaging

The in-cluster runner has dind and the local GHA cache pre-wired, so docker builds finish faster and don't burn GitHub-hosted minutes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Staging-driven chart improvements - postgres: shared_preload_libraries cleanup, init scripts run as supabase_admin so pgsodium/vault initialise; pgsodium key mounted via secret + getkey_script GUC on both pgsodium and supabase_vault - realtime: AES-128 enc key sized correctly; SELF_HOST_TENANT_NAME matches Kong upstream; API_JWT_JWKS reads new EC-only JWT_REALTIME_JWKS bundle (Joken v2.5.0 can't accept oct JWK maps for HS256, so split) - rest: env ordering so PGRST_DB_URI sees POSTGRES_PASSWORD - auth: include oct key (verify-only) in GOTRUE_JWT_KEYS so HS256-signed ANON/SERVICE_ROLE keys still validate against asymmetric session JWKS - meta: add CRYPTO_KEY (matches Studio's PG_META_CRYPTO_KEY) so saved connection strings encrypt - studio: HOSTNAME=0.0.0.0, PG_META_CRYPTO_KEY, EMPTYDIR_FOR_SNIPPETS, + separate Ingress (templates/ingress-studio.yaml) with basic-auth - kong: env-rendering init container + worker count cap - edge-functions: optional E2E_ENABLE / E2E_MOCK_GITHUB / E2E_SECRET - migrations: wait-for-supabase-services init container so kong/realtime probes don't race the schema; runs as install resource not pre-install Per-PR preview infra - examples/values-preview.yaml: ceph-rbd postgres, synchronous_commit=off, single-replica services, MinIO storage; rest sized 1Gi (512Mi was OOM) - secrets-bootstrap-{rbac,script,job}.yaml: pre-install Helm hook that generates JWT/postgres/E2E secrets in-cluster on first install (no bao, no ESO required); idempotent — Helm re-installs are no-ops - .github/workflows/preview.yml: pull_request open/sync/closed + workflow_dispatch; builds web/edge-functions/migrations on the pawtograder-ci runner, deploys via the chart, comments URL via sticky comment, tears down on PR close Scripts - GenerateJwtKeys.ts: emit JWT_REALTIME_JWKS (EC-only) alongside the full JWT_PUBLIC_JWKS so realtime gets its own bundle - export-staging-env.sh: pull staging secrets from bao to .env.local for local dev (includes E2E_ENABLE / END_TO_END_SECRET / EDGE_FUNCTION_SECRET) Test - tests/e2e/grading.test.tsx: ! non-null assert on submission_id Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

workflow_dispatch needs the workflow on the default branch first; until helm is merged, allow direct pushes to helm to deploy a preview at pr-helm.preview.pawtograder.net so we can iterate without opening a PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Helm chart templates use Go-template directives + multi-document YAML that prettier's YAML parser can't handle (treats `---` as trailing content, `{{- ... }}` as malformed scalars). Reformat preview.yml. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Next.js inlines NEXT_PUBLIC_SUPABASE_ANON_KEY into the client bundle at build time, so the empty value we were passing made the build assert during prerender of the error/_not-found pages. And even if we slipped a placeholder in, runtime client requests would 401 against the autogenerated ANON_KEY in the cluster. Move key generation in front of the build: - New `keys` job runs `scripts/GenerateJwtKeys.ts --env`, masks each value, exposes them as job outputs (lower-cased) - build-web takes anon_key as a build arg - deploy disables `secrets.autogenerate` and passes every required value via `--set-string secrets.values.*` so build/runtime keys agree Chart support: - GenerateJwtKeys.ts --env now also emits PGSODIUM_ROOT_KEY, POSTGRES/PAWTOGRADER passwords, END_TO_END_SECRET, EDGE_FUNCTION_SECRET (matches what the in-cluster bootstrap script would have generated). - secrets.values now has slots for jwt.{realtimeJwks,realtimeEncKey, pgMetaCryptoKey,pgsodiumRootKey} and e2e.{endToEndSecret, edgeFunctionSecret}; templates/secrets.yaml renders all of them. - values-preview.yaml: edgeFunctions.envFromSecrets includes pawtograder-e2e (rendered now that secrets.create=true). Workflow secrets needed: KUBECONFIG_BASE64 (already set), MINIO_PREVIEW_ACCESS_KEY + MINIO_PREVIEW_SECRET_KEY (new — for the shared previews bucket on s3.talos.ripley.cloud). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GHA-scoped cache walls each branch off (only main is fallback for other branches). Switch web/edge-functions/migrations cache to type=registry against registry.work.ripley.cloud — the runner pods now seed ~/.docker/config.json with cluster-registry auth via an init container (see k8s/apps/arc/values-pawtograder-ci.yaml in the k8s repo), so no docker/login-action change is needed. Cache images: registry.work.ripley.cloud/pawtograder/{web,edge-functions,migrations}-buildcache:latest Cross-branch hits work without main-fallback gymnastics; pulls stay in-cluster instead of going to GHA's CDN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per-PR previews keep the same namespace across pushes, so generating fresh keys every time would rotate the DB password / anon key out from under the running pods on every commit. Change `keys` job: - First push to a PR: run GenerateJwtKeys.ts and emit fresh values. - Subsequent pushes: read the existing pawtograder-{jwt,postgres,e2e} Secrets from the namespace and emit the same values back. Either path feeds build-web (NEXT_PUBLIC_SUPABASE_ANON_KEY inlined) and deploy (`--set-string secrets.values.*`). Helm sees identical Secret contents on upgrade — idempotent. The migrations Job in the chart is already named with `-{{ .Release.Revision }}`, so a new Job runs every upgrade and only-new migrations are applied. The keys job now needs cluster read access; reuses KUBECONFIG_BASE64. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GH Actions strips job outputs whose values match a registered secret mask, so the previous keys-job-outputs approach lost everything via "Skip output 'anon_key' since it may contain secret" warnings, leaving the web build with NEXT_PUBLIC_SUPABASE_ANON_KEY=. Restructure: - rename `keys` -> `secrets`. Job ensures the namespace exists and creates pawtograder-{jwt,postgres,e2e,s3} via kubectl on first install. Re-runs leave existing Secrets alone (PR sync = same DB password / anon key persists). - build-web reads ANON_KEY straight from the namespace via kubectl, stashes in $GITHUB_ENV (single-job env, not subject to the cross-job output stripping), passes as build arg. - deploy uses helm `--set secrets.create=false` so the chart consumes the existing Secrets rather than rendering its own. Drops all the --set-string flags that were doing nothing because the masked-output values never made it across jobs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

talos.ripley.cloud:6443 isn't reachable from GH-hosted ubuntu-latest runners (LB is private). Switch secrets/deploy/destroy onto the in-cluster runner pool, which can hit the apiserver via hairpin NAT through the LB. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-29T10:37:49Z

Warning

Rate limit exceeded

@jon-bell has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 23 minutes and 5 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 38062b67-dad8-49fa-847b-0482073fe0fc

📥 Commits

Reviewing files that changed from the base of the PR and between 9fc091b and a1d4877.

📒 Files selected for processing (24)

.github/workflows/preview.yml
.github/workflows/release-images.yml
Dockerfile
charts/pawtograder/README.md
charts/pawtograder/examples/values-preview.yaml
charts/pawtograder/images/edge-functions/main.ts
charts/pawtograder/images/migrations/Dockerfile
charts/pawtograder/images/migrations/migrate.sh
charts/pawtograder/images/migrations/seed.sh
charts/pawtograder/templates/backup.yaml
charts/pawtograder/templates/edge-functions.yaml
charts/pawtograder/templates/ingress.yaml
charts/pawtograder/templates/postgres-statefulset.yaml
charts/pawtograder/templates/realtime.yaml
charts/pawtograder/templates/secrets-bootstrap-script.yaml
charts/pawtograder/templates/secrets.yaml
charts/pawtograder/templates/seed-job.yaml
charts/pawtograder/templates/supavisor.yaml
charts/pawtograder/values.yaml
scripts/GenerateJwtKeys.ts
scripts/export-preview-env.sh
scripts/export-staging-env.sh
tests/performance/db-tps-http.ts
tests/performance/k6-supabase.ts

Walkthrough

This PR introduces a comprehensive self-hosted deployment framework: a production-ready Helm chart for Pawtograder (with PostgreSQL, Supabase services, and supporting infrastructure), GitHub Actions workflows for per-PR preview deployments and release builds, performance testing infrastructure (k6 load tests for submissions and realtime), TypeScript utilities for cryptographic key generation, and enhanced Docker build resilience with retry logic. E2E testing support is wired throughout.

Changes

Cohort / File(s)	Summary
GitHub Actions CI/CD `.github/workflows/preview.yml`, `.github/workflows/release-images.yml`	New workflows for ephemeral per-PR preview environments (namespace provisioning, secret generation, Helm deployment, sticky PR comments) and release image/chart publishing to GHCR with multi-image tagging.
Helm Chart Core `charts/pawtograder/Chart.yaml`, `charts/pawtograder/.helmignore`, `charts/pawtograder/README.md`, `charts/pawtograder/values.yaml`, `charts/pawtograder/examples/values-preview.yaml`	Chart metadata, documentation, comprehensive default values, and preview-optimized example configuration with resource/replica tuning for ephemeral deployments.
Helm Chart Templates (Core Services) `charts/pawtograder/templates/postgres-.yaml`, `charts/pawtograder/templates/supavisor.yaml`, `charts/pawtograder/templates/auth.yaml`, `charts/pawtograder/templates/rest.yaml`, `charts/pawtograder/templates/realtime.yaml`, `charts/pawtograder/templates/kong.yaml`	Kubernetes manifests for PostgreSQL StatefulSet, connection pooling, authentication, REST API, realtime, and API gateway with complex initialization, secret injection, and clustering configuration.
Helm Chart Templates (Supporting Services) `charts/pawtograder/templates/migrations-job.yaml`, `charts/pawtograder/templates/web.yaml`, `charts/pawtograder/templates/storage.yaml`, `charts/pawtograder/templates/studio.yaml`, `charts/pawtograder/templates/meta.yaml`, `charts/pawtograder/templates/edge-functions.yaml`, `charts/pawtograder/templates/imgproxy.yaml`	Deployment templates for database migrations, web app, storage API, Studio backend, metadata service, edge functions runtime, and image proxy with environment wiring and persistence configuration.
Helm Chart Templates (Infrastructure & Security) `charts/pawtograder/templates/ingress.yaml`, `charts/pawtograder/templates/ingress-studio.yaml`, `charts/pawtograder/templates/secrets*.yaml`, `charts/pawtograder/templates/serviceaccount.yaml`, `charts/pawtograder/templates/_helpers.tpl`	Ingress routing, secret management (creation/bootstrap with cryptographic key generation), service account setup, and comprehensive helper functions for naming, networking, and resource templating.
Helm Chart Templates (Operations) `charts/pawtograder/templates/audit-partitions.yaml`, `charts/pawtograder/templates/backup.yaml`	Optional scheduled CronJobs for Postgres audit partition maintenance and S3-backed database backups with lifecycle policies.
Container Images `charts/pawtograder/images/edge-functions/Dockerfile`, `charts/pawtograder/images/edge-functions/main.ts`, `charts/pawtograder/images/migrations/Dockerfile`, `charts/pawtograder/images/migrations/migrate.sh`	New Docker images for edge-function runtime (Deno-based request router) and SQL migrations executor with schema initialization and idempotent execution tracking.
Docker & Build `Dockerfile`	Enhanced dependency installation with configurable npm network retries and exponential backoff (up to 3 attempts with 10-second delays) to tolerate transient download failures.
Performance & Load Testing `tests/performance/submissions-write-storm.ts`, `tests/performance/k6-supabase.ts`, `webpack.k6.config.js`, `package.json`	New k6 load test (`submissions-write-storm`) with concurrent realtime subscriber and submission-create/feedback-submit scenarios; generalized test helpers supporting multiple user roles and synthetic feedback generation; webpack configuration for test bundling.
E2E & Edge Function Testing `supabase/functions/autograder-create-submission/index.ts`, `supabase/functions/autograder-submit-feedback/index.ts`, `tests/e2e/grading.test.tsx`	E2E mode support in edge functions (conditional GitHub mock short-circuits); grading test fix enforcing non-null `submission_id` during polling.
Scripts & Configuration `scripts/GenerateJwtKeys.ts`, `scripts/export-staging-env.sh`, `.prettierignore`, `tsconfig.json`	Cryptographic key/secret generation script (EC P-256, HS256, symmetric keys) with multiple output formats (plain, env, Helm values, KV store); staging environment exporter from OpenBao; Prettier/TypeScript config exclusions for Helm and performance tests.

Sequence Diagram(s)

sequenceDiagram
    actor GH as GitHub (Event)
    participant GHA as GitHub Actions
    participant GHCR as GHCR Registry
    participant K8s as Kubernetes Cluster
    participant Helm as Helm Chart
    
    GH->>GHA: PR opened/updated
    GHA->>GHA: Derive preview ID & hostname
    GHA->>GHA: Compute image tags & K8s namespace
    GHA->>K8s: Create namespace
    GHA->>GHA: Generate secrets (JWT, passwords)
    GHA->>K8s: Apply secrets to namespace
    GHA->>GHA: Build web/edge-functions/migrations images
    GHA->>GHCR: Push images with preview tags
    GHA->>Helm: Helm install pawtograder chart
    Helm->>K8s: Apply manifests (Deployments, Services, StatefulSets)
    K8s->>K8s: Initialize Postgres, run migrations
    GHA->>K8s: Poll migrations job completion
    GHA->>GH: Post sticky PR comment (app/API links, kubectl snippets)

sequenceDiagram
    participant K6 as k6 Script
    participant Setup as Setup Phase
    participant Sub as Realtime Subscribers
    participant Storm as Submission Storm
    participant Feedback as Feedback Submitters
    participant API as Edge Functions / Database
    
    K6->>Setup: execute setup()
    Setup->>API: Create test class, assignments, users
    Setup->>API: Exchange magic links for tokens
    Setup-->>K6: Return StormSetupData
    
    par Concurrent Scenarios
        K6->>Sub: ramp realtime subscribers
        Sub->>API: Join Phoenix topics (class/user)
        Sub->>Sub: Measure fanout latency & message counts
    and
        K6->>Storm: ramp arrival submission requests
        Storm->>API: create-submission per iteration
        Storm->>API: submit-feedback (synthetic results)
        Storm->>Storm: Record durations & success rates
    end
    
    K6->>K6: Evaluate thresholds
    K6->>Setup: execute teardown()

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

The PR spans heterogeneous domains (CI/CD workflows, 2000+ lines of Helm infrastructure, performance test framework, edge function modifications, Docker build changes) with significant logic density in templating, Kubernetes resource configuration, k6 scenario design, and secret generation. Comprehensive validation of security contexts, environment variable injection, networking topology, persistence strategies, and conditional rendering across 40+ template files is required.

Possibly related PRs

Tolerate missing check runs at submission time #635: Modifies supabase/functions/autograder-create-submission/index.ts to add E2E mock submission handling alongside check-run fallback logic.
Fixes CI preview deployment to work with AWS + GitHub secrets #113: Modifies preview/deployment CI workflows and secret provisioning infrastructure in .github/workflows/, directly overlapping deployment strategy.
Empty submission detection #570: Modifies supabase/functions/autograder-create-submission/index.ts with E2E short-circuit changes and file-hash computation affecting the same submission creation flow.

Suggested labels

devops, testing

Poem

🚀 From GitHub to cluster, a chart takes flight,
Secrets auto-sealed, migrations burn bright,
Realtime subscribers storm the wire,
Edge functions answer without delay or tire,
Infrastructure blooms—predictable, clean, and right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 47.62% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main change: introduction of Helm charts for deployment and deploy previews, which is the primary focus of this large PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 0/1 reviews remaining, refill in 23 minutes and 5 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

+  const headerEncoded = b64url(JSON.stringify(header));
+  const payloadEncoded = b64url(JSON.stringify(payload));
+  const data = `${headerEncoded}.${payloadEncoded}`;
+  const sig = b64url(createHmac("sha256", secret).update(data).digest());


argos-ci · 2026-04-29T10:56:39Z

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build	Status	Details	Updated (UTC)
default (Inspect)	⚠️ Changes detected (Review)	126 changed, 1 failure	Apr 29, 2026, 3:23 PM

github-actions · 2026-04-29T11:02:29Z

🚀 Preview deployed for pr-741-9fc091b

App: https://pr-741.preview.pawtograder.net
API: https://api.pr-741.preview.pawtograder.net
Studio: kubectl -n pawtograder-preview-pr-741 port-forward svc/pawtograder-studio 3010:3000
Logs: kubectl -n pawtograder-preview-pr-741 logs -l app.kubernetes.io/instance=pawtograder

Cert may take ~1 min to issue. The preview tears down automatically when this PR closes.

jon-bell · 2026-04-29T14:27:14Z

@coderabbitai review

coderabbitai · 2026-04-29T14:27:21Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 20

🧹 Nitpick comments (15)

.prettierignore (1)

49-52: Consider whether the ignore pattern could be more specific.

The charts/ directory contains plain YAML files without Go templates (Chart.yaml, values.yaml, examples/values-preview.yaml) that could be formatted by Prettier, alongside the template files that would break the parser. A more targeted pattern like charts/*/templates/ would exclude only the template files while allowing plain YAML to be formatted.

That said, ignoring the entire charts/ directory is reasonable if Helm charts should be treated as a cohesive unit and left unformatted for consistency.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.prettierignore around lines 49 - 52, Update the .prettierignore entry that
currently ignores the entire charts/ directory to a more specific pattern so
plain chart YAML (e.g., Chart.yaml, values.yaml, examples/values-preview.yaml)
can be formatted while template files remain ignored; replace or augment the
existing "charts/" rule with a targeted pattern such as excluding only template
directories (for example use a pattern like charts/*/templates/ or similar) so
Prettier will still skip Helm templates but will allow non-template chart files
to be formatted.

charts/pawtograder/images/edge-functions/Dockerfile (1)

5-5: Pin the base image by digest for reproducible builds

supabase/edge-runtime:v1.69.5 can be retagged. Pinning with @sha256:... prevents drift across rebuilds.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/images/edge-functions/Dockerfile` at line 5, Update the
Dockerfile's base image reference so builds are reproducible by replacing the
mutable tag used in the FROM statement (currently
"supabase/edge-runtime:v1.69.5") with the corresponding immutable digest form
("supabase/edge-runtime@sha256:..."); locate the FROM line in the Dockerfile and
pin it to the exact sha256 digest for v1.69.5 (or the specific digest you
verify), then run a test rebuild to confirm the digest is correct and that the
image pulls successfully.

charts/pawtograder/images/edge-functions/main.ts (1)

31-32: Replace @ts-ignore with @ts-expect-error

Use the stricter directive so suppression fails loudly once typings are available.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/images/edge-functions/main.ts` around lines 31 - 32,
Replace the loose suppression comment above the EdgeRuntime.userWorkers.create
call: remove the existing //@ts-ignore and use //@ts-expect-error instead so
TypeScript will error once proper typings exist; ensure the `@ts-expect-error` is
placed immediately above the line invoking EdgeRuntime.userWorkers.create (the
same spot where the current suppression lives) and run type-check to verify no
unexpected failures.

scripts/GenerateJwtKeys.ts (2)

36-43: Prefer a vetted JOSE library over manual JWT construction

This path is security-critical, and hand-rolled JWT assembly/signing is brittle over time. Using jose reduces risk and maintenance burden.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/GenerateJwtKeys.ts` around lines 36 - 43, The signHS256 function
manually builds and signs JWTs which is brittle and unsafe; replace its manual
implementation with a vetted JOSE library (e.g., 'jose') by using the library's
SignJWT (or equivalent) to set the payload, call setProtectedHeader({ alg:
'HS256', kid }), and sign with the appropriate HMAC secret key so the function
returns the compact serialized JWT; update imports and tests to use the JOSE
signing flow and remove the manual b64url/HMAC assembly in signHS256.

82-85: safePass can return shorter passwords than intended

Removing /+= characters shrinks entropy/length variably before slicing. Use a URL-safe encoding that preserves deterministic length.

Proposed fix

-const safePass = (n: number) => randomBytes(n).toString("base64").replace(/[/+=]/g, "").slice(0, 32);
+const safePass = (n: number) => randomBytes(n).toString("base64url").slice(0, 32);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/GenerateJwtKeys.ts` around lines 82 - 85, safePass currently
base64-encodes random bytes then strips / + = characters which can make results
shorter than requested; update safePass to compute the number of random bytes
needed for a deterministic output length (bytes = Math.ceil(desiredLength * 3 /
4)), generate that many bytes, perform URL-safe base64 encoding by replacing '+'
with '-' and '/' with '_' and removing padding '=', then slice to the requested
length; apply this new safePass (used to produce postgresPassword and
pawtograderPassword) so the returned passwords always have the exact requested
length and use a URL-safe alphabet.

charts/pawtograder/templates/secrets.yaml (1)

43-43: Consider using a configurable secret name for consistency.

The e2e secret name is hardcoded as pawtograder-e2e while other secrets use .Values.secrets.names.*. This could cause confusion if someone tries to customize secret names.
♻️ Suggested refactor for consistency

Add to values.yaml under secrets.names:
secrets:
  names:
    e2e: pawtograder-e2e
Then update the template:
  metadata:
-   name: pawtograder-e2e
+   name: {{ .Values.secrets.names.e2e }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/secrets.yaml` at line 43, The secret name is
hardcoded as "pawtograder-e2e"; change it to use the chart value
(.Values.secrets.names.e2e) so the E2E secret follows the same configurable
pattern as other secrets. Add an e2e key under secrets.names in values.yaml
(e.g., secrets.names.e2e) and update the template that currently sets name:
pawtograder-e2e to reference .Values.secrets.names.e2e (with a sensible default
fallback if your chart uses defaulting logic).

charts/pawtograder/templates/migrations-job.yaml (1)

118-124: Use migrations scheduling values instead of postgres values.

Line 118 and Line 122 pass .Values.postgres into nodeSelector/tolerations helpers. That couples migration pod placement to postgres settings and blocks independent migrations scheduling.

Proposed diff

-      {{- with (include "pawtograder.nodeSelector" (dict "ctx" . "component" .Values.postgres)) }}
+      {{- with (include "pawtograder.nodeSelector" (dict "ctx" . "component" .Values.migrations)) }}
       nodeSelector:
         {{- . | nindent 8 }}
       {{- end }}
-      {{- with (include "pawtograder.tolerations" (dict "ctx" . "component" .Values.postgres)) }}
+      {{- with (include "pawtograder.tolerations" (dict "ctx" . "component" .Values.migrations)) }}
       tolerations:
         {{- . | nindent 8 }}
       {{- end }}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/migrations-job.yaml` around lines 118 - 124, The
nodeSelector and tolerations helpers in migrations-job are currently being
passed .Values.postgres which ties migration pod placement to Postgres; update
the two include calls that pass the dict with "component" .Values.postgres to
instead pass "component" .Values.migrations so pawtograder.nodeSelector and
pawtograder.tolerations receive the migrations scheduling values (change the
include invocations around pawtograder.nodeSelector and
pawtograder.tolerations).

charts/pawtograder/templates/meta.yaml (1)

34-76: Consider adding nodeSelector/tolerations support for meta as well.

Most other component deployments in this chart expose scheduling controls; meta currently does not. Adding the same helper blocks improves operability consistency.

Proposed diff

           resources:
             {{- toYaml .Values.meta.resources | nindent 12 }}
+      {{- with (include "pawtograder.nodeSelector" (dict "ctx" . "component" .Values.meta)) }}
+      nodeSelector:
+        {{- . | nindent 8 }}
+      {{- end }}
+      {{- with (include "pawtograder.tolerations" (dict "ctx" . "component" .Values.meta)) }}
+      tolerations:
+        {{- . | nindent 8 }}
+      {{- end }}
 {{- end }}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/meta.yaml` around lines 34 - 76, Add scheduling
controls for the meta Deployment by wiring nodeSelector, tolerations, and
affinity into the spec like other components: update the spec for the meta
container (in charts/pawtograder/templates/meta.yaml) to include the same helper
calls or value references used elsewhere (e.g., .Values.meta.nodeSelector /
.Values.meta.tolerations / .Values.meta.affinity or the chart helper templates
such as include "pawtograder.nodeSelector" and include
"pawtograder.tolerations") immediately under the spec (alongside
serviceAccountName and imagePullSecrets) so meta can be scheduled consistent
with the other components.

charts/pawtograder/images/migrations/migrate.sh (1)

43-44: Consider using mapfile for safer array population.

Shellcheck flags SC2207 here. While migration filenames following the <timestamp>_<name>.sql convention shouldn't contain problematic characters, using mapfile is more robust.

♻️ Proposed fix using mapfile

-IFS=$'\n' sorted=( $(printf '%s\n' "${files[@]}" | sort) )
-unset IFS
+mapfile -t sorted < <(printf '%s\n' "${files[@]}" | sort)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/images/migrations/migrate.sh` around lines 43 - 44, The
current population of the sorted array using IFS and command substitution
(IFS=$'\n' sorted=( $(printf '%s\n' "${files[@]}" | sort) ); unset IFS) is
flagged by ShellCheck SC2207; replace that pattern with a safer mapfile approach
by reading the sorted printf output into the array via mapfile -t sorted <
<(printf '%s\n' "${files[@]}" | sort) (keeping the same variables: files and
sorted) so array entries are not subject to word-splitting or globbing issues.

charts/pawtograder/templates/supavisor.yaml (1)

95-101: Consider adding a liveness probe.

The deployment has a readiness probe but no liveness probe. For a connection pooler, a liveness probe helps detect hung processes and triggers restarts.

♻️ Proposed liveness probe

           readinessProbe:
             tcpSocket:
               port: pool
             initialDelaySeconds: 10
             periodSeconds: 5
+          livenessProbe:
+            tcpSocket:
+              port: pool
+            initialDelaySeconds: 30
+            periodSeconds: 10
+            failureThreshold: 3
           resources:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/supavisor.yaml` around lines 95 - 101, Add a
livenessProbe next to the existing readinessProbe in the supavisor container
template to detect hung processes and trigger restarts; mirror the
readinessProbe tcpSocket port "pool" but configure a more conservative timing
(for example initialDelaySeconds ~30, periodSeconds ~10, failureThreshold ~3) so
transient startup doesn't cause restarts, and place it alongside readinessProbe
and resources in the supavisor template that references .Values.supavisor.

charts/pawtograder/templates/kong.yaml (2)

39-87: Kong container missing explicit security context.

The main Kong container lacks a securityContext. Adding one would align with the hardening applied to other workloads in this chart.

♻️ Proposed security context

         - name: kong
           image: {{ include "pawtograder.image" (dict "ctx" . "image" .Values.kong.image) }}
           imagePullPolicy: {{ .Values.kong.image.pullPolicy }}
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities: { drop: ["ALL"] }
           ports:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/kong.yaml` around lines 39 - 87, Add a
securityContext block to the Kong container definition (the container with name
"kong") to match the chart hardening: set runAsUser to a non-root UID,
runAsGroup, ensure readOnlyRootFilesystem: true, allowPrivilegeEscalation:
false, and drop all capabilities (with any needed adds explicitly listed), and
add seccompProfile: { type: RuntimeDefault } and fsGroup at the pod level if not
present; place the securityContext under the same container object that contains
ports/env/volumeMounts/readinessProbe so the container runs non-root and with
restricted privileges.

91-118: Add security context to initContainer and pin Alpine version.

The render-config initContainer lacks a security context and uses an unpinned alpine:3.20 tag. For consistency with the bootstrap job's hardening and reproducible builds, consider adding constraints and pinning the patch version.

♻️ Proposed hardening

       initContainers:
         - name: render-config
-          image: alpine:3.20
+          image: alpine:3.20.0
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities: { drop: ["ALL"] }
+            readOnlyRootFilesystem: false  # needs to write to /rendered
+            runAsNonRoot: true
+            runAsUser: 65532
           command:

Note: readOnlyRootFilesystem must remain false since the container writes to the mounted /rendered emptyDir.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/kong.yaml` around lines 91 - 118, The
initContainer "render-config" currently uses an unpinned image and has no
securityContext; update the image tag to a specific pinned patch (e.g.,
alpine:3.20.<patch>) and add a securityContext on the initContainer that
enforces least-privilege: runAsNonRoot: true with an explicit runAsUser (e.g.,
1000), drop all capabilities, set readOnlyRootFilesystem: false (must remain
false because /rendered is written), set a restrictive seccompProfile
(RuntimeDefault) and a minimal procMount if desired; apply these changes to the
initContainer definition named render-config so it matches the chart hardening
standards.

charts/pawtograder/templates/audit-partitions.yaml (1)

25-47: Missing resource limits and security context.

The psql container lacks resource requests/limits and security hardening. While this is a short-lived job, adding these improves consistency and cluster hygiene.

♻️ Proposed additions

             - name: psql
               image: {{ include "pawtograder.image" (dict "ctx" . "image" .Values.auditPartitions.image) }}
+              securityContext:
+                allowPrivilegeEscalation: false
+                capabilities: { drop: ["ALL"] }
+                readOnlyRootFilesystem: true
               command:
                 - psql
                 - -v
                 - ON_ERROR_STOP=1
                 - -c
                 - "SELECT public.audit_maintain_partitions();"
+              resources:
+                requests: { cpu: "50m", memory: "64Mi" }
+                limits:   { cpu: "200m", memory: "128Mi" }
               env:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/audit-partitions.yaml` around lines 25 - 47, The
psql container definition (container name "psql") is missing resource
requests/limits and a securityContext; add a resources block with sensible
CPU/memory requests and limits for a short-lived job and add a securityContext
on the container to enforce runAsNonRoot/runAsUser, set
allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, drop all
capabilities (and explicitly drop NET_RAW/ALL as appropriate), and set a
seccompProfile (RuntimeDefault) to harden execution; place these blocks under
the same container entry where "psql" is defined and ensure they follow Helm
templating conventions used elsewhere in the chart.

charts/pawtograder/templates/storage.yaml (1)

86-91: Minor: S3 env vars set even when backend=file.

REGION and GLOBAL_S3_BUCKET are always populated regardless of the storage.backend value. This is harmless (the storage-api ignores them when STORAGE_BACKEND=file), but you could wrap them in {{- if eq .Values.storage.backend "s3" }} for cleaner intent if desired.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/storage.yaml` around lines 86 - 91, REGION and
GLOBAL_S3_BUCKET env vars are emitted unconditionally even when storage.backend
is "file"; wrap the two env entries (the lines setting name: REGION and name:
GLOBAL_S3_BUCKET that use .Values.storage.s3.region and
.Values.storage.s3.bucket) in a conditional block using {{- if eq
.Values.storage.backend "s3" }} ... {{- end }} so they are only rendered for the
s3 backend.

charts/pawtograder/templates/secrets-bootstrap-script.yaml (1)

124-124: Realtime encryption key truncation may reduce entropy.

randomBytes(16).toString("base64") yields ~22 characters, but .slice(0, 16) discards the trailing chars. You're effectively keeping only 12 bytes of entropy (16 base64 chars encode 12 bytes). If the downstream consumer genuinely expects a 16-byte AES-128 key, consider:
-    const realtimeEncKey  = randomBytes(16).toString("base64").slice(0, 16);
+    const realtimeEncKey  = randomBytes(12).toString("base64");  // exactly 16 chars
Or, if it needs raw 16 bytes encoded differently, adjust accordingly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/pawtograder/templates/secrets-bootstrap-script.yaml` at line 124, The
current assignment to realtimeEncKey uses
randomBytes(16).toString("base64").slice(0, 16) which truncates the base64
string and reduces entropy; instead generate and preserve the full 16 raw bytes
and encode them in a way the consumer expects: either keep the full base64
encoding of randomBytes(16) (do not slice) or export the raw 16 bytes
(randomBytes(16)) and encode to hex/base64 at the contract boundary; update the
code that reads realtimeEncKey accordingly so the downstream consumer receives a
true 16-byte AES-128 key rather than a truncated string.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/preview.yml:
- Around line 157-160: The check currently only verifies pawtograder-jwt; change
the if condition that runs kubectl -n "$NS" get secret pawtograder-jwt to verify
the full set of namespace secrets by checking all four names (pawtograder-jwt,
pawtograder-postgres, pawtograder-e2e, pawtograder-s3) in a single kubectl get
call (e.g., kubectl -n "$NS" get secret pawtograder-jwt pawtograder-postgres
pawtograder-e2e pawtograder-s3 >/dev/null 2>&1) so the script only exits early
when every required secret exists.
- Around line 346-357: The "Wait for migrations to complete" step currently
swallows a timed-out kubectl wait by appending "|| { ... }" which prints a
warning but allows the job to succeed; update the logic so that kubectl -n "${{
needs.meta.outputs.namespace }}" wait --for=condition=Complete --timeout=10m job
-l app.kubernetes.io/component=migrations does not get masked on failure: if the
wait fails, print debugging info (kubectl get pods,jobs) and then exit non‑zero
(exit 1) so the workflow fails and no "Preview deployed" is reported; remove or
replace the trailing "|| { ... }" block accordingly.

In @.github/workflows/release-images.yml:
- Around line 89-120: The Resolve hostnames step currently falls back to staging
URLs when inputs.web_hostname/inputs.api_hostname are empty, causing tag/branch
builds to bake staging values; update that step to detect the event type
(GITHUB_EVENT_NAME) and only use staging defaults for workflow_dispatch or when
explicitly passed via inputs, otherwise set production URLs (e.g.,
https://pawtograder.net and https://api.pawtograder.net), then continue
exporting them as steps.hosts.outputs.web and .api so the Build & push web image
build-args (NEXT_PUBLIC_PAWTOGRADER_WEB_URL, NEXT_PUBLIC_SUPABASE_URL,
SUPABASE_URL) receive the correct hostnames for non-dispatch runs.
- Around line 61-68: The current else branch treats any non-tag ref the same,
causing pushes to the helm branch to be published under main-${short_sha} and
main-latest; update the conditional logic that sets version/is_tag/tags (the
GITHUB_REF check and the tags variable assignment) to distinguish
refs/heads/helm from refs/heads/main: if GITHUB_REF is refs/heads/helm, set
version and tags to use a helm-${short_sha} and helm-latest (or skip
publishing), otherwise keep the main-${short_sha}/main-latest behavior for
refs/heads/main; adjust the else into explicit branch cases so helm commits no
longer use main-* tags.

In `@charts/pawtograder/examples/values-preview.yaml`:
- Around line 94-98: There are two separate top-level storage: blocks so the
latter silently overwrites the former; merge them into one storage block that
includes storage.replicas and storage.resources (requests/limits) plus any
fields from the second storage block so nothing is lost — update the single
storage mapping (the storage: block that defines replicas, resources, and any
other storage-related keys) to contain the union of both definitions ensuring
storage.replicas and storage.resources are preserved.

In `@charts/pawtograder/images/edge-functions/main.ts`:
- Around line 14-26: serviceName is used directly to build servicePath allowing
path-traversal (e.g. ".."); validate and sanitize serviceName before
constructing servicePath by enforcing a strict pattern (e.g. /^[A-Za-z0-9_-]+$/)
or using a safe basename operation, reject requests that fail validation with a
400, and only then set servicePath = `/home/deno/functions/${serviceName}`;
update the logic around the URL parsing/serviceName check in main.ts so
validation happens prior to building servicePath.

In `@charts/pawtograder/images/migrations/migrate.sh`:
- Around line 50-58: The SELECT/INSERT lines interpolate filename-derived
variables ${version} and ${name} directly into SQL (in migrate.sh), risking SQL
injection; instead pass them to psql via --set/-v and use psql's literal-quoting
variable syntax so psql performs safe quoting: call psql with -v
version="${version}" -v name="${name}" for both the exists check (psql -tAc ...)
and the INSERT (psql -c ...), replace occurrences of '${version}' and the manual
name escaping (${name//\'/\'\'}) in the SQL with the psql-quoted forms (e.g. use
:'version' and :'name' in the SQL) so values are safely quoted by psql rather
than string-interpolated by the shell.

In `@charts/pawtograder/README.md`:
- Around line 54-63: Update the README secrets table so the `pawtograder-jwt`
entry lists all keys actually required by the chart: add `JWT_PRIVATE_JWKS`,
`JWT_REALTIME_JWKS`, `REALTIME_ENC_KEY`, and `PGSODIUM_ROOT_KEY` (in addition to
`JWT_SECRET`, `ANON_KEY`, `SERVICE_ROLE_KEY`), since `auth.yaml`,
`realtime.yaml`, and `postgres-statefulset.yaml` read those values; ensure the
table reflects the keys needed when `secrets.create=false` so deployments aren’t
left missing required secrets.
- Around line 64-65: Update the README.md reference to the new script name so
readers can find the JWT helper; replace the stale mention of
"scripts/generate-jwt.ts" with the actual script "scripts/GenerateJwtKeys.ts"
(or the correct casing used in the repo) where it instructs to generate ANON_KEY
and SERVICE_ROLE_KEY for JWT_SECRET so case-sensitive checkouts won't break.

In `@charts/pawtograder/templates/backup.yaml`:
- Around line 36-38: Replace the runtime download of mc (the apt-get/curl/chmod
block) by baking the mc binary into the image: remove the curl/download and
chmod commands from the backup job template and instead add mc to the container
image via your Dockerfile (or switch to a base image that already contains mc),
ensuring you fetch mc at build time and verify its integrity
(checksum/signature) during image build; update any references in the backup job
template (the block that currently runs "apt-get update && apt-get install -y
...", the "curl -fsSL ... -o /usr/local/bin/mc", and "chmod +x
/usr/local/bin/mc") so the job assumes mc is already present in the image.

In `@charts/pawtograder/templates/edge-functions.yaml`:
- Around line 71-77: The env var SUPABASE_DB_URL currently references
$(POSTGRES_PASSWORD) before POSTGRES_PASSWORD is declared, so move the
POSTGRES_PASSWORD env entry (the block with name: POSTGRES_PASSWORD and its
valueFrom: secretKeyRef) to appear before the SUPABASE_DB_URL entry so
Kubernetes will expand $(POSTGRES_PASSWORD) when constructing SUPABASE_DB_URL;
update the env list ordering in the edge-functions.yaml template where
SUPABASE_DB_URL and POSTGRES_PASSWORD are defined.

In `@charts/pawtograder/templates/ingress.yaml`:
- Around line 24-26: The ingress template adds TLS host entries for
.Values.ingress.extraHosts but doesn’t create corresponding HTTP rule blocks, so
those hostnames won’t route traffic; update the ingress template to iterate over
.Values.ingress.extraHosts (same list used in the TLS section) and create
matching rules entries (host + http.paths/backend/serviceName/servicePort or the
existing path structure) for each extraHost, or remove the extraHosts iteration
entirely if you don’t intend to expose routes for them.

In `@charts/pawtograder/templates/postgres-statefulset.yaml`:
- Around line 81-84: The StatefulSet always includes a volumeMount named "data"
under volumeMounts which is only created via volumeClaimTemplates when
persistence is enabled, so when persistence=false the Pod references a
non-existent volume; update the template logic around the volumeMounts and
volumes/volumeClaimTemplates to conditionally include the "data" volumeMount and
its corresponding volume/volumeClaimTemplates based on the persistence flag
(refer to the volumeMounts section containing "- name: data" and the
volumeClaimTemplates block) so that when persistence=false the "data" mount is
omitted or replaced with an emptyDir fallback.
- Around line 105-126: The readinessProbe and livenessProbe currently call
pg_isready as user "postgres" which doesn't exist until migrate.sh runs; change
those probe commands to check the bootstrap/init role instead (use the template
value that represents the initial DB user, e.g. replace the hardcoded "-
postgres" with "- {{ .Values.postgres.bootstrapUser }}" or the appropriate
.Values key that holds the bootstrap/init user), so readinessProbe and
livenessProbe use pg_isready -U <bootstrap-user> -d {{ .Values.postgres.database
}} until migrate.sh creates the final "postgres" user.

In `@charts/pawtograder/templates/realtime.yaml`:
- Around line 177-189: The branch that handles
.Values.realtime.spreadAcrossNodes currently calls {{- with (include
"pawtograder.affinity" (dict "ctx" . "component" .Values.realtime)) }} but that
pattern swallows the included template output; change it to capture and render
the included affinity template (e.g., assign the include result to a variable or
use an if to check non-empty) so the user-supplied pawtograder.affinity is
emitted and can override the podAntiAffinity block; target the include
invocation of "pawtograder.affinity" and the surrounding spreadAcrossNodes
conditional and ensure the rendered affinity is nindented to match the
surrounding YAML structure.

In `@charts/pawtograder/templates/supavisor.yaml`:
- Around line 71-90: The four environment variables SECRET_KEY_BASE,
VAULT_ENC_KEY, API_JWT_SECRET and METRICS_JWT_SECRET currently all reference the
same secret key JWT_SECRET; update the template in supavisor.yaml so each env
var uses its own secret key name (e.g., map SECRET_KEY_BASE ->
JWT_SECRET_KEY_BASE, VAULT_ENC_KEY -> VAULT_ENC_KEY, API_JWT_SECRET ->
API_JWT_SECRET, METRICS_JWT_SECRET -> METRICS_JWT_SECRET) and then add
corresponding keys in your Helm values (values.yaml) so each secret can supply
the correct length/format for createSecretRef lookups. Ensure you only change
the secretKeyRef.key names for those variables (not the env var names) so each
variable reads from its distinct secret key.

In `@Dockerfile`:
- Around line 9-12: The retry loop around `npm ci` currently masks failures
because the final `sleep 10` returns success; update the for-loop in the RUN
line that performs `for i in 1 2 3; do npm ci && break || { echo "npm ci attempt
$i failed; sleeping 10s"; sleep 10; }; done` so that if all attempts fail it
exits non‑zero (e.g., after the last sleep emit a failure message and `exit 1`
or propagate the failing status) ensuring the Docker build fails when `npm ci`
cannot succeed.

In `@scripts/export-staging-env.sh`:
- Around line 34-40: The emit() helper currently emits raw KEY=VALUE when mode
!= shell, which breaks on multiline secrets; update the else branch in emit() so
it safely serializes values by wrapping them in quotes and escaping newlines,
backslashes and quote characters (e.g., replace actual newlines with \n and
escape existing quotes/backslashes) before printing so multiline private keys
and other secrets produce a valid .env-style line; keep the shell branch
unchanged but ensure both branches use the same escaping routine referenced from
emit().

In `@tests/performance/db-tps-http.ts`:
- Line 17: The import currently uses a relative path string "./k6-supabase";
change that import to use the root alias by replacing "./k6-supabase" with
"@/tests/performance/k6-supabase" in the import statement in
tests/performance/db-tps-http.ts, then ensure TypeScript/TSConfig path aliases
are available (e.g., already configured) and run a quick type-check/build to
verify the alias resolves.

In `@tests/performance/k6-supabase.ts`:
- Around line 756-764: The JWT payload construction in buildE2EAutograderJwt
uses stringified numbers for run_id and run_attempt while
createSubmissionJwtToken uses numeric types, causing a type inconsistency;
update buildE2EAutograderJwt to assign run_id and run_attempt as numbers (not
String(...) or stringified) to match createSubmissionJwtToken, keeping the rest
of the payload and the encoding logic unchanged and referencing the run_id and
run_attempt fields in the payload and the buildE2EAutograderJwt function for the
change.

---

Nitpick comments:
In @.prettierignore:
- Around line 49-52: Update the .prettierignore entry that currently ignores the
entire charts/ directory to a more specific pattern so plain chart YAML (e.g.,
Chart.yaml, values.yaml, examples/values-preview.yaml) can be formatted while
template files remain ignored; replace or augment the existing "charts/" rule
with a targeted pattern such as excluding only template directories (for example
use a pattern like charts/*/templates/ or similar) so Prettier will still skip
Helm templates but will allow non-template chart files to be formatted.

In `@charts/pawtograder/images/edge-functions/Dockerfile`:
- Line 5: Update the Dockerfile's base image reference so builds are
reproducible by replacing the mutable tag used in the FROM statement (currently
"supabase/edge-runtime:v1.69.5") with the corresponding immutable digest form
("supabase/edge-runtime@sha256:..."); locate the FROM line in the Dockerfile and
pin it to the exact sha256 digest for v1.69.5 (or the specific digest you
verify), then run a test rebuild to confirm the digest is correct and that the
image pulls successfully.

In `@charts/pawtograder/images/edge-functions/main.ts`:
- Around line 31-32: Replace the loose suppression comment above the
EdgeRuntime.userWorkers.create call: remove the existing //@ts-ignore and use
//@ts-expect-error instead so TypeScript will error once proper typings exist;
ensure the `@ts-expect-error` is placed immediately above the line invoking
EdgeRuntime.userWorkers.create (the same spot where the current suppression
lives) and run type-check to verify no unexpected failures.

In `@charts/pawtograder/images/migrations/migrate.sh`:
- Around line 43-44: The current population of the sorted array using IFS and
command substitution (IFS=$'\n' sorted=( $(printf '%s\n' "${files[@]}" | sort)
); unset IFS) is flagged by ShellCheck SC2207; replace that pattern with a safer
mapfile approach by reading the sorted printf output into the array via mapfile
-t sorted < <(printf '%s\n' "${files[@]}" | sort) (keeping the same variables:
files and sorted) so array entries are not subject to word-splitting or globbing
issues.

In `@charts/pawtograder/templates/audit-partitions.yaml`:
- Around line 25-47: The psql container definition (container name "psql") is
missing resource requests/limits and a securityContext; add a resources block
with sensible CPU/memory requests and limits for a short-lived job and add a
securityContext on the container to enforce runAsNonRoot/runAsUser, set
allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, drop all
capabilities (and explicitly drop NET_RAW/ALL as appropriate), and set a
seccompProfile (RuntimeDefault) to harden execution; place these blocks under
the same container entry where "psql" is defined and ensure they follow Helm
templating conventions used elsewhere in the chart.

In `@charts/pawtograder/templates/kong.yaml`:
- Around line 39-87: Add a securityContext block to the Kong container
definition (the container with name "kong") to match the chart hardening: set
runAsUser to a non-root UID, runAsGroup, ensure readOnlyRootFilesystem: true,
allowPrivilegeEscalation: false, and drop all capabilities (with any needed adds
explicitly listed), and add seccompProfile: { type: RuntimeDefault } and fsGroup
at the pod level if not present; place the securityContext under the same
container object that contains ports/env/volumeMounts/readinessProbe so the
container runs non-root and with restricted privileges.
- Around line 91-118: The initContainer "render-config" currently uses an
unpinned image and has no securityContext; update the image tag to a specific
pinned patch (e.g., alpine:3.20.<patch>) and add a securityContext on the
initContainer that enforces least-privilege: runAsNonRoot: true with an explicit
runAsUser (e.g., 1000), drop all capabilities, set readOnlyRootFilesystem: false
(must remain false because /rendered is written), set a restrictive
seccompProfile (RuntimeDefault) and a minimal procMount if desired; apply these
changes to the initContainer definition named render-config so it matches the
chart hardening standards.

In `@charts/pawtograder/templates/meta.yaml`:
- Around line 34-76: Add scheduling controls for the meta Deployment by wiring
nodeSelector, tolerations, and affinity into the spec like other components:
update the spec for the meta container (in
charts/pawtograder/templates/meta.yaml) to include the same helper calls or
value references used elsewhere (e.g., .Values.meta.nodeSelector /
.Values.meta.tolerations / .Values.meta.affinity or the chart helper templates
such as include "pawtograder.nodeSelector" and include
"pawtograder.tolerations") immediately under the spec (alongside
serviceAccountName and imagePullSecrets) so meta can be scheduled consistent
with the other components.

In `@charts/pawtograder/templates/migrations-job.yaml`:
- Around line 118-124: The nodeSelector and tolerations helpers in
migrations-job are currently being passed .Values.postgres which ties migration
pod placement to Postgres; update the two include calls that pass the dict with
"component" .Values.postgres to instead pass "component" .Values.migrations so
pawtograder.nodeSelector and pawtograder.tolerations receive the migrations
scheduling values (change the include invocations around
pawtograder.nodeSelector and pawtograder.tolerations).

In `@charts/pawtograder/templates/secrets-bootstrap-script.yaml`:
- Line 124: The current assignment to realtimeEncKey uses
randomBytes(16).toString("base64").slice(0, 16) which truncates the base64
string and reduces entropy; instead generate and preserve the full 16 raw bytes
and encode them in a way the consumer expects: either keep the full base64
encoding of randomBytes(16) (do not slice) or export the raw 16 bytes
(randomBytes(16)) and encode to hex/base64 at the contract boundary; update the
code that reads realtimeEncKey accordingly so the downstream consumer receives a
true 16-byte AES-128 key rather than a truncated string.

In `@charts/pawtograder/templates/secrets.yaml`:
- Line 43: The secret name is hardcoded as "pawtograder-e2e"; change it to use
the chart value (.Values.secrets.names.e2e) so the E2E secret follows the same
configurable pattern as other secrets. Add an e2e key under secrets.names in
values.yaml (e.g., secrets.names.e2e) and update the template that currently
sets name: pawtograder-e2e to reference .Values.secrets.names.e2e (with a
sensible default fallback if your chart uses defaulting logic).

In `@charts/pawtograder/templates/storage.yaml`:
- Around line 86-91: REGION and GLOBAL_S3_BUCKET env vars are emitted
unconditionally even when storage.backend is "file"; wrap the two env entries
(the lines setting name: REGION and name: GLOBAL_S3_BUCKET that use
.Values.storage.s3.region and .Values.storage.s3.bucket) in a conditional block
using {{- if eq .Values.storage.backend "s3" }} ... {{- end }} so they are only
rendered for the s3 backend.

In `@charts/pawtograder/templates/supavisor.yaml`:
- Around line 95-101: Add a livenessProbe next to the existing readinessProbe in
the supavisor container template to detect hung processes and trigger restarts;
mirror the readinessProbe tcpSocket port "pool" but configure a more
conservative timing (for example initialDelaySeconds ~30, periodSeconds ~10,
failureThreshold ~3) so transient startup doesn't cause restarts, and place it
alongside readinessProbe and resources in the supavisor template that references
.Values.supavisor.

In `@scripts/GenerateJwtKeys.ts`:
- Around line 36-43: The signHS256 function manually builds and signs JWTs which
is brittle and unsafe; replace its manual implementation with a vetted JOSE
library (e.g., 'jose') by using the library's SignJWT (or equivalent) to set the
payload, call setProtectedHeader({ alg: 'HS256', kid }), and sign with the
appropriate HMAC secret key so the function returns the compact serialized JWT;
update imports and tests to use the JOSE signing flow and remove the manual
b64url/HMAC assembly in signHS256.
- Around line 82-85: safePass currently base64-encodes random bytes then strips
/ + = characters which can make results shorter than requested; update safePass
to compute the number of random bytes needed for a deterministic output length
(bytes = Math.ceil(desiredLength * 3 / 4)), generate that many bytes, perform
URL-safe base64 encoding by replacing '+' with '-' and '/' with '_' and removing
padding '=', then slice to the requested length; apply this new safePass (used
to produce postgresPassword and pawtograderPassword) so the returned passwords
always have the exact requested length and use a URL-safe alphabet.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9c045f75-cdcd-4c8b-83b2-344817e587c2

📥 Commits

Reviewing files that changed from the base of the PR and between 6840d3a and 9fc091b.

📒 Files selected for processing (50)

.github/workflows/preview.yml
.github/workflows/release-images.yml
.prettierignore
Dockerfile
charts/pawtograder/.helmignore
charts/pawtograder/Chart.yaml
charts/pawtograder/README.md
charts/pawtograder/examples/values-preview.yaml
charts/pawtograder/images/edge-functions/Dockerfile
charts/pawtograder/images/edge-functions/main.ts
charts/pawtograder/images/migrations/Dockerfile
charts/pawtograder/images/migrations/migrate.sh
charts/pawtograder/templates/NOTES.txt
charts/pawtograder/templates/_helpers.tpl
charts/pawtograder/templates/audit-partitions.yaml
charts/pawtograder/templates/auth.yaml
charts/pawtograder/templates/backup.yaml
charts/pawtograder/templates/edge-functions.yaml
charts/pawtograder/templates/imgproxy.yaml
charts/pawtograder/templates/ingress-studio.yaml
charts/pawtograder/templates/ingress.yaml
charts/pawtograder/templates/kong-config.yaml
charts/pawtograder/templates/kong.yaml
charts/pawtograder/templates/meta.yaml
charts/pawtograder/templates/migrations-job.yaml
charts/pawtograder/templates/postgres-config.yaml
charts/pawtograder/templates/postgres-statefulset.yaml
charts/pawtograder/templates/realtime.yaml
charts/pawtograder/templates/rest.yaml
charts/pawtograder/templates/secrets-bootstrap-job.yaml
charts/pawtograder/templates/secrets-bootstrap-rbac.yaml
charts/pawtograder/templates/secrets-bootstrap-script.yaml
charts/pawtograder/templates/secrets.yaml
charts/pawtograder/templates/serviceaccount.yaml
charts/pawtograder/templates/storage.yaml
charts/pawtograder/templates/studio.yaml
charts/pawtograder/templates/supavisor.yaml
charts/pawtograder/templates/web.yaml
charts/pawtograder/values.yaml
package.json
scripts/GenerateJwtKeys.ts
scripts/export-staging-env.sh
supabase/functions/autograder-create-submission/index.ts
supabase/functions/autograder-submit-feedback/index.ts
tests/e2e/grading.test.tsx
tests/performance/db-tps-http.ts
tests/performance/k6-supabase.ts
tests/performance/submissions-write-storm.ts
tsconfig.json
webpack.k6.config.js

Two CodeRabbit-flagged correctness fixes in the preview workflow: - Reuse-vs-generate gate now checks all four required Secrets (pawtograder-{jwt,postgres,e2e,s3}) instead of only pawtograder-jwt. Partial state is rejected loudly rather than silently auto-filled — regenerating a single secret in place would change keys/passwords for an in-flight preview without telling anyone. - "Wait for migrations" no longer swallows a kubectl wait timeout into a warning. A timed-out migrations Job means the schema didn't apply, which means the preview is broken; fail the deploy job so the comment-on-PR step doesn't post "Preview deployed" over a half-broken preview. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Addresses two CodeRabbit findings: - Branch tagging now distinguishes refs/heads/main from other branches. Pushes to non-main branches (helm today) get <branch>-${short_sha} and <branch>-latest instead of clobbering main-latest with unmerged branch content. Slashes in branch names are sanitized to dashes. - The web image is no longer built on tag (v*) or non-main branch pushes. NEXT_PUBLIC_* values are inlined into the client bundle by Next.js, so a non-staging build needs explicit URLs; the previous fallback silently baked staging URLs into release tags. The web build now runs only on workflow_dispatch (caller-supplied URLs) or push-to-main (intentional staging refresh). publish-chart no longer depends on build-web — chart references images by tag, and consumers override web.image.tag for their own environment with a workflow_dispatch-built image. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous retry loop ended on `sleep 10`, which always exits 0, so three consecutive `npm ci` failures still produced a successful image with broken dependencies. Track success explicitly and `test -eq 1` after the loop so the RUN step propagates a non-zero exit on total failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three CodeQL findings on this script are false positives that don't warrant changing the algorithm or contract: - js/insufficient-password-hash on createHmac("sha256"): HMAC-SHA256 is the JWT HS256 signing primitive (RFC 7518), not password hashing. - js/clear-text-logging on the POSTGRES_PASSWORD / PAWTOGRADER_PASSWORD console.log calls in --env mode: emitting fresh secrets to stdout for piping into bao or .env.local is the script's documented purpose. Add inline `// codeql[query-id]` annotations with one-line justifications so the next CodeQL scan picks them up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Raw `KEY=value` output broke for multiline secrets like GITHUB_PRIVATE_KEY_STRING and JWT_PRIVATE_JWKS — the embedded newlines split mid-value, producing an invalid `.env.local`. Quote the value and escape backslashes, double-quotes, CR, and LF so dotenv-style parsers (Next.js, dotenv) reconstruct the original. Apply to both the staging exporter and the new preview exporter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- edge-functions/main.ts: validate the request's first path segment against /^[a-zA-Z0-9_-]+$/ before composing servicePath, so inputs like ".." can't escape /home/deno/functions. - migrations/migrate.sh: pass filename-derived `version` and `name` to psql via --set + the :'var' literal-quote form rather than shell-interpolating into the SQL string. psql does the quoting, so unusual characters in filenames can't construct SQL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- db-tps-http.ts: import k6-supabase via the @/ root alias to match repo conventions. - k6-supabase.ts: drop the String() coercions on run_id / run_attempt in buildE2EAutograderJwt so its payload matches the numeric shape createSubmissionJwtToken already emits. Server-side handlers Number.parseInt either way, but this keeps the two helpers emitting identically-shaped tokens. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ions env order - Postgres readiness/liveness probes were running pg_isready -U postgres, but the file's own comment notes that the postgres role is created later by migrate.sh. On first boot probes failed until migrations finished, keeping the pod NotReady (or tripping liveness). Switch both probes to the bootstrap superuser supabase_admin. - The container always mounts `name: data` but `volumeClaimTemplates` is guarded on `postgres.persistence.enabled`. With persistence=false the pod referenced a volume that didn't exist. Add an emptyDir fallback so the StatefulSet starts (with a clear comment that it's ephemeral). - In templates/edge-functions.yaml, SUPABASE_DB_URL referenced $(POSTGRES_PASSWORD) before POSTGRES_PASSWORD was defined in the env list. Kubernetes only interpolates earlier env vars, so the password was emitted as the literal string $(POSTGRES_PASSWORD), breaking DB connections. Reorder to match the same pattern auth.yaml uses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

SECRET_KEY_BASE, VAULT_ENC_KEY, API_JWT_SECRET, and METRICS_JWT_SECRET all pointed at the same JWT_SECRET key. They have different length / role requirements (Phoenix endpoint key ≥ 64 chars, Vault encryption exactly 32 bytes, two unrelated JWT signers) and sharing a value either fails Vault init or lets a metrics scraper forge a tenant-management token. Add four dedicated keys to the JWT secret bundle (SUPAVISOR_{SECRET_KEY_BASE,VAULT_ENC_KEY,API_JWT_SECRET,METRICS_JWT_SECRET}) and wire them through: - templates/supavisor.yaml — point each env var at its own key - templates/secrets.yaml — render all four when supavisor.enabled=true - templates/secrets-bootstrap-script.yaml — generate them with the correct lengths in the autogenerate path - values.yaml — add empty placeholders to secrets.values.jwt Existing pawtograder-jwt secrets must be extended with these four keys when supavisor is enabled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ltime affinity, pin mc download - examples/values-preview.yaml had two top-level storage: blocks. YAML doesn't merge duplicate keys; the second silently overwrote the first, losing storage.replicas and storage.resources. Merge into a single block. - templates/ingress.yaml listed ingress.extraHosts in the TLS SANs but emitted no matching rule entries, so requests to those hostnames hit the ingress controller's default backend (404). Emit the same path set as the primary host for each extraHost. - templates/realtime.yaml had an empty `with` block in the spreadAcrossNodes branch that purported to let user-supplied affinity override the default anti-affinity, but the block discarded the included template's output and rendered nothing. Restructure so user-supplied component or global affinity wins, falling back to the spread default when neither is set. - templates/backup.yaml downloaded the mc binary at runtime with no integrity check. Pre-baking mc into the image is out of scope (the image is supabase/postgres, not chart-controlled), so pin URL + SHA256 together via backup.mc.{url,sha256} and verify before chmod/exec. Tampered downloads now fail before any S3 credentials are touched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…path The required-secrets table only listed JWT_SECRET, ANON_KEY, and SERVICE_ROLE_KEY for pawtograder-jwt, but auth.yaml, realtime.yaml, postgres-statefulset.yaml, storage.yaml, rest.yaml, meta.yaml, and studio.yaml also read JWT_PRIVATE_JWKS, JWT_PUBLIC_JWKS, JWT_REALTIME_JWKS, REALTIME_ENC_KEY, PG_META_CRYPTO_KEY, and PGSODIUM_ROOT_KEY (plus the SUPAVISOR_* keys when supavisor is on). Operators following the table verbatim with secrets.create=false ended up with a deployment that crash-loops on first start. Spell out every key the chart consumes and what it's for. Also: the README pointed at scripts/generate-jwt.ts but the actual script is scripts/GenerateJwtKeys.ts — broken on case-sensitive filesystems. values.yaml: backup.mc placeholder added in the previous commit gets a clarifying comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per-PR previews come up with an empty schema today — login works but there's nothing to look at. supabase/seed.sql already creates a Demo Class (id=1) plus canned auth.users / assignments / rubrics / submissions that the rest of the app expects to exist for the playground. Bake seed.sql + a small seed.sh wrapper into the migrations image, then add templates/seed-job.yaml as a post-install,post-upgrade Helm hook gated on `seed.enabled` (default false). The hook fires after migrations and supabase services are ready (post-install runs after all install resources reconcile, including the migrations Job). seed.sh is idempotent: it checks `SELECT 1 FROM public.classes WHERE id=1 AND is_demo=true` and exits 0 if present. PR-sync upgrades just re-fire the hook and skip immediately, so user-created data on top of the seed survives. Re-running seed.sql blindly would conflict on the classes.id=1 PK. values-preview.yaml flips seed.enabled=true. Production / staging keep the default-off so they never auto-create a demo class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

+  // The whole point of --env mode is to print fresh secrets to stdout so the
+  // operator can pipe them into bao or .env.local. Suppress CodeQL's
+  // clear-text-logging rule here — emitting these is the script's contract.
+  console.log(`POSTGRES_PASSWORD=${postgresPassword}`); // codeql[js/clear-text-logging]


+  // operator can pipe them into bao or .env.local. Suppress CodeQL's
+  // clear-text-logging rule here — emitting these is the script's contract.
+  console.log(`POSTGRES_PASSWORD=${postgresPassword}`); // codeql[js/clear-text-logging]
+  console.log(`PAWTOGRADER_PASSWORD=${pawtograderPassword}`); // codeql[js/clear-text-logging]


jon-bell and others added 15 commits April 28, 2026 01:57

Add draft helm charts

17866ff

ci: trigger release-images on helm branch too

1933fcf

ci: run release-images jobs on pawtograder-ci runner

0549f84

The in-cluster runner has dind and the local GHA cache pre-wired, so docker builds finish faster and don't burn GitHub-hosted minutes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lint

acfb156

.

b7d0118

Prep for submissions load tests

214784d

Merge remote-tracking branch 'origin/submissions-load-test' into helm

dbf8d2f

github-advanced-security AI found potential problems Apr 29, 2026

View reviewed changes

coderabbitai Bot requested changes Apr 29, 2026

View reviewed changes

jon-bell and others added 8 commits April 29, 2026 14:43

jon-bell and others added 4 commits April 29, 2026 14:47

github-advanced-security AI found potential problems Apr 29, 2026

View reviewed changes

Conversation

jon-bell commented Apr 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

Uh oh!

argos-ci Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

jon-bell commented Apr 29, 2026

Uh oh!

coderabbitai Bot commented Apr 29, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jon-bell commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading

argos-ci Bot commented Apr 29, 2026 •

edited

Loading