|
| 1 | +# Preview Environments |
| 2 | + |
| 3 | +Every pull request opened against `main` can spin up an ephemeral, fully-deployed instance of omni-pitcher on the `homerun2-dev` Kubernetes cluster — alongside redis-stack so reviewers can `curl /pitch` and watch events land in the stream. The environment lives for as long as the PR is open and tears down automatically on merge or close. |
| 4 | + |
| 5 | +omni-pitcher was the pilot for the homerun2 PR-preview rollout; this is the original shape that core-catcher and scout now mirror. |
| 6 | + |
| 7 | +This page covers how to use it, what each PR gets, the components that make it work, and how to troubleshoot. |
| 8 | + |
| 9 | +## Quick start |
| 10 | + |
| 11 | +1. Open a PR against `main`. |
| 12 | +2. Add the `preview` label: `gh pr edit <num> --add-label preview`. |
| 13 | +3. Wait 5–10 minutes for the image build, the kustomize-OCI push, and Argo's PullRequest generator poll (every 600s). |
| 14 | +4. The preview-bot leaves a sticky comment on the PR with the URL. |
| 15 | + |
| 16 | +Closing or merging the PR tears the namespace down automatically. |
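
The 5–10 minute wait in step 3 can be scripted rather than watched. The following is a minimal sketch of a generic polling helper; `wait_until` is a hypothetical name, not something shipped in this repo, and the probe command is left to the caller.

```shell
# wait_until TIMEOUT_S INTERVAL_S CMD [ARGS...]
# Hypothetical helper: re-runs CMD until it exits 0 or TIMEOUT_S seconds
# have elapsed. Returns 0 on success, 1 on timeout.
wait_until() {
  _wu_timeout=$1
  _wu_interval=$2
  shift 2
  _wu_deadline=$(( $(date +%s) + _wu_timeout ))
  until "$@"; do
    # Give up once the deadline has passed.
    if [ "$(date +%s)" -ge "$_wu_deadline" ]; then
      return 1
    fi
    sleep "$_wu_interval"
  done
  return 0
}
```

One possible use is `wait_until 900 30 curl -s -o /dev/null https://omni-pr-<num>.homerun2-dev.sthings-vsphere.labul.sva.de/` with the real PR number substituted. Note that `curl -s` exits 0 on any HTTP response, so this only shows that something answers at that hostname, not that `/pitch` is healthy.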
| 17 | + |
| 18 | +## What you get per PR |
| 19 | + |
| 20 | +Each preview lives in its own namespace: `homerun2-omni-pitcher-pr-<num>` on `homerun2-dev`. The namespace contains: |
| 21 | + |
| 22 | +| Workload | Purpose | |
| 23 | +|--|--| |
| 24 | +| `homerun2-omni-pitcher` | The system under test (this PR's commit) | |
| 25 | +| `redis-stack` | The bus omni-pitcher writes into; persistence disabled (ephemeral) | |
| 26 | +| `seed-test-events` (one-shot Job) | Posts a 5-event fixture to omni-pitcher right after the Deployment becomes Ready, so the stream is non-empty on first inspection | |
| 27 | + |
| 28 | +omni-pitcher is the SUT and has no co-tenants — unlike scout and core-catcher previews, there's no upstream/downstream component to pair with. Reviewers exercise it directly via `curl`. |
| 29 | + |
| 30 | +Reachable at: `https://omni-pr-<num>.homerun2-dev.sthings-vsphere.labul.sva.de` |
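
The per-PR names all derive from the PR number (and, for the artifact tag, the head SHA). As a rough sketch, the naming scheme described above could be captured in a few shell helpers; the function names are hypothetical and exist only for illustration.

```shell
# Hypothetical helpers mirroring the naming scheme on this page.

# Namespace for a PR number, e.g. 42 -> homerun2-omni-pitcher-pr-42
preview_namespace() {
  printf 'homerun2-omni-pitcher-pr-%s\n' "$1"
}

# Hostname the preview is exposed under for a PR number.
preview_host() {
  printf 'omni-pr-%s.homerun2-dev.sthings-vsphere.labul.sva.de\n' "$1"
}

# Artifact tag for a PR number and head SHA: pr-<num>-<sha>
preview_tag() {
  printf 'pr-%s-%s\n' "$1" "$2"
}
```

These make it easier to parameterize other commands, for example `kubectl -n "$(preview_namespace 42)" get pods`.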
| 31 | + |
| 32 | +## Why the `preview` label gate |
| 33 | + |
| 34 | +Without the label, every renovate / dependabot dep-bump PR would spawn a namespace. Two problems: |
| 35 | + |
| 36 | +- Branches predating the build-pr workflow have no `pr-<num>-<sha>` image or kustomize artifacts published, so their previews would come up as half-empty namespaces with sync errors. |
| 37 | +- Bots open dozens of PRs per week; the preview infrastructure isn't built for that scale. |
| 38 | + |
| 39 | +Human-opened PRs opt in via the label. Bots don't apply it, so they're excluded by default. The Argo AppSet's PullRequest generator filters on `labels: [preview]`. |
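
To check locally whether a PR would pass the gate, one option is to test its label list for an exact `preview` entry. This is only a sketch that mirrors the filter's intent; it is not how Argo evaluates it, and `has_preview_label` is a hypothetical name.

```shell
# Hypothetical helper: reads label names from stdin, one per line,
# and exits 0 only if one of them is exactly "preview".
has_preview_label() {
  grep -Fxq 'preview'
}
```

It can be fed from the GitHub CLI, for example `gh pr view <num> --json labels --jq '.labels[].name' | has_preview_label && echo "preview enabled"`.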
| 40 | + |
| 41 | +## The flow, end to end |
| 42 | + |
| 43 | +``` |
| 44 | +git push (PR opens) |
| 45 | +  ├─► comment-preview-url.yaml  ─► sticky bot comment with URL |
| 46 | +  ├─► build-scan-image.yaml     ─► ko-built image at ghcr.io/.../homerun2-omni-pitcher:pr-<num>-<sha> |
| 47 | +  ├─► push-kustomize-pr.yaml    ─► kustomize OCI at ghcr.io/.../homerun2-omni-pitcher-kustomize:pr-<num>-<sha> |
| 48 | +  └─► build-test.yaml + lint    ─► CI gates |
| 49 | + |
| 50 | +Argo PullRequest generator (poll every 600s) |
| 51 | +  └─► detects PR with `preview` label |
| 52 | +        └─► renders parent Application `homerun2-omni-pitcher-pr-<num>` in argocd ns |
| 53 | +              └─► chart emits child Applications targeting `homerun2-omni-pitcher-pr-<num>` ns |
| 54 | +                    on the homerun2-dev cluster |
| 55 | + |
| 56 | +Kyverno ClusterPolicies (auto-fire on namespace create) |
| 57 | +  ├─► generate ResourceQuota + LimitRange |
| 58 | +  ├─► generate 3 ExternalSecrets → ESO materializes Secrets from Vault |
| 59 | +  └─► generate one-shot seed Job (posts fixture after Deployment Ready) |
| 60 | + |
| 61 | +PR close |
| 62 | +  ├─► AppSet drops the entry → finalizer cascade prunes child Apps + workloads |
| 63 | +  ├─► cleanup-pr-artifacts.yaml deletes both ghcr.io packages |
| 64 | +  └─► Kyverno ClusterCleanupPolicy reaps any empty namespace shell left behind |
| 65 | +``` |
| 66 | + |
| 67 | +## The four PR-preview workflows in this repo |
| 68 | + |
| 69 | +All four are in `.github/workflows/` and trigger on `pull_request` events targeting `main`. |
| 70 | + |
| 71 | +| Workflow | Trigger | Output | |
| 72 | +|--|--|--| |
| 73 | +| `build-scan-image.yaml` | PR opened/updated | ko-built image tagged `pr-<num>-<sha>` + `pr-<num>` | |
| 74 | +| `push-kustomize-pr.yaml` | PR opened/updated | kustomize OCI tagged `pr-<num>-<sha>` (renders `kcl/main.k` against `tests/kcl-deploy-profile.yaml`) | |
| 75 | +| `comment-preview-url.yaml` | PR opened/reopened | Sticky comment with URL, namespace, ArgoCD link. Thin caller of `stuttgart-things/github-workflow-templates/.github/workflows/call-comment-preview-url.yaml` | |
| 76 | +| `cleanup-pr-artifacts.yaml` | PR closed | Deletes both ghcr.io packages so version histories don't fill with PR debris | |
| 77 | + |
| 78 | +All four delegate to reusable workflows in `stuttgart-things/github-workflow-templates`. |
| 79 | + |
| 80 | +## The Argo AppSet, briefly |
| 81 | + |
| 82 | +Lives at `stuttgart-things/stuttgart-things` under `clusters/labul/vsphere/platform-sthings/argocd/homerun2-dev/omni-pitcher-pr-preview-appset.yaml`. The shape: |
| 83 | + |
| 84 | +```yaml |
| 85 | +apiVersion: argoproj.io/v1alpha1 |
| 86 | +kind: ApplicationSet |
| 87 | +metadata: |
| 88 | +  name: homerun2-omni-pitcher-pr-preview |
| 89 | +  namespace: argocd |
| 90 | +spec: |
| 91 | +  generators: |
| 92 | +    - pullRequest: |
| 93 | +        github: |
| 94 | +          owner: stuttgart-things |
| 95 | +          repo: homerun2-omni-pitcher |
| 96 | +          tokenRef: { secretName: homerun2-omni-pitcher-pat, key: token } |
| 97 | +          labels: [preview]  # the gate |
| 98 | +        requeueAfterSeconds: 600  # poll cadence |
| 99 | +  template: |
| 100 | +    metadata: |
| 101 | +      name: 'homerun2-omni-pitcher-pr-{{ .number }}' |
| 102 | +      finalizers: [resources-finalizer.argocd.argoproj.io]  # cascade on prune |
| 103 | +    spec: |
| 104 | +      source: |
| 105 | +        repoURL: https://github.com/stuttgart-things/argocd.git |
| 106 | +        path: apps/homerun2/install |
| 107 | +        helm: |
| 108 | +          valuesObject: |
| 109 | +            destination: |
| 110 | +              name: homerun2-dev |
| 111 | +              namespace: 'homerun2-omni-pitcher-pr-{{ .number }}' |
| 112 | +            omniPitcher: |
| 113 | +              enabled: true |
| 114 | +              version: 'pr-{{ .number }}-{{ .head_sha }}' |
| 115 | +              hostname: 'omni-pr-{{ .number }}.homerun2-dev.sthings-vsphere.labul.sva.de' |
| 116 | +              inlineHttpRoute: true  # Option B, see below |
| 117 | +            redisStack: |
| 118 | +              enabled: true |
| 119 | +              persistence: { enabled: false } |
| 120 | +              auth: { existingSecret: redis-stack-auth } |
| 121 | +            # all other components off |
| 122 | +            httpRoute: |
| 123 | +              enabled: true |
| 124 | +              gateway: { name: homerun2-dev-gateway, namespace: default } |
| 125 | +      syncPolicy: |
| 126 | +        automated: { prune: true, selfHeal: true } |
| 127 | +        syncOptions: [CreateNamespace=true, ServerSideApply=true] |
| 128 | +``` |
| 129 | + |
| 130 | +The AppSet renders one **parent** Argo `Application` per labelled PR. The parent's source is the `apps/homerun2/install` chart in the `stuttgart-things/argocd` catalog. The chart emits **child** Applications (one per enabled component: omni-pitcher, redis-stack) on the homerun2-dev cluster. |
| 131 | + |
| 132 | +`destination.name: homerun2-dev` (not a URL) means the chart targets the workload cluster by its registered Argo cluster name, so IP / DNS changes don't break manifests. |
| 133 | + |
| 134 | +## The five cluster overlay manifests |
| 135 | + |
| 136 | +Sit alongside the AppSet in `…/argocd/homerun2-dev/`: |
| 137 | + |
| 138 | +| File | What it does | |
| 139 | +|--|--| |
| 140 | +| `omni-pitcher-pr-preview-appset.yaml` | The ApplicationSet above | |
| 141 | +| `homerun2-omni-pitcher-preview-quota.yaml` | Kyverno `ClusterPolicy` → generates `ResourceQuota` + `LimitRange` in each PR namespace | |
| 142 | +| `homerun2-omni-pitcher-preview-secrets.yaml` | Kyverno `ClusterPolicy` → generates 3 `ExternalSecret`s; ESO pulls from Vault `homerun2-pr/data/preview-env` | |
| 143 | +| `homerun2-omni-pitcher-preview-seed-data.yaml` | Kyverno `ClusterPolicy` → generates the one-shot seed Job | |
| 144 | +| `homerun2-omni-pitcher-preview-sweep.yaml` | Kyverno `ClusterCleanupPolicy` → cron-reaps empty PR namespace shells | |
| 145 | + |
| 146 | +These are deployed *once per cluster*. Per PR, the three generate policies fire automatically when the preview namespace is created; the cleanup policy runs on its own schedule. |
| 147 | + |
| 148 | +## HTTPRoute: Option B (inline in the kustomize OCI) |
| 149 | + |
| 150 | +The HTTPRoute exposing omni-pitcher externally is rendered by `kcl/httproute.k` and ships **inside the kustomize OCI**, alongside the Service. They land in the same kustomize apply, eliminating the cross-Application race that previously let Cilium's gateway controller stamp a sticky `BackendNotFound` (tracked under [stuttgart-things/argocd#116](https://github.com/stuttgart-things/argocd/issues/116)). This repo was the first to ship Option B; the chart-side helper + flag landed in [stuttgart-things/argocd#117](https://github.com/stuttgart-things/argocd/pull/117) and [#119](https://github.com/stuttgart-things/argocd/pull/119). Three places have to agree: |
| 151 | + |
| 152 | +| Repo | Setting | |
| 153 | +|--|--| |
| 154 | +| `homerun2-omni-pitcher` (this repo) | `tests/kcl-deploy-profile.yaml` → `config.httpRouteEnabled: true` | |
| 155 | +| `stuttgart-things/argocd` | `apps/homerun2/install` → `omniPitcher.inlineHttpRoute` flag patches the rendered HTTPRoute's parentRef + hostname per env, and excludes omni-pitcher from the standalone httproute Application | |
| 156 | +| `stuttgart-things/stuttgart-things` | Set `omniPitcher.inlineHttpRoute: true` in the AppSet's `valuesObject` | |
| 157 | + |
| 158 | +With all three set, `HTTPRoute/homerun2-omni-pitcher` lands `ResolvedRefs: True` on first reconcile. No manual `kubectl annotate httproute reconcile-bump=$(date +%s) --overwrite` required. |
| 159 | + |
| 160 | +Admission-defaulted fields (`parentRefs.group`/`kind`, `backendRefs.group`/`kind`/`weight`) are rendered explicitly by `kcl/httproute.k` so the chart-rendered shape matches what Cilium writes back — no perpetual `OutOfSync` from defaulting drift. |
| 161 | + |
| 162 | +## Lifecycle |
| 163 | + |
| 164 | +| Event | Result | |
| 165 | +|--|--| |
| 166 | +| PR opened with `preview` label | Sticky bot comment posted; CI builds image + kustomize OCI; AppSet picks it up within 600s; namespace + workloads spin up | |
| 167 | +| PR updated (new commit) | Image + kustomize OCI rebuilt with new `<sha>`; AppSet detects the head-SHA change; rolling update of Deployments | |
| 168 | +| PR `preview` label removed | AppSet drops the entry; finalizer prune cascades teardown | |
| 169 | +| PR closed (merged or rejected) | AppSet drops the entry → teardown; `cleanup-pr-artifacts.yaml` deletes ghcr.io packages | |
| 170 | + |
| 171 | +The `resources-finalizer.argocd.argoproj.io` finalizer on the parent Application is critical — without it, Argo would delete the parent instantly when the AppSet drops it, orphaning child Apps + workload pods. With it, Argo runs prune on every managed resource first. |
| 172 | + |
| 173 | +## Troubleshooting |
| 174 | + |
| 175 | +| Symptom | Likely cause | Fix | |
| 176 | +|--|--|--| |
| 177 | +| No bot comment, no namespace | `preview` label missing | `gh pr edit <num> --add-label preview` | |
| 178 | +| Bot comment present, namespace never appears | AppSet hasn't polled yet | Wait up to 10 min, or force a re-poll with `kubectl -n argocd annotate appset homerun2-omni-pitcher-pr-preview argocd.argoproj.io/application-set-refresh=true` | |
| 179 | +| Parent Application sync error: `failed to load: oci pull` | Image / kustomize OCI build still running or failed | Check the PR's Actions tab — `build-pr` and `push-kustomize` must both be green | |
| 180 | +| Pods stuck `ImagePullBackOff` | ghcr.io tag not yet pushed (CI still running) or PR closed (cleanup workflow already ran) | Wait for build / reopen the PR | |
| 181 | +| Pods CrashLoopBackOff with `WRONGPASS` | ESO hasn't materialized `redis-stack-auth` Secret yet | Check `kubectl -n homerun2-omni-pitcher-pr-<num> get externalsecret`; refresh if not Ready | |
| 182 | +| HTTPRoute `ResolvedRefs: False` | Service didn't land before HTTPRoute (pre-Option-B environments only) | Should not happen now; if it does: `kubectl annotate httproute homerun2-omni-pitcher reconcile-bump=$(date +%s) --overwrite -n homerun2-omni-pitcher-pr-<num>` and file an issue | |
| 183 | +| `POST /pitch` returns 401 | `AUTH_TOKEN` env not set on the Deployment or `homerun2-omni-pitcher-token` Secret not materialized | Check the Deployment env + the per-namespace ExternalSecret status | |
| 184 | +| `POST /pitch` returns 500 with `WRONGPASS` | omni-pitcher started before redis-stack was ready, retried, gave up | Should be smoothed by the bounded 30s startup retry; if seen, restart the pod | |
| 185 | +| Seed Job ran but only 4 events posted | Known shell-script JSON-splitting bug — last event drops | Tracked as a follow-up in [stuttgart-things/homerun2-omni-pitcher#116](https://github.com/stuttgart-things/homerun2-omni-pitcher/issues/116) | |
| 186 | +| Namespace stuck Terminating after PR close | A finalizer on a custom resource instance is blocking deletion | `kubectl get all,externalsecret -n homerun2-omni-pitcher-pr-<num>` to find the blocker | |
| 187 | + |
| 188 | +## See also |
| 189 | + |
| 190 | +- [stuttgart-things/argocd `apps/homerun2`](https://github.com/stuttgart-things/argocd/tree/main/apps/homerun2) — the install chart + Kyverno policy charts the AppSet consumes |
| 191 | +- [stuttgart-things/homerun2-omni-pitcher#116](https://github.com/stuttgart-things/homerun2-omni-pitcher/issues/116) — the umbrella rollout issue tracking all 8 components |
| 192 | +- [stuttgart-things/argocd#116](https://github.com/stuttgart-things/argocd/issues/116) — the HTTPRoute creation-order race writeup that motivated Option B |
| 193 | +- [stuttgart-things/github-workflow-templates](https://github.com/stuttgart-things/github-workflow-templates) — the four reusable PR-preview workflows this repo delegates to |