
Commit 7e27885

Merge pull request #129 from stuttgart-things/docs/preview-env-docs
docs(cicd): document PR-preview env
2 parents 0a3591f + f3ee5e5 commit 7e27885

3 files changed

Lines changed: 211 additions & 1 deletion


docs/cicd.md

Lines changed: 17 additions & 1 deletion
@@ -2,13 +2,29 @@
22

33
## GitHub Actions Workflows
44

5+
### Core
6+
57
| Workflow | Trigger | Description |
68
|----------|---------|-------------|
79
| `build-test.yaml` | PR / push to main | Dagger lint + build + test |
8-
| `build-scan-image.yaml` | Push to main | ko build + Trivy scan |
10+
| `build-scan-image.yaml` | PR / push to main | ko build + Trivy scan; PR job tags `pr-<num>-<sha>` for preview envs, main job tags `:main` |
911
| `release.yaml` | After image build / manual | Semantic release + stage image + push kustomize OCI |
12+
| `pages.yaml` | After release / manual | Deploy MkDocs to GitHub Pages |
1013
| `lint-repo.yaml` | PR / push to main | Repository linting |
1114

15+
### PR-preview env
16+
17+
These four together drive the per-PR ephemeral preview environment on `homerun2-dev` for PRs carrying the `preview` label.
18+
19+
| Workflow | Trigger | Description |
20+
|----------|---------|-------------|
21+
| `build-scan-image.yaml` (PR job) | PR opened/updated | ko image tagged `pr-<num>-<sha>` + `pr-<num>` consumed by the per-PR ArgoCD Application |
22+
| `push-kustomize-pr.yaml` | PR opened/updated | Kustomize OCI tagged `pr-<num>-<sha>` (renders `kcl/main.k` against `tests/kcl-deploy-profile.yaml`) |
23+
| `comment-preview-url.yaml` | PR opened/reopened | Sticky bot comment with the preview URL, namespace, and ArgoCD link |
24+
| `cleanup-pr-artifacts.yaml` | PR closed | Deletes both ghcr.io packages so version histories don't fill with PR debris |
25+
26+
See [Preview Environments](preview-environments.md) for the full flow, AppSet anatomy, and troubleshooting.
27+
1228
## Dagger Functions
1329

1430
The `dagger/` module provides:

docs/preview-environments.md

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
1+
# Preview Environments
2+
3+
Any pull request against `main` that carries the `preview` label can spin up an ephemeral, fully deployed instance of omni-pitcher on the `homerun2-dev` Kubernetes cluster, alongside redis-stack, so reviewers can `curl /pitch` and watch events land in the stream. The environment lives for as long as the PR is open and tears down automatically on merge or close.
4+
5+
omni-pitcher was the pilot for the homerun2 PR-preview rollout; this is the original shape that core-catcher and scout now mirror.
6+
7+
This page covers how to use it, what each PR gets, the components that make it work, and how to troubleshoot.
8+
9+
## Quick start
10+
11+
1. Open a PR against `main`.
12+
2. Add the `preview` label: `gh pr edit <num> --add-label preview`.
13+
3. Wait 5–10 minutes for the image build, the kustomize-OCI push, and Argo's PullRequest generator poll (every 600s).
14+
4. The preview-bot leaves a sticky comment on the PR with the URL.
15+
16+
Closing or merging the PR tears the namespace down automatically.
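
The quick-start steps can be scripted. The following is a minimal sketch, not tooling shipped with this repo: `retry_until` and `wait_for_preview` are hypothetical helper names, and it assumes `gh` is authenticated for this repository and the current `kubectl` context points at `homerun2-dev`. The 900-second timeout is an assumed budget for the build plus one 600s generator poll.

```shell
# retry_until TIMEOUT INTERVAL CMD...
# Run CMD every INTERVAL seconds until it succeeds or TIMEOUT seconds elapse.
# Returns 0 on success, 1 on timeout.
retry_until() {
  _timeout=$1
  _interval=$2
  shift 2
  _deadline=$(( $(date +%s) + _timeout ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$_deadline" ]; then
      return 1
    fi
    sleep "$_interval"
  done
}

# wait_for_preview PR_NUMBER
# Apply the preview label, then wait for the per-PR namespace to exist.
wait_for_preview() {
  gh pr edit "$1" --add-label preview
  # `kubectl get namespace` exits non-zero until the namespace is created.
  retry_until 900 30 kubectl get namespace "homerun2-omni-pitcher-pr-$1"
}
```

Calling `wait_for_preview <num>` returns non-zero if the namespace has not appeared within the timeout, which is a reasonable point to start checking the Troubleshooting table.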
17+
18+
## What you get per PR
19+
20+
Each preview lives in its own namespace: `homerun2-omni-pitcher-pr-<num>` on `homerun2-dev`. The namespace contains:
21+
22+
| Workload | Purpose |
|--|--|
| `homerun2-omni-pitcher` | The system under test (this PR's commit) |
| `redis-stack` | The bus omni-pitcher writes into; persistence disabled (ephemeral) |
| `seed-test-events` (one-shot Job) | Posts a 5-event fixture to omni-pitcher right after the Deployment becomes Ready, so the stream is non-empty on first inspection |
27+
28+
omni-pitcher is the system under test (SUT) and has no co-tenants: unlike the scout and core-catcher previews, there is no upstream or downstream component to pair with. Reviewers exercise it directly via `curl`.
29+
30+
Reachable at: `https://omni-pr-<num>.homerun2-dev.sthings-vsphere.labul.sva.de`
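
Every per-PR name is derived from the PR number, plus the head commit SHA for artifact tags. As a small illustration, the helpers below rebuild those names; the function names are hypothetical, while the patterns are copied from this page.

```shell
# Namespace that holds the preview workloads for a PR.
preview_namespace() {
  printf 'homerun2-omni-pitcher-pr-%s\n' "$1"
}

# External URL served by the preview's HTTPRoute.
preview_url() {
  printf 'https://omni-pr-%s.homerun2-dev.sthings-vsphere.labul.sva.de\n' "$1"
}

# Artifact tag shared by the ko image and the kustomize OCI: pr-<num>-<sha>.
# The ko image additionally carries a plain pr-<num> tag.
preview_tag() {
  printf 'pr-%s-%s\n' "$1" "$2"
}
```

For example, `preview_namespace 129` prints `homerun2-omni-pitcher-pr-129`.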
31+
32+
## Why the `preview` label gate
33+
34+
Without the label, every renovate / dependabot dep-bump PR would spawn a namespace. Two problems:
35+
36+
- Branches that predate the build-pr workflow have no `pr-<num>-<sha>` image or kustomize artifact published, so their namespaces would come up half-empty with sync errors.
37+
- Bots open dozens of PRs per week; the preview infrastructure isn't built for that scale.
38+
39+
Human-opened PRs opt in via the label. Bots don't apply it, so they're excluded by default. The Argo AppSet's PullRequest generator filters on `labels: [preview]`.
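
To confirm that a PR passes the gate before waiting on the AppSet, its labels can be read with `gh`. This is a sketch under the assumption that `gh` is authenticated and run from a checkout of this repo; both function names are hypothetical.

```shell
# has_preview_label
# Read label names (one per line) on stdin; succeed only if one is exactly "preview".
has_preview_label() {
  grep -qx 'preview'
}

# pr_is_previewed PR_NUMBER
# Fetch the PR's label names from GitHub and apply the check above.
pr_is_previewed() {
  gh pr view "$1" --json labels --jq '.labels[].name' | has_preview_label
}
```

The match is on the whole line, so a label such as `preview-old` does not count.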
40+
41+
## The flow, end to end
42+
43+
```
git push (PR opens)
  ├─► comment-preview-url.yaml  ─► sticky bot comment with URL
  ├─► build-scan-image.yaml     ─► ko-built image at ghcr.io/.../homerun2-omni-pitcher:pr-<num>-<sha>
  ├─► push-kustomize-pr.yaml    ─► kustomize OCI at ghcr.io/.../homerun2-omni-pitcher-kustomize:pr-<num>-<sha>
  └─► build-test.yaml + lint    ─► CI gates

Argo PullRequest generator (poll every 600s)
  └─► detects PR with `preview` label
        └─► renders parent Application `homerun2-omni-pitcher-pr-<num>` in argocd ns
              └─► chart emits child Applications targeting `homerun2-omni-pitcher-pr-<num>` ns
                  on the homerun2-dev cluster

Kyverno ClusterPolicies (auto-fire on namespace create)
  ├─► generate ResourceQuota + LimitRange
  ├─► generate 3 ExternalSecrets → ESO materializes Secrets from Vault
  └─► generate one-shot seed Job (posts fixture after Deployment Ready)

PR close
  ├─► AppSet drops the entry → finalizer cascade prunes child Apps + workloads
  ├─► cleanup-pr-artifacts.yaml deletes both ghcr.io packages
  └─► Kyverno ClusterCleanupPolicy reaps any empty namespace shell left behind
```
66+
67+
## The four PR-preview workflows in this repo
68+
69+
All four are in `.github/workflows/` and trigger on `pull_request` events targeting `main`.
70+
71+
| Workflow | Trigger | Output |
|--|--|--|
| `build-scan-image.yaml` | PR opened/updated | ko-built image tagged `pr-<num>-<sha>` + `pr-<num>` |
| `push-kustomize-pr.yaml` | PR opened/updated | kustomize OCI tagged `pr-<num>-<sha>` (renders `kcl/main.k` against `tests/kcl-deploy-profile.yaml`) |
| `comment-preview-url.yaml` | PR opened/reopened | Sticky comment with URL, namespace, ArgoCD link. Thin caller of `stuttgart-things/github-workflow-templates/.github/workflows/call-comment-preview-url.yaml` |
| `cleanup-pr-artifacts.yaml` | PR closed | Deletes both ghcr.io packages so version histories don't fill with PR debris |
77+
78+
All four delegate to reusable workflows in `stuttgart-things/github-workflow-templates`.
79+
80+
## The Argo AppSet, briefly
81+
82+
Lives at `stuttgart-things/stuttgart-things` under `clusters/labul/vsphere/platform-sthings/argocd/homerun2-dev/omni-pitcher-pr-preview-appset.yaml`. The shape:
83+
84+
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: homerun2-omni-pitcher-pr-preview
  namespace: argocd
spec:
  generators:
    - pullRequest:
        github:
          owner: stuttgart-things
          repo: homerun2-omni-pitcher
          tokenRef: { secretName: homerun2-omni-pitcher-pat, key: token }
          labels: [preview]           # the gate
        requeueAfterSeconds: 600      # poll cadence
  template:
    metadata:
      name: 'homerun2-omni-pitcher-pr-{{ .number }}'
      finalizers: [resources-finalizer.argocd.argoproj.io]   # cascade on prune
    spec:
      source:
        repoURL: https://github.com/stuttgart-things/argocd.git
        path: apps/homerun2/install
        helm:
          valuesObject:
            destination:
              name: homerun2-dev
              namespace: 'homerun2-omni-pitcher-pr-{{ .number }}'
            omniPitcher:
              enabled: true
              version: 'pr-{{ .number }}-{{ .head_sha }}'
              hostname: 'omni-pr-{{ .number }}.homerun2-dev.sthings-vsphere.labul.sva.de'
              inlineHttpRoute: true   # Option B, see below
            redisStack:
              enabled: true
              persistence: { enabled: false }
              auth: { existingSecret: redis-stack-auth }
            # all other components off
            httpRoute:
              enabled: true
              gateway: { name: homerun2-dev-gateway, namespace: default }
      syncPolicy:
        automated: { prune: true, selfHeal: true }
        syncOptions: [CreateNamespace=true, ServerSideApply=true]
```
129+
130+
The AppSet renders one **parent** Argo `Application` per labelled PR. The parent's source is the `apps/homerun2/install` chart in the `stuttgart-things/argocd` catalog. The chart emits **child** Applications (one per enabled component: omni-pitcher, redis-stack) on the homerun2-dev cluster.
131+
132+
`destination.name: homerun2-dev` (not a URL) means the chart targets the workload cluster by its registered Argo cluster name, so IP / DNS changes don't break manifests.
133+
134+
## The five cluster overlay manifests
135+
136+
Sit alongside the AppSet in `…/argocd/homerun2-dev/`:
137+
138+
| File | What it does |
|--|--|
| `omni-pitcher-pr-preview-appset.yaml` | The ApplicationSet above |
| `homerun2-omni-pitcher-preview-quota.yaml` | Kyverno `ClusterPolicy` → generates `ResourceQuota` + `LimitRange` in each PR namespace |
| `homerun2-omni-pitcher-preview-secrets.yaml` | Kyverno `ClusterPolicy` → generates 3 `ExternalSecret`s; ESO pulls from Vault `homerun2-pr/data/preview-env` |
| `homerun2-omni-pitcher-preview-seed-data.yaml` | Kyverno `ClusterPolicy` → generates the one-shot seed Job |
| `homerun2-omni-pitcher-preview-sweep.yaml` | Kyverno `ClusterCleanupPolicy` → cron-reaps empty PR namespace shells |
145+
146+
These are deployed *once per cluster*. Per-PR, they fire automatically when the AppSet creates the namespace.
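
To check that the generate policies fired for a given PR, the generated objects can be listed in its namespace. This is a sketch that assumes the current `kubectl` context points at `homerun2-dev`; `list_generated` is a hypothetical helper name.

```shell
# list_generated PR_NUMBER
# Show the objects the Kyverno policies are expected to generate in the PR
# namespace: quota and limits, the ExternalSecrets, and the one-shot seed Job.
list_generated() {
  kubectl -n "homerun2-omni-pitcher-pr-$1" \
    get resourcequota,limitrange,externalsecret,job
}
```

An empty result for one of the kinds points at the corresponding policy file in the table above.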
147+
148+
## HTTPRoute: Option B (inline in the kustomize OCI)
149+
150+
The HTTPRoute that exposes omni-pitcher externally is rendered by `kcl/httproute.k` and ships **inside the kustomize OCI**, alongside the Service. Both land in the same kustomize apply, which eliminates the cross-Application race that previously let Cilium's gateway controller stamp a sticky `BackendNotFound` (tracked under [stuttgart-things/argocd#116](https://github.com/stuttgart-things/argocd/issues/116)). This repo was the first to ship Option B; the chart-side helper and flag landed in [stuttgart-things/argocd#117](https://github.com/stuttgart-things/argocd/pull/117) and [#119](https://github.com/stuttgart-things/argocd/pull/119). Three places have to agree:
151+
152+
| Repo | Setting |
|--|--|
| `homerun2-omni-pitcher` (this repo) | `tests/kcl-deploy-profile.yaml` → `config.httpRouteEnabled: true` |
| `stuttgart-things/argocd` | `apps/homerun2/install` → `omniPitcher.inlineHttpRoute` flag patches the rendered HTTPRoute's parentRef + hostname per env, and excludes omni-pitcher from the standalone httproute Application |
| `stuttgart-things/stuttgart-things` | Set `omniPitcher.inlineHttpRoute: true` in the AppSet's `valuesObject` |
157+
158+
With all three set, `HTTPRoute/homerun2-omni-pitcher` lands `ResolvedRefs: True` on first reconcile. No manual `kubectl annotate httproute reconcile-bump=$(date +%s) --overwrite` required.
159+
160+
Admission-defaulted fields (`parentRefs.group`/`kind`, `backendRefs.group`/`kind`/`weight`) are rendered explicitly by `kcl/httproute.k` so the chart-rendered shape matches what Cilium writes back — no perpetual `OutOfSync` from defaulting drift.
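
The `ResolvedRefs` condition can be read from the route's status instead of being eyeballed. The sketch below assumes the current `kubectl` context points at `homerun2-dev`; `all_true` and `route_resolved` are hypothetical helper names. The JSONPath follows the Gateway API status layout, where each entry under `.status.parents` carries its own conditions.

```shell
# all_true
# Read whitespace-separated words on stdin; succeed only if there is at least
# one word and every word is exactly "True".
all_true() {
  _seen=0
  for _status in $(cat); do
    [ "$_status" = "True" ] || return 1
    _seen=1
  done
  [ "$_seen" -eq 1 ]
}

# route_resolved PR_NUMBER
# Succeed if every parent of the preview HTTPRoute reports ResolvedRefs=True.
route_resolved() {
  kubectl -n "homerun2-omni-pitcher-pr-$1" get httproute homerun2-omni-pitcher \
    -o jsonpath='{.status.parents[*].conditions[?(@.type=="ResolvedRefs")].status}' \
    | all_true
}
```

An empty status, for example before the gateway controller has reconciled the route, is treated as not resolved.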
161+
162+
## Lifecycle
163+
164+
| Event | Result |
|--|--|
| PR opened with `preview` label | Sticky bot comment posted; CI builds image + kustomize OCI; AppSet picks it up within 600s; namespace + workloads spin up |
| PR updated (new commit) | Image + kustomize OCI rebuilt with new `<sha>`; AppSet detects the head-SHA change; rolling update of Deployments |
| PR `preview` label removed | AppSet drops the entry; finalizer prune cascades teardown |
| PR closed (merged or rejected) | AppSet drops the entry → teardown; `cleanup-pr-artifacts.yaml` deletes ghcr.io packages |
170+
171+
The `resources-finalizer.argocd.argoproj.io` finalizer on the parent Application is critical — without it, Argo would delete the parent instantly when the AppSet drops it, orphaning child Apps + workload pods. With it, Argo runs prune on every managed resource first.
172+
173+
## Troubleshooting
174+
175+
| Symptom | Likely cause | Fix |
|--|--|--|
| No bot comment, no namespace | `preview` label missing | `gh pr edit <num> --add-label preview` |
| Bot comment present, namespace never appears | AppSet hasn't polled yet | Wait up to 10 min, or `kubectl -n argocd annotate appset homerun2-omni-pitcher-pr-preview argocd.argoproj.io/refresh=hard` |
| Parent Application sync error: `failed to load: oci pull` | Image / kustomize OCI build still running or failed | Check the PR's Actions tab: `build-pr` and `push-kustomize` must both be green |
| Pods stuck `ImagePullBackOff` | ghcr.io tag not yet pushed (CI still running) or PR closed (cleanup workflow already ran) | Wait for build / reopen the PR |
| Pods CrashLoopBackOff with `WRONGPASS` | ESO hasn't materialized `redis-stack-auth` Secret yet | Check `kubectl -n homerun2-omni-pitcher-pr-<num> get externalsecret`; refresh if not Ready |
| HTTPRoute `ResolvedRefs: False` | Service didn't land before HTTPRoute (pre-Option-B environments only) | Should not happen now; if it does: `kubectl annotate httproute homerun2-omni-pitcher reconcile-bump=$(date +%s) --overwrite -n homerun2-omni-pitcher-pr-<num>` and file an issue |
| `POST /pitch` returns 401 | `AUTH_TOKEN` env not set on the Deployment or `homerun2-omni-pitcher-token` Secret not materialized | Check the Deployment env + the per-namespace ExternalSecret status |
| `POST /pitch` returns 500 with `WRONGPASS` | omni-pitcher started before redis-stack was ready, retried, gave up | Should be smoothed by the bounded 30s startup retry; if seen, restart the pod |
| Seed Job ran but only 4 events posted | Known shell-script JSON-splitting bug: the last event is dropped | Tracked as a follow-up in [stuttgart-things/homerun2-omni-pitcher#116](https://github.com/stuttgart-things/homerun2-omni-pitcher/issues/116) |
| Namespace stuck Terminating after PR close | Finalizer on a CRD instance | `kubectl get all,externalsecret -n homerun2-omni-pitcher-pr-<num>` to find the blocker |
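
For the stuck-Terminating case, `kubectl get all` does not include most custom resources, so a blocker of a kind other than `externalsecret` can be missed. A broader sweep, sketched here with a hypothetical helper name and assuming the current `kubectl` context points at `homerun2-dev`, asks the API server for every listable namespaced kind:

```shell
# list_remaining NAMESPACE
# List every object still present in the namespace, across all namespaced
# resource kinds the API server can list (including CRD instances).
list_remaining() {
  kubectl api-resources --verbs=list --namespaced -o name \
    | xargs -n 1 kubectl get --show-kind --ignore-not-found -n "$1"
}
```

Whatever this prints for `homerun2-omni-pitcher-pr-<num>` is what still holds the namespace open; inspect its `metadata.finalizers` next.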
187+
188+
## See also
189+
190+
- [stuttgart-things/argocd `apps/homerun2`](https://github.com/stuttgart-things/argocd/tree/main/apps/homerun2) — the install chart + Kyverno policy charts the AppSet consumes
191+
- [stuttgart-things/homerun2-omni-pitcher#116](https://github.com/stuttgart-things/homerun2-omni-pitcher/issues/116) — the umbrella rollout issue tracking all 8 components
192+
- [stuttgart-things/argocd#116](https://github.com/stuttgart-things/argocd/issues/116) — the HTTPRoute creation-order race writeup that motivated Option B
193+
- [stuttgart-things/github-workflow-templates](https://github.com/stuttgart-things/github-workflow-templates) — the four reusable PR-preview workflows this repo delegates to

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ nav:
66
- API Usage: api-usage.md
77
- Deployment: deployment.md
88
- CI/CD: cicd.md
9+
- Preview Environments: preview-environments.md
910

1011
plugins:
1112
- techdocs-core
