hub: optional startupProbe for slow-startup migration windows #44
Open
brandonSc wants to merge 2 commits into
8f8e0c3 to 412afda
Adds an optional hub.startupProbe (disabled by default to preserve existing behaviour). When enabled, kubelet gives Hub the full failureThreshold × periodSeconds window to become Ready before the liveness and readiness probes start counting failures. Default tuning when enabled is 30 × 5s = 150s. Useful on a cold start where Hub takes longer than the existing liveness probe's tolerance (default 15s, periodSeconds=5 × failureThreshold=3) — for example when Hub has to run a substantial Postgres schema migration as part of an upgrade. Without this probe, large migrations risk being interrupted mid-flight by kubelet restarting the pod. Backward compatible: existing installs see no change unless they explicitly set hub.startupProbe.enabled=true.
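For illustration, opting in from a values override might look like the sketch below. Only hub.startupProbe.enabled and the 30 × 5s default come from this PR; the failureThreshold and periodSeconds key names under hub.startupProbe are assumed to mirror the Kubernetes probe fields, and the file name is hypothetical, so check the chart's values.yaml for the actual keys.

```yaml
# values-override.yaml (hypothetical file name)
hub:
  startupProbe:
    enabled: true          # default is false, which leaves existing installs unchanged
    # Assumed key names, mirroring the Kubernetes probe fields.
    failureThreshold: 30   # consecutive failures tolerated during startup
    periodSeconds: 5       # seconds between probe attempts
    # Startup window: 30 x 5s = 150s, versus 3 x 5s = 15s for the liveness probe.
```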
412afda to 9706430
Summary
Adds an optional hub.startupProbe (disabled by default to preserve existing behaviour). When enabled, kubelet gives Hub the full failureThreshold × periodSeconds window to become Ready before liveness/readiness probes start counting failures.

Default tuning when enabled: 30 × 5s = 150s startup window.

Why
On 2026-05-26 the earthly-internal dogfood cluster was rolled from Hub 2.0.0 → e845ff9d. The new Hub takes ~52s to become Ready (Postgres schema migration v175 → v182 + the new sqlapi_user role creation). The existing chart probes are configured with initialDelaySeconds=0, periodSeconds=5, failureThreshold=3 (a 15s tolerance window), so kubelet killed the pod mid-migration and it went into a crashloop.

The workaround was an out-of-band kubectl patch deploy/lunar-hub adding a startupProbe imperatively. That's incompatible with a future helm upgrade, which would wipe the patch.

Behaviour
- hub.startupProbe.enabled: false (default): no startupProbe: block is rendered.
- hub.startupProbe.enabled: true: renders a startupProbe: block with a /health HTTP probe.

While the startupProbe is running, kubelet holds off the liveness and readiness probes, so they cannot fail the pod. Once the startupProbe succeeds for the first time, control hands back to liveness/readiness for ongoing health checks (no double-coverage).
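As a rough sketch, the Hub container spec rendered with the probe enabled could contain a block like the following. The /health path and the 30 × 5s tuning come from this PR; the port name http is a placeholder assumption, since the PR text does not state which port the probe targets.

```yaml
# Fragment of the Hub container spec as it might render with
# hub.startupProbe.enabled=true.
startupProbe:
  httpGet:
    path: /health
    port: http           # assumed port name; not stated in the PR
  periodSeconds: 5
  failureThreshold: 30   # 30 x 5s = 150s before kubelet restarts the container
```

With default values, no such block is rendered at all, which is what keeps existing installs unchanged.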
Test plan
- helm template with hub.startupProbe.enabled=true renders the full block (verified locally)
- helm template with default values renders zero startupProbe: references (verified locally; grep -c startupProbe returns 0)
- Roll out to the earthly-internal cluster (replaces the kubectl-patched stopgap with chart-managed config) once this lands and the chart version bumps

Notes
- Follows the hub.livenessProbe/hub.readinessProbe shape exactly for consistency.

This PR was drafted by AI.