ci: Improve Couchbase K8s resilience #3527

Merged
tippmar-nr merged 2 commits into main from ci/couchbase-resilience
Apr 6, 2026

tippmar-nr commented Apr 6, 2026

Summary

  • Reset Couchbase admin password on every container start (not just first run) to prevent auth desync after restarts
  • Add liveness probe to auto-restart pods when Couchbase enters a degraded state
  • Add readiness probe to prevent tests from hitting the service before it's fully initialized
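
For illustration, a probe of this kind could be an exec command that makes an authenticated request to the local Couchbase REST API. The PR text does not show the actual probe commands, so this is only a sketch: the function name `couchbase_probe`, the `Administrator` username, and the choice of the `/pools/default` endpoint on port 8091 are assumptions.

```shell
#!/bin/sh
# Hypothetical probe helper; not the command used in this PR.
# Assumption: the admin user is "Administrator" and the password is
# injected into the container as COUCHBASE_ADMINISTRATOR_PASSWORD.
COUCHBASE_URL="${COUCHBASE_URL:-http://localhost:8091/pools/default}"
COUCHBASE_USER="${COUCHBASE_USER:-Administrator}"

couchbase_probe() {
  # --fail makes curl exit non-zero on HTTP errors (for example 401),
  # so an auth desync shows up as a failed probe.
  curl --silent --fail --max-time 5 --output /dev/null \
    --user "$COUCHBASE_USER:$COUCHBASE_ADMINISTRATOR_PASSWORD" \
    "$COUCHBASE_URL"
}
```

Used as a liveness check, a non-zero exit would eventually restart the pod; used as a readiness check, it would keep the pod out of the Service until the request succeeds.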

Addresses recurring Couchbase AuthenticationFailureException failures in unbounded tests (e.g. run 23430397409).

Successful unbounded test run after deploying these updates: https://github.com/newrelic/newrelic-dotnet-agent/actions/runs/24048892427/job/70140748602

Test plan

  • Deploy to K8s via unbounded services workflow
  • Verify Couchbase pod starts cleanly and probes pass
  • Run Couchbase unbounded tests against the updated service

…reset on restart

Move admin password reset outside the one-time guard file check so it
runs on every container start, preventing auth failures when the sandbox
image reinitializes after a restart. Add liveness and readiness probes
to auto-detect and recover from degraded Couchbase state.
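
The change described above can be sketched roughly as follows. This is not the script from the PR: `GUARD_FILE`, its path, `run_one_time_setup`, `reset_admin_password`, and `configure_couchbase` are hypothetical names, and the sketch assumes the reset is done with `couchbase-cli reset-admin-password`.

```shell
#!/bin/sh
# Hypothetical startup helper; all names and paths here are illustrative.
GUARD_FILE="${GUARD_FILE:-/opt/couchbase/var/.setup-complete}"

run_one_time_setup() {
  # Placeholder for first-run work (cluster init, bucket creation, ...).
  echo "running one-time setup"
}

reset_admin_password() {
  # Assumption: the password arrives via COUCHBASE_ADMINISTRATOR_PASSWORD.
  couchbase-cli reset-admin-password \
    --new-password "$COUCHBASE_ADMINISTRATOR_PASSWORD"
}

configure_couchbase() {
  # One-time work stays behind the guard file.
  if [ ! -f "$GUARD_FILE" ]; then
    run_one_time_setup && touch "$GUARD_FILE"
  fi
  # The reset sits outside the guard, so it runs on every container start.
  reset_admin_password
}
```

Calling `configure_couchbase` on each start keeps the admin credentials in line with the secret even when the guard file already exists.
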
tippmar-nr temporarily deployed to integration-test April 6, 2026 19:53 with GitHub Actions (Inactive)
envsubst was replacing $COUCHBASE_ADMINISTRATOR_PASSWORD in the probe
commands with an empty string, because that variable comes from a K8s
secret and is not set as a CI environment variable. Restrict envsubst to
substitute only the three variables used in Service annotations:
RESOURCE_GROUP, PUBLIC_IP_NAME, and PUBLIC_IP.
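
A minimal sketch of the restricted substitution, assuming GNU gettext's `envsubst`, which accepts a SHELL-FORMAT argument listing the variables it is allowed to replace. The function name `render_manifest` is hypothetical, and the workflow's actual invocation is not shown in this PR.

```shell
#!/bin/sh
# Hypothetical helper: render a manifest, substituting only the three
# variables named in the commit message. Any other $VAR reference, such as
# $COUCHBASE_ADMINISTRATOR_PASSWORD in a probe command, passes through
# unchanged for the container to resolve at runtime.
render_manifest() {
  envsubst '${RESOURCE_GROUP} ${PUBLIC_IP_NAME} ${PUBLIC_IP}' < "$1"
}
```

With no argument, `envsubst` replaces every variable reference it finds and substitutes an empty string for unset ones, which is the behavior the commit message describes.
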
tippmar-nr deployed to integration-test April 6, 2026 20:02 with GitHub Actions (Active)
tippmar-nr marked this pull request as ready for review April 6, 2026 20:26
tippmar-nr requested a review from a team as a code owner April 6, 2026 20:26
tippmar-nr merged commit b7a57c5 into main Apr 6, 2026
134 of 137 checks passed
tippmar-nr deleted the ci/couchbase-resilience branch April 6, 2026 20:41

2 participants