test(8.10): drive Bitnami→companion migration from a post-infra hook before upgrade #6343

Open
eamonnmoloney wants to merge 9 commits into post-infra-lifecycle-hook from post-infra-bitnami-migration-wiring

@eamonnmoloney

Contributor

Which problem does the PR fix?

Completes the Helm-CI side of the 8.9→8.10 Bitnami-removal upgrade: it makes the
nightly actually exercise the real Bitnami→external migration scripts, so we
verify a customer on bundled (Bitnami) Keycloak/PostgreSQL in 8.9 can reach 8.10
with their realm/users intact — instead of testing a synthetic "already external"
topology.

What's in this PR?

Wires the qa-elasticsearch-upg modular-upgrade-minor scenario to a
post-infra lifecycle hook (added in #6342):

  • pre-setup-scripts/post-infra-bitnami-migration.sh — clones
    camunda-deployment-references, creates the external-target secrets, and runs
    the migration scripts in external mode with SKIP_HELM_UPGRADE=true.
    It migrates the bundled Keycloak realm + Identity/WebModeler PostgreSQL data
    onto the companion services (postgresql / keycloak), then returns so the
    matrix runner performs the chart upgrade to 8.10 pointing at the companions.
  • ci-test-config.yaml — the post-infra: declaration on the scenario.

Sequence: install 8.9 + bundled Bitnami → deploy companions → post-infra hook
migrates data onto companions
→ runner upgrades to 8.10 → e2e verifies.
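
For reviewers, a minimal dry-run sketch of the hook's shape, not the actual `post-infra-bitnami-migration.sh`. The repository URL, the secret name, the namespace, the working directory, and the `migrate.sh` entry point are assumptions made for illustration; only `CAMUNDA_DEPLOYMENT_REFERENCES_REF`, its default, `SKIP_HELM_UPGRADE=true`, and `KEYCLOAK_TARGET_MODE=external` come from this PR.

```shell
# Pinned ref; the default is the feature branch named in this PR.
REF="${CAMUNDA_DEPLOYMENT_REFERENCES_REF:-feat/keycloak-external-migration-target}"
# Assumption: GitHub location of the references repo.
REPO_URL="https://github.com/camunda/camunda-deployment-references.git"
# Hypothetical values, for illustration only.
NAMESPACE="${NAMESPACE:-camunda}"
WORKDIR="${WORKDIR:-/tmp/deployment-references}"
DRY_RUN="${DRY_RUN:-1}"

# run prints the command in dry-run mode, otherwise executes it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

post_infra_migrate() {
  # 1. Clone the references repo at the pinned ref.
  run git clone --depth 1 --branch "$REF" "$REPO_URL" "$WORKDIR"
  # 2. Create an external-target secret (hypothetical name and key).
  run kubectl create secret generic external-keycloak-target \
    --namespace "$NAMESPACE" --from-literal=password=changeme
  # 3. Data-only migration: the matrix runner does the chart upgrade afterwards.
  #    migrate.sh is a hypothetical entry point, not a real script name.
  run env SKIP_HELM_UPGRADE=true KEYCLOAK_TARGET_MODE=external \
    sh "$WORKDIR/migrate.sh"
}
```

With `DRY_RUN=1` (the default here) `post_infra_migrate` only prints the three commands, which makes the ordering easy to check without a cluster.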

Status — DRAFT

  • go test ./matrix/... (incl. TestLifecycleFixtures, which validates the
    hook script reference + description + orphan check) passes; the script is
    bash -n clean and executable; the config parses.
  • Pinned to a branch: the references repo is pinned via
    CAMUNDA_DEPLOYMENT_REFERENCES_REF (default feat/keycloak-external-migration-target)
    until camunda-deployment-references#2620 ("feat(migration): add KEYCLOAK_TARGET_MODE=external
    for migrating Bitnami Keycloak to an external instance") merges; switch to a tag/SHA then.
  • Not yet GKE-validated. Open wiring items noted inline in the hook script:
    1. companion Keycloak (keycloak-qa) must read the realm from the companion
      PostgreSQL the migration restores into (align companion-values);
    2. ES→external reindex isn't automated by the scripts yet (MIGRATE_ELASTICSEARCH=false),
      so secondary-storage data continuity is tracked separately.

Related / stacking

Checklist

Before opening the PR:

  • In the repo's root dir, run make go.update-golden-only.
  • There is no other open pull request for the same update/change.
  • Tests for charts are added (if needed).
  • In-repo documentation is updated (if needed).

After opening the PR:

  • Did you sign our CLA (Contributor License Agreement)?
  • Did all checks/tests pass in the PR?

eamonnmoloney and others added 2 commits June 5, 2026 09:40
Add a `post-infra` declarative lifecycle hook that fires after a scenario's
companion charts (the external infrastructure: PostgreSQL, Elasticsearch,
Keycloak, …) are deployed and ready, but BEFORE the main Camunda chart is
installed/upgraded. It fills the gap between `pre-install` (before anything)
and `post-deploy` (after the chart): until now there was no hook point at which
to act on freshly provisioned external infrastructure.

The motivating use case: migrating data from a prior release's bundled
(Bitnami) backends onto the companion services before the chart switches over
to them, so the upgraded chart finds its realm/data on the external infra.

- types.Options + deployer.Deploy: run PostInfraHooks after the companion
  charts loop, before upgradeInstall
- config.RuntimeFlags.PostInfraHooks; wired through deploy.Execute
- matrix: CIScenario.PostInfra / Entry.PostInfra (yaml `post-infra`), carried
  to the Entry; registerDeclarativePostInfraHook mirrors the pre-install/
  post-deploy shims; registered on the install path, the upgrade-only path,
  and Step 2 of two-step upgrades
- TestLifecycleFixtures now validates `post-infra` script/fixture references

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…upgrade

Wire the qa-elasticsearch-upg modular-upgrade-minor scenario to run the real
camunda-deployment-references migration scripts from a post-infra lifecycle
hook. After the companion PostgreSQL/Keycloak are deployed but before the chart
upgrade to 8.10 (which removes the bundled Bitnami subcharts), the hook migrates
the bundled Keycloak realm + Identity/WebModeler PostgreSQL data onto the
companions, in external mode with SKIP_HELM_UPGRADE=true (data-only — the runner
performs the chart upgrade). This keeps users/realm alive across the upgrade.

The references repo is pinned to the feature branch
(CAMUNDA_DEPLOYMENT_REFERENCES_REF, default feat/keycloak-external-migration-target)
until camunda/camunda-deployment-references#2620 merges; switch the default to a
tag/SHA then.
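
A possible guard for the pinning concern, as a sketch: warn while the ref is still a movable name rather than a full commit SHA. `ref_kind` is a hypothetical helper, not something in the hook script.

```shell
# Classify a git ref: a full 40-character hex SHA is immutable;
# anything else (branch or tag name) can be moved.
ref_kind() {
  if printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "sha"
  else
    echo "name"
  fi
}

REF="${CAMUNDA_DEPLOYMENT_REFERENCES_REF:-feat/keycloak-external-migration-target}"
if [ "$(ref_kind "$REF")" != "sha" ]; then
  echo "warning: $REF is not a full commit SHA; the checkout can drift" >&2
fi
```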

DRAFT — pending #2620 merge and GKE validation. Open wiring items are noted
inline in the hook script (ES external reindex not yet automated; companion
Keycloak must read the realm from the companion PostgreSQL).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the version/8.10 Camunda applications/cycle version label Jun 5, 2026
eamonnmoloney and others added 5 commits June 5, 2026 11:08
On-cluster fixes from a live GKE run of the Bitnami->companion migration:
- read the Keycloak admin password from integration-test-credentials (the fixed
  CI secret name), not <release>-credentials
- target the companion Keycloak's own bundled PG (keycloak-postgresql) for the
  realm restore, with its keycloak-ci-password, so the restore lands where the
  companion Keycloak reads from
- source the references env.sh so the phase scripts get their defaults
  (CAMUNDA_HELM_CHART_VERSION etc.)
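
As a sketch of the first fix: reading one value from the fixed CI secret. The secret name comes from the commit message; the key name `password` is an assumption, and the function needs `kubectl` plus cluster access.

```shell
SECRET_NAME="integration-test-credentials"
# Hypothetical key name; the commit message names only the secret.
SECRET_KEY="${SECRET_KEY:-password}"

# Kubernetes stores secret values base64-encoded; decode from stdin.
decode_secret_value() {
  base64 --decode
}

# Fetch and decode one key of the secret. $1 = namespace.
keycloak_admin_password() {
  kubectl get secret "$SECRET_NAME" --namespace "$1" \
    --output "jsonpath={.data.$SECRET_KEY}" | decode_secret_value
}
```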

Validated end-to-end on GKE: the post-infra hook fires at the right point,
clones the references branch, creates the external-target secrets, and the
migration passes Phase 1 (external-target validation) and into Phase 2 backup.
…i layout

From live GKE inspection of the 8.9 bundled chart: only a Keycloak PG exists
(bitnami_keycloak/bn_keycloak); Identity has no PG (data in ES) and Web Modeler
reinitialises on boot. So migrate only the Keycloak realm:
- MIGRATE_IDENTITY=false, MIGRATE_WEBMODELER=false
- KEYCLOAK_SOURCE_DB_NAME/USER=bitnami_keycloak/bn_keycloak (source) ->
  keycloak/keycloak (companion target)
- restart the companion Keycloak after the restore so it loads the realm
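
A sketch of the resulting settings and the restart step. The flag values are from the bullets above, reading `KEYCLOAK_SOURCE_DB_NAME/USER` as the two variables `KEYCLOAK_SOURCE_DB_NAME` and `KEYCLOAK_SOURCE_DB_USER`; the workload kind/name and the namespace are assumptions.

```shell
# Migrate only the Keycloak realm.
export MIGRATE_IDENTITY=false
export MIGRATE_WEBMODELER=false
export KEYCLOAK_SOURCE_DB_NAME=bitnami_keycloak
export KEYCLOAK_SOURCE_DB_USER=bn_keycloak

# Hypothetical: workload and namespace of the companion Keycloak.
KEYCLOAK_WORKLOAD="${KEYCLOAK_WORKLOAD:-statefulset/keycloak}"
NAMESPACE="${NAMESPACE:-camunda}"

# Print the restart commands; pipe the output to sh to execute them.
restart_companion_keycloak_cmds() {
  echo "kubectl rollout restart $KEYCLOAK_WORKLOAD --namespace $NAMESPACE"
  echo "kubectl rollout status $KEYCLOAK_WORKLOAD --namespace $NAMESPACE --timeout=300s"
}
```

Waiting on `rollout status` after the restart keeps the runner from starting the chart upgrade before Keycloak has reloaded the restored realm.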
…hook

The 8.10 'no access to this component' failures are caused by the
camunda-authorization/user/role indices (which gate Operate/Tasklist access)
living in the bundled Bitnami ES and not surviving the swap to the companion ES.
Enable the ES migration in the hook: MIGRATE_ELASTICSEARCH=true +
ES_TARGET_MODE=external + ES_WARM_REINDEX=true, reindex (unprefixed) camunda-*/
operate-*/optimize-*/tasklist-* from the bundled ES to the companion, and add
reindex.remote.whitelist to the companion ES. Validated on GKE: 51
camunda-authorization docs land in the companion ES.
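
For context, a reindex-from-remote request against Elasticsearch's `_reindex` API looks roughly like the sketch below. It is not the migration scripts' implementation; the host names and the exact index name are placeholders.

```shell
# Build a _reindex body that pulls one index from a remote cluster.
# $1 = remote base URL, $2 = index name (kept the same on the destination).
reindex_body() {
  printf '{"source":{"remote":{"host":"%s"},"index":"%s"},"dest":{"index":"%s"}}\n' \
    "$1" "$2" "$2"
}

# Hypothetical hosts; the request goes to the destination (companion) cluster:
# reindex_body "http://bundled-es:9200" "camunda-authorization" |
#   curl -sS -X POST "http://companion-es:9200/_reindex" \
#     -H 'Content-Type: application/json' --data-binary @-
```

The destination cluster rejects the request unless the remote host is covered by its `reindex.remote.whitelist` setting, which is why the companion ES needs that setting.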
Exercise the 8.9->8.10 upgrade path on the tier-1 Elasticsearch/Keycloak
scenario in PR CI alongside the existing install flow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The eske upgrade-minor cell failed because the upgraded 8.10 WebModeler/Identity
are routed to the CNPG companion, but their databases never existed: Step 1 bootstraps
the cluster with the 8.9 fixture (no postInitSQL) and CNPG initdb runs only once,
so the 8.10 fixture's CREATE DATABASE never executes.

- 8.9 CNPG fixture: align with 8.10 — create the identity/webmodeler databases at
  bootstrap so an upgrade landing components on the cluster has them.
- eske scenario: wire the post-infra-bitnami-migration.sh hook so the bundled
  Bitnami Keycloak realm + ES authorization indices are migrated onto the
  companion backends before the upgrade removes the bundled subcharts — the
  customer-shaped Bitnami→external move.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the version/8.9 Camunda applications/cycle version label Jun 8, 2026
eamonnmoloney and others added 2 commits June 8, 2026 06:16
The eske upgrade-minor migration warm-reindexes Camunda indices from the bundled
Bitnami ES into the companion ES, but the companion rejected it
("[integration-elasticsearch...:9200] not whitelisted in reindex.remote.whitelist")
because the base companion-values/elasticsearch.yaml lacked the allow-list the
qa variant already sets. Add reindex.remote.whitelist: "*:9200" so the
Bitnami→companion ES migration can complete; no effect on install scenarios.
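
To see why `"*:9200"` is enough, the allow-list check can be approximated with shell glob matching. This is only an approximation of Elasticsearch's own matching, and the host names in the usage lines are placeholders.

```shell
# $1 = host:port, $2 = comma-separated patterns, e.g. "*:9200".
# Returns 0 when any pattern matches, 1 otherwise.
whitelist_allows() {
  old_ifs=$IFS
  IFS=,
  set -f                      # keep patterns from expanding as file names
  for pattern in $2; do
    case "$1" in
      $pattern) IFS=$old_ifs; set +f; return 0 ;;
    esac
  done
  IFS=$old_ifs
  set +f
  return 1
}

whitelist_allows "bundled-es.example:9200" "*:9200" && echo "allowed"
whitelist_allows "bundled-es.example:9300" "*:9200" || echo "rejected"
```

`"*:9200"` allows any host on port 9200, which is broad; for a CI companion that is probably acceptable, but a narrower pattern would be the safer choice elsewhere.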

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The post-infra migration hook ran for BOTH of eske's flows (install,upgrade-minor)
and failed the install cell: a fresh 8.10 install has no bundled-Bitnami source to
migrate. Lifecycle hooks aren't flow-scoped per scenario, so revert tier-1 `eske`
to install-only (pristine gate, no hook) and add `elasticsearch-upgrade` (eskeu,
upgrade-minor) carrying the hook — where post-infra fires only before Step 2.
prefix-key pins both steps' index prefixes/realm to `elasticsearch`. Mirrors the
existing qa-elasticsearch / qa-elasticsearch-upg split.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@eamonnmoloney eamonnmoloney marked this pull request as ready for review June 8, 2026 11:27
@eamonnmoloney eamonnmoloney requested a review from a team as a code owner June 8, 2026 11:27
@eamonnmoloney eamonnmoloney requested review from hisImminence and removed request for a team June 8, 2026 11:27
@eamonnmoloney eamonnmoloney force-pushed the post-infra-lifecycle-hook branch from af206b8 to d79f5c9 Compare June 10, 2026 05:06
Labels

version/8.9 Camunda applications/cycle version version/8.10 Camunda applications/cycle version

2 participants