Skip to content

ci: fix OCI release artifact deploys applying wrong image version overlay#6259

Closed
hisImminence wants to merge 1 commit into
mainfrom
fix/oci-chart-no-image-overlay
Closed

ci: fix OCI release artifact deploys applying wrong image version overlay#6259
hisImminence wants to merge 1 commit into
mainfrom
fix/oci-chart-no-image-overlay

Conversation

@hisImminence

Copy link
Copy Markdown
Contributor

Problem

When helmChartVersion is set (OCI deploy, e.g. 13.10.0-rc), the runner should deploy the chart exactly as released — no external image version overlay — so the chart's own values.yaml is the sole source of truth.

Two bugs prevented this:

  1. entry.ImageTags=true is hardcoded in all qa-* scenarios (ci-test-config.yaml). This blocked overlay selection entirely, causing the runner to skip both values-digest.yaml and values-latest.yaml, but only by accident — with no OCI awareness.

  2. TEST_IMAGE_TAGS env var was set in the workflow but never consumed. include-image-tags=false had zero effect on the actual deploy — the runner never received the signal.

This is a follow-up to #6253 addressing the review feedback from @bkenez: applying values-latest.yaml from the git checkout onto an OCI artifact is as wrong as injecting SNAPSHOTs — the git checkout drifts from what was baked into the RC.

Fix

runner.go — adds DisableImageTags bool to RunOptions (same runtime-override pattern as UseQA), extracts a named chartRootOverlays(entry, opts) helper, and adds an OCI short-circuit:

func chartRootOverlays(entry Entry, opts RunOptions) []string {
    // enterprise is composable; digest/latest/OCI are mutually exclusive
    if !effectiveImageTags(entry.ImageTags, opts) {
        if opts.ChartRef != "" {
            // OCI artifact: chart's values.yaml is authoritative — no overlay
        } else if opts.UseLatest {
            overlays = append(overlays, "latest")
        } else {
            overlays = append(overlays, "digest")
        }
    }
}

This also removes the 19-line IIFE closure in executeEntry and deduplicates it with resolveChartRootOverlaysQuiet, so all three callers (execute, dry-run, coverage) share one code path.

matrix.go — adds --disable-image-tags CLI flag wired to DisableImageTags.

test-integration-runner.yaml — wires the existing TEST_IMAGE_TAGS env var (previously dead) to --disable-image-tags, for both the install and upgrade blocks.

Flag decision table

For qa-* scenarios (all have image-tags: true in ci-test-config.yaml):

--disable-image-tags --chart-ref set --use-latest Overlay applied Use case
any any none — env-file injects base-image-tags.yaml separately Nightly SNAPSHOT CI
any none RC release testing ← the fix
values-latest.yaml Local chart, pinned release tags
values-digest.yaml Local chart, digest-pinned SNAPSHOTs

In the workflow:

  • TEST_IMAGE_TAGS=true (default, nightly) → no --disable-image-tags → SNAPSHOT path unchanged
  • TEST_IMAGE_TAGS=false + helmChartVersion--disable-image-tags + --chart-ref/--chart-version → no overlay

Test plan

  • Run SM On-Demand 8.8+ Install with include-image-tags=false and helmChartVersion=13.10.0-rc — verify pods use the chart's own image versions (e.g. camunda/camunda:8.8.24), not SNAPSHOTs and not git-checkout overrides
  • Run same workflow with include-image-tags=true and SNAPSHOT inputs — verify existing nightly behaviour unchanged
  • go test ./matrix/... -run "TestEffectiveImageTags |TestChartRootOverlays" passes (14 new cases)

🤖 Generated with Claude Code

…rlay

When helmChartVersion is set (OCI deploy), the runner was either injecting
SNAPSHOT tags from base-image-tags.yaml or falling through to values-digest.yaml
— both wrong. The chart's own values.yaml should be the sole source of truth.

Root causes:
- entry.ImageTags=true hardcoded in all qa-* scenarios blocked overlay selection
- TEST_IMAGE_TAGS env var was set in the workflow but never consumed
- No runtime mechanism to override per-scenario image-tags config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 00:40
@hisImminence hisImminence requested a review from a team as a code owner May 29, 2026 00:40
@hisImminence hisImminence requested review from Ian-wang-liyang and removed request for a team May 29, 2026 00:40

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes OCI release artifact deploys incorrectly applying chart-root image version overlays (values-digest.yaml / values-latest.yaml) from the local git checkout. Adds a DisableImageTags runtime override that trumps the per-scenario image-tags: true flag, and an OCI short-circuit so no overlay is applied when --chart-ref is set. Also wires the previously-dead TEST_IMAGE_TAGS env var in the workflow to a new --disable-image-tags CLI flag.

Changes:

  • Add RunOptions.DisableImageTags + effectiveImageTags helper applied at all 4 entry.ImageTags sites; extract shared chartRootOverlays helper with OCI short-circuit.
  • Add --disable-image-tags CLI flag.
  • Wire TEST_IMAGE_TAGS=false to --disable-image-tags in install/upgrade workflow blocks; add unit tests for the two new helpers.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
scripts/deploy-camunda/matrix/runner.go Adds DisableImageTags option, effectiveImageTags helper, extracted chartRootOverlays with OCI short-circuit.
scripts/deploy-camunda/matrix/matrix_test.go Unit tests for effectiveImageTags and chartRootOverlays.
scripts/deploy-camunda/cmd/matrix.go Adds --disable-image-tags CLI flag.
.github/workflows/test-integration-runner.yaml Wires TEST_IMAGE_TAGS=false to --disable-image-tags in both install and upgrade blocks.

@hisImminence

Copy link
Copy Markdown
Contributor Author

Closing in favour of #6258 by @bkenez which takes a cleaner approach: OCI immutability is activated automatically from --chart-ref (no extra flag required from the workflow), adds a --force-image-overrides escape hatch, and handles the VALUES_CONFIG partial-sanitization case (stripping *_IMAGE_TAG keys while preserving non-image config like index prefixes). Our --disable-image-tags approach requires the workflow to always pass an extra flag in sync with the OCI ref, and drops the entire env-file rather than stripping only image tag keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants