Skip to content

feat: support distroless images in version probe#195

Merged
GrigoryPervakov merged 4 commits into
ClickHouse:mainfrom
ashishch432:fix/distroless
Jun 17, 2026
Merged

feat: support distroless images in version probe#195
GrigoryPervakov merged 4 commits into
ClickHouse:mainfrom
ashishch432:fix/distroless

Conversation

@ashishch432

@ashishch432 ashishch432 commented May 22, 2026

Copy link
Copy Markdown
Contributor

Why

The operator generates probe jobs using below which will fail with distroless images.

sh -c "<binary> --version > /dev/termination-log 2>&1"

What

Replace the command with:

/usr/bin/clickhouse local --query "INSERT INTO FUNCTION file('/dev/termination-log', 'RawBLOB', 'version String') SELECT version()"

Also remove the Keeper probe as discussed.

  • Use /usr/bin/clickhouse local for ClickHouseCluster probe.

  • Use INSERT INTO FUNCTION file(...) instead of INTO OUTFILE.
    Manual Kubernetes validation showed INTO OUTFILE '/dev/termination-log' fails due to ClickHouse temp-file/rename behavior against the Kubernetes termination log path, while file('/dev/termination-log', ...) writes successfully.

Related Issues

We are planning to switch to distroless images once ClickHouse/ClickHouse#105677 is fixed.

@ashishch432 ashishch432 marked this pull request as draft May 22, 2026 21:42
@ashishch432

ashishch432 commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Hi @GrigoryPervakov ,

I opened this draft PR to make version probe Jobs compatible with shell-free/distroless ClickHouse images.

The initial plan was straightforward: replace the current shell-dependent command:

sh -c "<binary> --version > /dev/termination-log 2>&1"

with a shell-free ClickHouse-native command:

/usr/bin/clickhouse local --query \
  "INSERT INTO FUNCTION file('/dev/termination-log', 'RawBLOB', 'version String') SELECT version()"

That works for the official distroless server and keeper images I tested, and this is what the draft PR intitially implemented (current version has the 2 path implementation). While ClickHouse/ClickHouse#105677 means the current published distroless images are not fully shell-free yet, this command should continue to work once that image issue is fixed.

However, the assumption that standard images have a superset of distroless tooling turned out to be backwards here. This command breaks on standard clickhouse-keeper images: they expose /usr/bin/clickhouse-keeper, but not necessarily /usr/bin/clickhouse.

Based on ClickHouse/ClickHouse#98664, ClickHouse/ClickHouse@739b60df502, and ClickHouse/ClickHouse@a05ed90e43c, this appears intentional. Distroless Keeper includes clickhouse-common-static and /usr/bin/clickhouse; standard Keeper uses the standalone Keeper package.

I also looked for a more universal ClickHouse-native way to write the version directly to /dev/termination-log. The closest option I found is still clickhouse local with the file() table function. --version itself only writes to stdout, and I did not find a supported --version-output-path style flag. INTO OUTFILE was also tested, but failed around /dev/termination-log file handling.

Alternatives

Option Shape Pros Tradeoff
Read probe output from pod logs Run clickhouse-server --version / clickhouse-keeper --version directly, then parse bounded logs from the completed version-probe container. Keep termination-message parsing as a backward-compatible fallback for old Jobs. Same logic for standard and distroless images; no image flavor detection. Requires get on pods/log; read-only, but security-sensitive with the manager ClusterRole.
Keep termination-log flow and select by image flavor Use the current shell command for standard images and the clickhouse local ... file('/dev/termination-log') command for distroless images. Smaller change; no new RBAC. Needs either flavor detection, a user-provided flavor, or command/args override support.

For the second option, I see a few possible variants:

  • Auto-detect flavor from the image name/tag, for example distroless.
    This is convenient, but brittle for mirrored images or digest-pinned images.
  • Add an explicit user input such as imageFlavor: Standard | Distroless | Auto.
    This is clearer, but requires an API/CRD change.
  • Expose version probe command/args overrides and document the distroless command for users.
    This is a useful, but it would not fix official distroless images by default unless users configure it.

My preference is the pod-log option if pods/log access is acceptable. It gives us a cleaner default behavior and avoids both shell dependence and image-flavor guessing.

@GrigoryPervakov GrigoryPervakov changed the title refactor: switch version probe to use shell-free command feat: support distroless images in version probe May 26, 2026
@GrigoryPervakov

Copy link
Copy Markdown
Member

Sorry for the long review delay.
I think we should use your approach unconditionally.
And disable the keeper version probes for now, as it is not used for feature gating; probes from the running replicas are enough for the conditions

@GrigoryPervakov GrigoryPervakov self-assigned this Jun 10, 2026
@ashishch432

Copy link
Copy Markdown
Contributor Author

No problem, the original PR to fix distroless hasn't been closed yet either.
I hadn't considered the option to drop the keeper job, assuming we must need them for a reason, but makes sense in retrospec. That's perfect than, I'll use that, thanks.

@ashishch432 ashishch432 force-pushed the fix/distroless branch 2 times, most recently from fe2a0f0 to 71a6e06 Compare June 14, 2026 18:09
@ashishch432 ashishch432 marked this pull request as ready for review June 14, 2026 18:33
@ashishch432 ashishch432 marked this pull request as draft June 14, 2026 19:06
@ashishch432 ashishch432 force-pushed the fix/distroless branch 3 times, most recently from 79a702c to 8920250 Compare June 14, 2026 20:52
@ashishch432 ashishch432 marked this pull request as ready for review June 14, 2026 20:52
Comment thread api/v1alpha1/keepercluster_types.go
Comment thread api/v1alpha1/keepercluster_types.go

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the operator’s version detection to work with distroless ClickHouse images by switching the ClickHouseCluster version probe Job from a sh -c "<binary> --version > /dev/termination-log" pattern to running /usr/bin/clickhouse local --query ... that writes the version directly into the Kubernetes termination log. It also removes KeeperCluster version probe Jobs and derives Keeper version/upgrade conditions from live replica-reported versions instead.

Changes:

  • ClickHouseCluster version probe Job now runs /usr/bin/clickhouse local --query <INSERT INTO FUNCTION file('/dev/termination-log', ...)> (no shell), and the probe config no longer carries a binary name.
  • KeeperCluster no longer creates/owns version probe Jobs; it computes status.version, VersionInSync, and upgrade conditions from live replica versions, with new unit tests.
  • Documentation/CRD descriptions/Helm values updated to mark Keeper probe fields as deprecated; CI compatibility matrix adds a distroless ClickHouse version entry.

Reviewed changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
internal/controller/versionprobe.go Switches probe execution to /usr/bin/clickhouse local --query ... and removes configurable binary from the probe config.
internal/controller/versionprobe_test.go Updates probe override/patch tests to assert the new command + args.
internal/controller/keeper/sync.go Removes Keeper probe Job usage and adds live-replica-based version/upgrade condition evaluation.
internal/controller/keeper/sync_test.go Adds tests for Keeper version status aggregation and upgrade condition behavior.
internal/controller/keeper/controller.go Drops Job ownership/RBAC since Keeper no longer creates version probe Jobs.
internal/controller/keeper/controller_test.go Updates integration expectations: no version probe Jobs created for Keeper, and version-upgrade condition not set when no checker.
internal/controller/clickhouse/sync.go Removes the now-deleted Binary field from the ClickHouse probe config callsite.
api/v1alpha1/keepercluster_types.go Marks Keeper versionProbeTemplate and versionProbeRevision as deprecated; clarifies status.version meaning.
docs/reference/api-reference.mdx Reflects Keeper probe deprecations and updated meaning of Keeper version fields.
docs/guides/configuration.mdx Updates documentation to describe ClickHouse Job probing vs Keeper live version reporting; reformats multiple tables.
config/crd/bases/clickhouse.com_keeperclusters.yaml CRD schema description updates for Keeper probe deprecation text.
dist/chart/templates/crd/keeperclusters.clickhouse.com.yaml Helm CRD template updated with Keeper probe deprecation descriptions.
dist/chart-cluster/values.yaml Notes Keeper versionProbeTemplate as deprecated in values comments.
config/manifests/bases/clickhouse-operator.clusterserviceversion.yaml CSV descriptions updated for Keeper version fields and probe deprecations.
.github/workflows/ci.yaml Adds 26.3-distroless to the compatibility test ClickHouse version matrix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/controller/keeper/sync.go Outdated
@GrigoryPervakov GrigoryPervakov enabled auto-merge (squash) June 17, 2026 15:54
@GrigoryPervakov GrigoryPervakov merged commit 6e96f8e into ClickHouse:main Jun 17, 2026
23 checks passed
@ashishch432

Copy link
Copy Markdown
Contributor Author

Cheers mate, pleasure working with you

@mintlify

mintlify Bot commented Jun 17, 2026

Copy link
Copy Markdown

Docs PR opened: ClickHouse/mintlify-docs-dev#217

Synced the upstream clickhouse-operator docs, updating eight existing pages and adding new monitoring, scaling, and TLS guides.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants