feat: add version probe job overrides #153
ashishch432 wants to merge 2 commits into ClickHouse:main
Conversation
Force-pushed from 1ba27e4 to 7bd0bb3
Pull request overview
Adds a versionProbe override mechanism to let users customize the operator’s version-probe Kubernetes Jobs (e.g., disable Istio sidecar injection) and inherits additional scheduling-related fields from the cluster PodTemplate.
Changes:
- Introduces `spec.versionProbe` (`batchv1.JobTemplateSpec`) on the ClickHouse/Keeper CRDs and passes it into version probe reconciliation.
- Implements strategic-merge-patch-based overrides for the generated version probe Job, plus additional PodTemplate scheduling inheritance.
- Adds unit/integration tests and updates generated CRDs + API reference docs.
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| internal/controller/versionprobe.go | Adds VersionProbe override support (SMP) and inherits more PodTemplate scheduling fields for the probe Job. |
| internal/controller/versionprobe_test.go | New unit tests validating override merge semantics. |
| internal/controller/keeper/sync.go | Wires spec.versionProbe into the version probe config for Keeper. |
| internal/controller/keeper/controller_test.go | Adds integration test asserting override propagation + scheduling inheritance. |
| internal/controller/clickhouse/sync.go | Wires spec.versionProbe into the version probe config for ClickHouse. |
| internal/controller/clickhouse/controller_test.go | Adds integration test asserting override propagation + scheduling inheritance. |
| api/v1alpha1/clickhousecluster_types.go | Adds VersionProbe *batchv1.JobTemplateSpec to ClickHouseClusterSpec with schemaless preservation. |
| api/v1alpha1/keepercluster_types.go | Adds VersionProbe *batchv1.JobTemplateSpec to KeeperClusterSpec with schemaless preservation. |
| api/v1alpha1/zz_generated.deepcopy.go | Updates deep-copies for the new VersionProbe field. |
| docs/api_reference.md | Documents the new versionProbe field for both cluster specs. |
| config/crd/bases/clickhouse.com_clickhouseclusters.yaml | CRD schema update for spec.versionProbe. |
| config/crd/bases/clickhouse.com_keeperclusters.yaml | CRD schema update for spec.versionProbe. |
| dist/chart/templates/crd/clickhouseclusters.clickhouse.com.yaml | Helm chart CRD template update for spec.versionProbe. |
| dist/chart/templates/crd/keeperclusters.clickhouse.com.yaml | Helm chart CRD template update for spec.versionProbe. |
Force-pushed from 7bd0bb3 to baac577
```go
Affinity:                  cfg.PodTemplate.Affinity,
ServiceAccountName:        cfg.PodTemplate.ServiceAccountName,
TopologySpreadConstraints: cfg.PodTemplate.TopologySpreadConstraints,
```
I can't see a reason for the probe pod to be scheduled with the server's Affinity or TopologySpread settings, and the ServiceAccount also seems useless for it.
So I'd prefer to inherit only the required parameters from the server Pods and override the rest with the new field. Same for PriorityClassName/RuntimeClassName.
Agreed that some of these inherited fields aren't needed. For ServiceAccount, I remember inheriting it because imagePullSecrets may be tied to an SA, though in practice that's unlikely when a dedicated secret is present. NodeSelector and Tolerations may be reasonable additional defaults to inherit, e.g. when a particular node pool is chosen to run the ClickHouse servers.
But yes, it's sensible to inherit just the existing required parameters; will revert.
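A minimal sketch of the restricted inheritance discussed above, using local stand-in types rather than the Kubernetes API structs (the helper name and final field set are hypothetical; the real operator would work with `corev1.PodSpec`):

```go
package main

import "fmt"

// podScheduling is a stand-in for the subset of corev1.PodSpec fields
// discussed in this thread.
type podScheduling struct {
	NodeSelector map[string]string
	Tolerations  []string
}

// inheritScheduling copies only the fields proposed as reasonable defaults
// (nodeSelector, tolerations); Affinity, ServiceAccountName, etc. are
// deliberately not copied and are left to the versionProbe override.
func inheritScheduling(server podScheduling) podScheduling {
	return podScheduling{
		NodeSelector: server.NodeSelector,
		Tolerations:  server.Tolerations,
	}
}

func main() {
	probe := inheritScheduling(podScheduling{
		NodeSelector: map[string]string{"pool": "clickhouse"},
		Tolerations:  []string{"dedicated=clickhouse:NoSchedule"},
	})
	fmt.Println(probe.NodeSelector["pool"]) // prints "clickhouse"
}
```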
```go
// Merge metadata: Pod-level labels/annotations.
job.Spec.Template.Labels = controllerutil.MergeMaps(job.Spec.Template.Labels, override.Labels)
job.Spec.Template.Annotations = controllerutil.MergeMaps(job.Spec.Template.Annotations, override.Annotations)
```
Why is Job-level metadata merged into Pod metadata?
This is a relic of the old PR: the intention was that labels and annotations can be set at both the Job and the Pod level, and the simple solution was to replicate them to both instead of having two dedicated label/annotation overrides for the Job and the Pod.
If we go with a custom type, what do you think: should label and annotation overrides apply to the Pod, the Job, or both? Or should we have separate overrides for each?
For example, with Istio, labels and annotations only work at the Pod level (spec.template.metadata).
I'm not sure of use cases for overriding Job-level metadata.
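For context, the merge semantics this thread assumes for `controllerutil.MergeMaps` — override entries winning on key conflict — can be sketched as follows (this is an assumed behavior of the helper, not its actual source):

```go
package main

import "fmt"

// mergeMaps returns a new map containing base's entries, overridden by
// override's entries on key conflict. Assumed to mirror what
// controllerutil.MergeMaps does in this codebase.
func mergeMaps(base, override map[string]string) map[string]string {
	out := make(map[string]string, len(base)+len(override))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		out[k] = v
	}
	return out
}

func main() {
	// An operator-generated Pod label set merged with a user override
	// that disables Istio sidecar injection.
	labels := mergeMaps(
		map[string]string{"app": "version-probe", "sidecar.istio.io/inject": "true"},
		map[string]string{"sidecar.istio.io/inject": "false"},
	)
	fmt.Println(labels["sidecar.istio.io/inject"]) // prints "false"
}
```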
```go
// +kubebuilder:validation:Schemaless
// +kubebuilder:pruning:PreserveUnknownFields
```
I'm not sure about Schemaless fields. I'd prefer to introduce our own types, copying k8s structure, but with only supported fields. With SMP overrides, it should be easy to add new fields later
I see. In the previous PR we were looking to use batchv1.JobTemplateSpec as the override type, so I went ahead with that.
But it's fine either way; a simple version probe Job doesn't need too much flexibility.
Which fields do you think we should support in the custom type?
I can think of: Labels, Annotations (see the other comment on whether labels and annotations should be the Job's or the Pod's), Resources, NodeSelector, and Tolerations to start with.
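A sketch of what such a custom type could look like, limited to the fields listed above (the type name, field names, and stand-in value types are hypothetical; a real implementation would use the corresponding `corev1` types):

```go
package main

import "fmt"

// VersionProbeOverride sketches the proposed custom override type,
// restricted to the fields discussed in this thread.
type VersionProbeOverride struct {
	Labels       map[string]string `json:"labels,omitempty"`
	Annotations  map[string]string `json:"annotations,omitempty"`
	Resources    map[string]string `json:"resources,omitempty"`    // stand-in for corev1.ResourceRequirements
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	Tolerations  []string          `json:"tolerations,omitempty"` // stand-in for []corev1.Toleration
}

func main() {
	// Example: a user disabling Istio sidecar injection via the override.
	o := VersionProbeOverride{
		Annotations: map[string]string{"sidecar.istio.io/inject": "false"},
	}
	fmt.Println(o.Annotations["sidecar.istio.io/inject"]) // prints "false"
}
```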
Why

Job metadata override and additional template spec inheritance.

What

Adds a `versionProbe` field (type `batchv1.JobTemplateSpec`) to both `ClickHouseClusterSpec` and `KeeperClusterSpec`. User-provided values are applied as a strategic merge patch on top of the operator-generated Job, allowing overrides for any Job/Pod field (`ttlSecondsAfterFinished`, etc.). Scheduling fields (`nodeSelector`, `tolerations`, `affinity`, `serviceAccountName`, etc.) are also inherited from the cluster's `PodTemplate`.

Example
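A sketch of what the override could look like on a cluster spec (the concrete values and the Istio annotation are illustrative assumptions, not taken from the PR; the schema follows `batchv1.JobTemplateSpec`):

```yaml
apiVersion: clickhouse.com/v1alpha1
kind: ClickHouseCluster
metadata:
  name: example
spec:
  versionProbe:
    spec:
      ttlSecondsAfterFinished: 60
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
```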
Related Issues
Fixes #128