chores: reduce amount of reconciles, cache CH connections between reconciles#245
Open
GrigoryPervakov wants to merge 1 commit into
Pull request overview
This PR reduces unnecessary reconcile churn in transient states and avoids repeatedly reconnecting to ClickHouse nodes by (1) slowing down common poll/requeue loops and (2) introducing a per-cluster ClickHouse connection cache that survives across reconciles and is cleaned up on manager shutdown.
Changes:
- Introduce a per-`ClickHouseCluster` connection pool (`connCache`) and wire it through the ClickHouse controller/commander, including shutdown cleanup.
- Increase several requeue/poll intervals from `1s` to `5s` by adding `RequeueProbePoll` and using it in multiple "wait/poll" paths.
- Update integration tests to use the new dialer + connection cache flow.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/controller/resourcemanager.go | Switch replica-resource requeues to RequeueProbePoll to reduce apiserver churn. |
| internal/controller/keeper/sync.go | Slow down requeueing when cluster is not stable / waiting on replica readiness. |
| internal/controller/constants.go | Add RequeueProbePoll = 5s constant. |
| internal/controller/clickhouse/sync.go | Wire connection cache into commander creation; use RequeueProbePoll for blocked/requeue steps. |
| internal/controller/clickhouse/controller.go | Create/own the connection cache, evict on cluster deletion, and close on manager shutdown. |
| internal/controller/clickhouse/conncache.go | New per-cluster connection cache implementation with credential-hash invalidation. |
| internal/controller/clickhouse/commands.go | Commander now uses cached connections rather than per-reconcile connection maps. |
| internal/controller/clickhouse/commands_test.go | Adapt integration test to use dialer routing and cached connections. |
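The cache behaviour summarized in the table (per-cluster entries, credential-hash invalidation, eviction on cluster deletion, close on shutdown) could look roughly like the sketch below. It is not the code in `conncache.go`: the `conn` type stands in for a real ClickHouse connection, keys are plain strings, and the `Get`/`Evict` signatures omit the logger argument seen in the diff. Only the names `connCache`, `Get`, and `Evict` are taken from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a ClickHouse connection.
type conn struct{ closed bool }

func (c *conn) Close() { c.closed = true }

// entry holds one cluster's connections and the credential hash they were
// opened with. In this sketch the conns map is not safe for concurrent use.
type entry struct {
	credHash string
	conns    map[string]*conn // keyed by replica ID
}

func (e *entry) closeAll() {
	for id, c := range e.conns {
		c.Close()
		delete(e.conns, id)
	}
}

// connCache keeps entries alive between reconciles, keyed by "namespace/name".
type connCache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newConnCache() *connCache {
	return &connCache{entries: map[string]*entry{}}
}

// Get returns the cluster's entry, replacing it (and closing its
// connections) when the credential hash has changed.
func (c *connCache) Get(cluster, credHash string) *entry {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[cluster]; ok {
		if e.credHash == credHash {
			return e
		}
		e.closeAll()
	}
	e := &entry{credHash: credHash, conns: map[string]*conn{}}
	c.entries[cluster] = e
	return e
}

// Evict closes and drops a cluster's entry, e.g. after the cluster is deleted.
func (c *connCache) Evict(cluster string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[cluster]; ok {
		e.closeAll()
		delete(c.entries, cluster)
	}
}

// Close releases every cached connection, e.g. on manager shutdown.
func (c *connCache) Close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for name, e := range c.entries {
		e.closeAll()
		delete(c.entries, name)
	}
}

func main() {
	cache := newConnCache()
	first := cache.Get("default/ch", "hash-1")
	first.conns["0-0"] = &conn{}
	fmt.Println(cache.Get("default/ch", "hash-1") == first) // true: reused
	fmt.Println(cache.Get("default/ch", "hash-2") == first) // false: credentials rotated
}
```

Keying invalidation on a hash of the credentials means a rotated management password drops stale connections on the next reconcile without the cache having to watch the secret itself.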
Comment on lines +62 to 69

```diff
+func newCommander(log controllerutil.Logger, cluster *v1.ClickHouseCluster, secret *corev1.Secret, dialer controllerutil.DialContextFunc, cache *connCache) *commander {
+	log = log.Named("commander")
+	credHash, _ := controllerutil.DeepHashObject(secret.Data[SecretKeyManagementPassword])

 	return &commander{
-		log: log.Named("commander"),
-		conns: map[v1.ClickHouseReplicaID]clickhouse.Conn{},
+		log: log,
+		entry: cache.Get(cluster.NamespacedName(), credHash, log),
 		cluster: cluster,
```
Comment on lines 75 to 80

```go
	if errors.IsNotFound(err) {
		cc.Logger.Info("clickhouse cluster not found")
		cc.connCache.Evict(req.NamespacedName, cc.Logger)

		return ctrl.Result{}, nil
	}
```
Why
In transient states, the operator reconciles far more often than necessary, generating needless load on the apiserver and constantly reconnecting to all nodes.
What