Skip to content

Reconcile remote datacenters independently #2535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

zimnx
Copy link
Collaborator

@zimnx zimnx commented Mar 5, 2025

Refactors ScyllaDBCluster controller to reconcile each datacenter independenly.
Errors or connection issues of down datacenter no longer affects reconcilation of other datacenter.

Requires:

Fixes #2494

Copy link
Contributor

@zimnx: GitHub didn't allow me to request PR reviews from the following users: zimnx.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

TODO

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@scylla-operator-bot scylla-operator-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 5, 2025
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zimnx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@scylla-operator-bot scylla-operator-bot bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from b3235c0 to 433e060 Compare March 5, 2025 16:46
@scylla-operator-bot scylla-operator-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2025
@zimnx zimnx force-pushed the independent-datacenters branch 2 times, most recently from 057822f to 61cac51 Compare March 10, 2025 16:21
@scylla-operator-bot scylla-operator-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from 61cac51 to 110646f Compare March 18, 2025 10:53
@scylla-operator-bot scylla-operator-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from 110646f to f8a89a1 Compare March 28, 2025 12:47
@zimnx zimnx force-pushed the independent-datacenters branch 2 times, most recently from d4b239d to 978ed02 Compare April 9, 2025 13:43
@zimnx zimnx changed the title [WIP] Independently reconciled datacenters Reconcile remote datacenters independently Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2025
@zimnx zimnx added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 9, 2025
@zimnx zimnx changed the title Reconcile remote datacenters independently [WIP] Reconcile remote datacenters independently Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2025
@zimnx zimnx force-pushed the independent-datacenters branch 4 times, most recently from d651b70 to 9519be5 Compare April 14, 2025 16:42
@zimnx zimnx force-pushed the independent-datacenters branch 5 times, most recently from 5487493 to 15fbeb0 Compare April 16, 2025 19:06
@zimnx zimnx changed the title [WIP] Reconcile remote datacenters independently Reconcile remote datacenters independently Apr 17, 2025
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 17, 2025
@zimnx
Copy link
Collaborator Author

zimnx commented Apr 17, 2025

Last three commits are ready to review, the other two are extracted into separate PRs:
#2599
#2600

/cc rzetelskik mflendrich

@zimnx zimnx force-pushed the independent-datacenters branch from 15fbeb0 to 606e28a Compare April 22, 2025 12:53
@zimnx
Copy link
Collaborator Author

zimnx commented Apr 22, 2025

clean rebase to pick up dependent PRs, ready to review

Copy link
Collaborator

@mflendrich mflendrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My overall perspective - not reflected in the attached comments - is that the MakeRemoteXXX functions encapsulate significant logic in a way that (1) is not clear from the function name/signature, (2) vastly differs between types.

Attaching some comments, but in subsequent rounds I would like to get the function signatures (including mostly nonexistent docstrings) more in check with what actually the logic is.

}

func MakeRemoteEndpointSlices(sc *scyllav1alpha1.ScyllaDBCluster, remoteNamespaces map[string]*corev1.Namespace, remoteControllers map[string]metav1.Object, remoteServiceLister remotelister.GenericClusterLister[corev1listers.ServiceLister], remotePodLister remotelister.GenericClusterLister[corev1listers.PodLister], managingClusterDomain string) ([]metav1.Condition, map[string][]*discoveryv1.EndpointSlice, error) {
func MakeRemoteEndpointSlices(sc *scyllav1alpha1.ScyllaDBCluster, dc *scyllav1alpha1.ScyllaDBClusterDatacenter, remoteNamespace *corev1.Namespace, remoteController metav1.Object, remoteNamespaces map[string]*corev1.Namespace, remoteServiceLister remotelister.GenericClusterLister[corev1listers.ServiceLister], remotePodLister remotelister.GenericClusterLister[corev1listers.PodLister], managingClusterDomain string) ([]metav1.Condition, []*discoveryv1.EndpointSlice, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In human speak, what does this function do? We're adding arguments to an already outrageous signature (8 arguments total) and I'm struggling to wrap my head around the semantics.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In human speak, what does this function do?

It returns required remote EndpointSlices objects reconciled by ScyllaDBCluster controller.

We're adding arguments to an already outrageous signature (8 arguments total) and I'm struggling to wrap my head around the semantics.

There's one new argument - the DC which required objects should be returned. Previously these functions were returning a map of required objects with DC as a key. After the change, instead of map, we accept DC as a parameter and return required objects only for the DC requested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns required remote EndpointSlices objects reconciled by ScyllaDBCluster controller.

I carefully looked into the implementation as this explanation didn't help me understand what the semantics of this function were. This description and the location sounds as if this function was something as simple as a getter, while the implementation does way way way more:

  • for each non-local DC:
    • computes a set of labels for the remote EndpointSlice (take from the scyllacluster or fall back to a hardocded default)
    • populate the conventional protocols/ports
    • computes the effective value of broadcast address type (from the cluster spec or the default) and based on it has completely different logic (just once on startup)
    • has two different strategies for pod/LB broadcast types
      • for podIP-based, go look into the respective pods and hardwire the IPs (effectively almost "reimplement kube-proxy")
      • for ingress-based, go look in the ingress status to assemble the host address
      • in each of these cases, include readiness changes in the result

The code health problem I'm trying to point at stems from the dissonance between the (overwhelming) complexity and the presentation of this function as a mere getter for a list of objects.

I realize that this existed before this PR (however is only relevant for multi-DC).

In a broader context (beyond this PR) this kind of dissonance should not be happening, and my expectation from newly written code is that pieces of logic are separated in a way that preserves unit-testability and SRP, where possible.

The main takeaway from me is that it took solid 30 minutes of reading to (1) understand this function "before" and (2) understand the change in this function. Even if the reader is new to the codebase (like myself) this is a problem.

Based on this (somewhat lengthy) rationale, the code health expectation is that, given a function, it's clear to the reader what the function does. I would not go as far as enforcing a rule that every func must have a docstring, but set an expectation that if a glance at the name/signature (or a short func body) isn't enough to comprehend the semantics, then something is wrong.

There is no code change required in this comment - I'm seeking shared understanding. Adding a docstring or splitting up the logic / making reasonable parts of it unit-testable sound like valid code health improvements in the current situation, but the fact that the situation was created earlier on and this PR only touches it "a little bit" would make such a request awkward.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of if is part of building required object. Common parts could be extracted into shared functions - I'm all for it, but I don't see benefit of extracting every single block into separate function used in one place just for sake of principle.

As discussed offline, I'm not in favor of refactoring any logic other than what this PR has in title, so leaving as is.

remoteConfigMapControllerDegradedCondition = "RemoteConfigMapControllerDegraded"
remoteSecretControllerProgressingCondition = "RemoteSecretControllerProgressing"
remoteSecretControllerDegradedCondition = "RemoteSecretControllerDegraded"
remoteRemoteOwnerControllerDatacenterProgressingConditionFormat = "RemoteRemoteOwnerControllerDatacenter%sProgressing"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the format argument in the middle of the name doesn't sit right with me. Of all the reasons, it makes grepping very awkward.

Is there a reason why it's not e.g. RemoteXXXProgressing-DC-%s ? (I'm not sure if - is legal in a condition name, but you get the idea)

The suggestion would be to keep the condition names as they are, and wrap the condition name generator logic in a generic function like return fmt.Sprintf("%s-DC-%s", condName, dcName)

Copy link
Collaborator Author

@zimnx zimnx Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aggregate conditions based on their suffix. The ones having *DC*Progressing suffix are aggregated into DC*Progressing condition and ones having *Progressing are aggreated into final Progressing condition, same with Degraded and Available.
Having it in the middle is more readable imo, as currently it reads as a sentence: RemoteNamespaceController(for)DatacenterXYZ(is)Progressing and is consistent with other types and their conditions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've started looking into why this didn't sit right with me and did some more research, I arrived at the below conclusions (and would like to change the request of this comment as follows at the end).

This list is effectively an expanded Cartesian product

["Remote"]
x
["RemoteOwner", "ScyllaDBDatacenter", "Namespace", "Service", "EndpointSlice", "Endpoints", "ConfigMap", "Secret"]x
["ControllerDatacenter"]
x
["%s" (to be filled in on a different level, but contributes to cardinality)] 
x
["Progressing", "Degraded"]

and any place that interacts with this list which means at least:

  • the ladder of RunSync calls
  • the ladder of SyncRemoteNamespacedObject calls
  • the ladder of condition types (suffixes) in aggregateDatacenterStatusConditions

need to be unrolled, leading to substantial repetition of logic, which I see as difficult to accept from the maintainability point of view:

  • over time leading to slight divergence in logic between cases that should be solved by design instead (note the "Available" case in aggregateDatacenterStatusConditions),
  • making unit-testing of the logic in a general case not possible (or at least much more difficult because there is no pattern),
  • requiring ongoing maintenance of all instances of the unrolled Cartesian product (e.g. if an additional value gets added) which also poses the risk of ommissions.

As said in the 1st paragraph, I arrived at a conclusion that the order of the fields in the product is not the biggest issue here - it's that the unrolling/repetition is necessary all over the place.

This problem predates this specific PR, but is prevalent across the ScyllaDBCluster implementation (note an earlier discussion on this topic #2576).

To make it actionable in a single PR and to keep my overall goal in mind (which is to curb the cardinality explosion of just about every piece of logic) reasonable, I suggest that:

  • cardinality explosion in the list of condition names be resolved: by breaking up into chunks according to the algebra above (and assembled at call site)
  • no (or little) new logic copy-pasta caused by the necessity of unrolling be added in this PR (the aggregateDatacenterStatusConditions looks like a good example)

I'm happy to offer some refactoring energy if that proves very complex to do on the code written so far.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored conditions added in this PR to suggested "abstraction".

@@ -414,6 +422,55 @@ func (scc *Controller) sync(ctx context.Context, key string) error {
return utilerrors.NewAggregate(errs)
}

func (scc *Controller) aggregateDatacenterStatusConditions(generation int64, conditions *[]metav1.Condition, dc *scyllav1alpha1.ScyllaDBClusterDatacenter) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the signature of this function is broader than necessary:

  • the receiver is unused; suggestion: make it a free function instead of a method
  • this function does not use the entire dc, just dc.Name; suggestion: replace the dc argument with dcName string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer there, I reused existing function aggregating condition by their suffixes

remoteConfigMapControllerDegradedCondition = "RemoteConfigMapControllerDegraded"
remoteSecretControllerProgressingCondition = "RemoteSecretControllerProgressing"
remoteSecretControllerDegradedCondition = "RemoteSecretControllerDegraded"
remoteRemoteOwnerControllerDatacenterProgressingConditionFormat = "RemoteRemoteOwnerControllerDatacenter%sProgressing"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've started looking into why this didn't sit right with me and did some more research, I arrived at the below conclusions (and would like to change the request of this comment as follows at the end).

This list is effectively an expanded Cartesian product

["Remote"]
x
["RemoteOwner", "ScyllaDBDatacenter", "Namespace", "Service", "EndpointSlice", "Endpoints", "ConfigMap", "Secret"]x
["ControllerDatacenter"]
x
["%s" (to be filled in on a different level, but contributes to cardinality)] 
x
["Progressing", "Degraded"]

and any place that interacts with this list which means at least:

  • the ladder of RunSync calls
  • the ladder of SyncRemoteNamespacedObject calls
  • the ladder of condition types (suffixes) in aggregateDatacenterStatusConditions

need to be unrolled, leading to substantial repetition of logic, which I see as difficult to accept from the maintainability point of view:

  • over time leading to slight divergence in logic between cases that should be solved by design instead (note the "Available" case in aggregateDatacenterStatusConditions),
  • making unit-testing of the logic in a general case not possible (or at least much more difficult because there is no pattern),
  • requiring ongoing maintenance of all instances of the unrolled Cartesian product (e.g. if an additional value gets added) which also poses the risk of ommissions.

As said in the 1st paragraph, I arrived at a conclusion that the order of the fields in the product is not the biggest issue here - it's that the unrolling/repetition is necessary all over the place.

This problem predates this specific PR, but is prevalent across the ScyllaDBCluster implementation (note an earlier discussion on this topic #2576).

To make it actionable in a single PR and to keep my overall goal in mind (which is to curb the cardinality explosion of just about every piece of logic) reasonable, I suggest that:

  • cardinality explosion in the list of condition names be resolved: by breaking up into chunks according to the algebra above (and assembled at call site)
  • no (or little) new logic copy-pasta caused by the necessity of unrolling be added in this PR (the aggregateDatacenterStatusConditions looks like a good example)

I'm happy to offer some refactoring energy if that proves very complex to do on the code written so far.

@scylla-operator-bot scylla-operator-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 23, 2025
Refactors ScyllaDBCluster controller to reconcile each datacenter independenly.
Errors or connection issues of down datacenter no longer affects reconcilation of other datacenter.
@zimnx zimnx force-pushed the independent-datacenters branch from 606e28a to 645b2c5 Compare April 24, 2025 19:51
@scylla-operator-bot scylla-operator-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2025
@zimnx zimnx requested a review from mflendrich April 24, 2025 20:02
zimnx added 2 commits April 25, 2025 10:32
…center

Existing Controller Conditions were changed to represent status of reconcilation of particular Datacenter.
On top of these, aggregated per Datacenter Conditions were added.
Test verifies if Operator is able to reconcile healthy datacenters when any is down, and whether it fully reconciles the cluster when down DC is restored.
@zimnx zimnx force-pushed the independent-datacenters branch from 645b2c5 to 2a9c192 Compare April 25, 2025 08:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Multi-DC] Reconcile remote datacenters independently
2 participants