[WIP] Reconcile remote datacenters independently #2535

Open: zimnx wants to merge 5 commits into master from independent-datacenters

@zimnx (Collaborator) commented Mar 5, 2025

Refactors the ScyllaDBCluster controller to reconcile each datacenter independently.
Errors or connection issues in a down datacenter no longer affect reconciliation of the other datacenters.

Requires:

Fixes #2494

Contributor

@zimnx: GitHub didn't allow me to request PR reviews from the following users: zimnx.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

TODO

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@scylla-operator-bot scylla-operator-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 5, 2025
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zimnx


@scylla-operator-bot scylla-operator-bot bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from b3235c0 to 433e060 Compare March 5, 2025 16:46
@scylla-operator-bot scylla-operator-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2025
@zimnx zimnx force-pushed the independent-datacenters branch 2 times, most recently from 057822f to 61cac51 Compare March 10, 2025 16:21
@scylla-operator-bot scylla-operator-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from 61cac51 to 110646f Compare March 18, 2025 10:53
@scylla-operator-bot scylla-operator-bot bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2025
@zimnx zimnx force-pushed the independent-datacenters branch from 110646f to f8a89a1 Compare March 28, 2025 12:47
@zimnx zimnx force-pushed the independent-datacenters branch 2 times, most recently from d4b239d to 978ed02 Compare April 9, 2025 13:43
@zimnx zimnx changed the title from "[WIP] Independently reconciled datacenters" to "Reconcile remote datacenters independently" Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2025
@zimnx zimnx added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 9, 2025
@zimnx zimnx changed the title from "Reconcile remote datacenters independently" to "[WIP] Reconcile remote datacenters independently" Apr 9, 2025
@scylla-operator-bot scylla-operator-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2025
@zimnx zimnx force-pushed the independent-datacenters branch 4 times, most recently from d651b70 to 9519be5 Compare April 14, 2025 16:42
@zimnx zimnx force-pushed the independent-datacenters branch 4 times, most recently from 3f3824c to 5487493 Compare April 16, 2025 13:39
zimnx added 5 commits April 16, 2025 17:19
…ster during E2E tests

Framework CleanupInterface, which had two responsibilities, was split into Cleaner and Collect interfaces.
This allows specifying custom collectors for a Cluster.

ScyllaDBClusters created during E2E tests now register a custom collector for each remote Cluster, which collects Namespaces originating from the ScyllaDBCluster object in the JustAfterEach step.
…KubernetesCluster healthcheck probes succeeds

Previously, the aggregated Available condition was set to True when no ClientHealthcheckControllerAvailable condition had False status.
When access to Kubernetes broke and was later restored, nothing removed the failed condition, so the RemoteKubernetesCluster stayed not Available indefinitely.
To fix that, the healthcheck controller now sets this condition according to the result of each healthcheck iteration.
Refactors the ScyllaDBCluster controller to reconcile each datacenter independently.
Errors or connection issues in a down datacenter no longer affect reconciliation of the other datacenters.
…center

Existing Controller Conditions were changed to represent the reconciliation status of a particular Datacenter.
On top of these, aggregated per-Datacenter Conditions were added.
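
One way to picture the two levels of conditions is the Go sketch below. It is an assumption-laden illustration: the `controllerCondition` struct, its `Degraded` field, and the rule "a datacenter is degraded if any of its controller conditions is degraded" are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// controllerCondition is a hypothetical condition reported by one controller
// for one datacenter.
type controllerCondition struct {
	Controller string
	Datacenter string
	Degraded   bool
}

// aggregateDegraded folds controller-level conditions into one value per
// datacenter: a datacenter is degraded if any of its controllers is degraded.
func aggregateDegraded(conds []controllerCondition) map[string]bool {
	out := map[string]bool{}
	for _, c := range conds {
		out[c.Datacenter] = out[c.Datacenter] || c.Degraded
	}
	return out
}

func main() {
	perDC := aggregateDegraded([]controllerCondition{
		{Controller: "Service", Datacenter: "dc1", Degraded: false},
		{Controller: "ConfigMap", Datacenter: "dc1", Degraded: false},
		{Controller: "Service", Datacenter: "dc2", Degraded: true},
		{Controller: "ConfigMap", Datacenter: "dc2", Degraded: false},
	})

	// Print in a stable order, since map iteration order is random.
	names := make([]string, 0, len(perDC))
	for name := range perDC {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s degraded=%t\n", name, perDC[name])
	}
}
```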
The test verifies that the Operator can reconcile healthy datacenters while one is down, and that it fully reconciles the cluster once the down DC is restored.
@zimnx zimnx force-pushed the independent-datacenters branch from 5487493 to 15fbeb0 Compare April 16, 2025 19:06
Contributor

@zimnx: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gke-multi-datacenter-parallel | 15fbeb0 | link | true | /test e2e-gke-multi-datacenter-parallel |


Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

[Multi-DC] Reconcile remote datacenters independently