feat: implement deployment-style rollout for replicant sets #1184

Merged
keynslug merged 13 commits into main-3.x from wip/EMQX-14820/replicants-deployment
Apr 15, 2026

@keynslug keynslug commented Apr 15, 2026

Summary

This PR reworks how the Operator manages the set of EMQX replicant nodes.

  1. Replicants are now managed in a Deployment-like pattern.
  2. Scaling up and down is now done in place: the existing "update" replicant set is scaled accordingly.
  3. The rolling update cadence is driven by the maxUnavailable / maxSurge parameters.
  4. The scale-down cadence is also affected by the maxUnavailable parameter.
  5. There is no longer a strict delineation between "update" and "current" replicants: both serve the "listeners" service.
  6. The changed API and operational model should now be friendlier to HPAs.
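
As an illustration of points 1 and 3, here is a minimal sketch of how a Deployment-like rollout step could be budgeted. It assumes the usual Kubernetes Deployment semantics for maxSurge and maxUnavailable; the `rolloutState` type and the `nextStep` and `clamp` functions are hypothetical names and are not taken from the Operator's code.

```go
package main

import "fmt"

// rolloutState is a hypothetical snapshot of both replicant sets.
type rolloutState struct {
	desired   int // desired total number of replicants
	update    int // replicas in the "update" set (new pod template)
	current   int // replicas in the "current" set (old pod template)
	available int // ready replicants across both sets
}

// clamp limits v to the range [0, hi]; a negative hi yields 0.
func clamp(v, hi int) int {
	if v > hi {
		v = hi
	}
	if v < 0 {
		v = 0
	}
	return v
}

// nextStep returns how many "update" replicas may be added and how many
// "current" replicas may be removed in a single reconciliation pass.
// This follows assumed Deployment-style semantics, not the Operator's code.
func nextStep(s rolloutState, maxSurge, maxUnavailable int) (add, remove int) {
	// Surge budget: the total pod count must not exceed desired + maxSurge.
	add = clamp(s.desired+maxSurge-(s.update+s.current), s.desired-s.update)
	// Availability budget: ready pods must not drop below desired - maxUnavailable.
	remove = clamp(s.available-(s.desired-maxUnavailable), s.current)
	return add, remove
}

func main() {
	s := rolloutState{desired: 3, update: 0, current: 3, available: 3}

	add, remove := nextStep(s, 1, 0)
	fmt.Println(add, remove) // prints "1 0": surge first, then evict

	add, remove = nextStep(s, 0, 1)
	fmt.Println(add, remove) // prints "0 1": evict first, then replace
}
```

With maxSurge set to 0 and maxUnavailable set to 1, the first step removes a "current" replicant before any "update" replicant exists, which is the case the first commit message in this PR addresses.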

Follow-up to #1179; includes minor related changes.

Important notes

  1. There are no backward compatibility measures.
    However, the effect should not be as dramatic as that of "feat: switch core StatefulSet to in-place rolling updates" (#1179).

keynslug added 13 commits April 15, 2026 14:28
This commit ensures that the very first node evacuation during a rolling replicant
update has a non-empty list of migration targets. This matters when, for example,
`MaxUnavailable` is 1 and `MaxSurge` is 0, in which case the first "current"
replicant is scaled down before any "update" replicants are up.
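
One possible way to keep the target list non-empty is sketched below. The commit message does not say how the Operator actually selects targets, so the fallback to the remaining "current" replicants is an assumption, and the `replicant` type and `migrationTargets` function are hypothetical.

```go
package main

import "fmt"

// replicant is a hypothetical view of a replicant node.
type replicant struct {
	name  string
	set   string // "update" or "current"
	ready bool
}

// migrationTargets lists nodes that may receive sessions from the evacuating
// node. It prefers ready "update" replicants and, as an assumed fallback,
// uses the remaining ready "current" replicants when no "update" ones exist.
func migrationTargets(all []replicant, evacuating string) []string {
	var preferred, fallback []string
	for _, r := range all {
		if !r.ready || r.name == evacuating {
			continue
		}
		if r.set == "update" {
			preferred = append(preferred, r.name)
		} else {
			fallback = append(fallback, r.name)
		}
	}
	if len(preferred) > 0 {
		return preferred
	}
	return fallback
}

func main() {
	// MaxUnavailable=1, MaxSurge=0: no "update" replicants are up yet.
	nodes := []replicant{
		{name: "cur-0", set: "current", ready: true},
		{name: "cur-1", set: "current", ready: true},
		{name: "cur-2", set: "current", ready: true},
	}
	fmt.Println(migrationTargets(nodes, "cur-0")) // prints "[cur-1 cur-2]"
}
```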
* Scaling up sets RS replicas to desired count directly.
* Scaling down processes `MaxUnavailable` replicants at a time with
  usual admission checking (evacuation gating, session drain, DS
  replication site) before scheduling pods for deletion: attaching
  `pod-deletion-cost` and decrementing RS replica count.
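
The scaling rules in this commit message can be sketched roughly as follows. This is a simplified model, not the Operator's implementation: the `pod` type, the `planScale` function, the `admit` callback (standing in for the evacuation, session drain and DS replication checks) and the cost value `-100` are all assumptions. Only the `controller.kubernetes.io/pod-deletion-cost` annotation key is the standard Kubernetes one.

```go
package main

import "fmt"

// Standard Kubernetes annotation: pods with a lower cost are deleted first
// when a ReplicaSet is scaled down.
const deletionCostAnnotation = "controller.kubernetes.io/pod-deletion-cost"

// pod is a hypothetical, minimal stand-in for a replicant pod.
type pod struct {
	name        string
	annotations map[string]string
}

// planScale returns the new replica count and the pods marked for deletion.
// Scaling up jumps straight to desired; scaling down marks at most
// maxUnavailable admitted pods per pass.
func planScale(pods []*pod, desired, maxUnavailable int, admit func(*pod) bool) (int, []string) {
	current := len(pods)
	if desired >= current {
		return desired, nil
	}
	budget := current - desired
	if budget > maxUnavailable {
		budget = maxUnavailable
	}
	var marked []string
	for _, p := range pods {
		if len(marked) == budget {
			break
		}
		if !admit(p) {
			continue // not safe to remove yet; retry on a later pass
		}
		if p.annotations == nil {
			p.annotations = map[string]string{}
		}
		p.annotations[deletionCostAnnotation] = "-100" // assumed value
		marked = append(marked, p.name)
	}
	// Decrement the replica count only by the number of pods actually marked.
	return current - len(marked), marked
}

func main() {
	pods := []*pod{{name: "r-0"}, {name: "r-1"}, {name: "r-2"}, {name: "r-3"}}
	admitAll := func(*pod) bool { return true }

	replicas, marked := planScale(pods, 1, 2, admitAll)
	fmt.Println(replicas, marked) // prints "2 [r-0 r-1]"
}
```

In this model, going from 4 replicas to 1 with `MaxUnavailable` set to 2 takes two passes, and a pod that fails admission simply waits for a later pass.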
This change ensures that replicants that are already in the process of scaling
down bypass unavailability budget checks, so that their deletion will eventually
complete.
This commit makes all replicants eligible to serve the managed "listeners"
service, not only "update" set replicants, in line with the gradual rolling
update strategy for replicants.
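
To illustrate the effect, here is a small sketch of label-selector matching in the style of a Kubernetes Service. The label keys and values are hypothetical, and the assumption that the service was previously pinned to the "update" set through a template hash label is an illustration rather than something stated in this PR.

```go
package main

import "fmt"

// selects reports whether every key/value pair in selector is present in
// labels, mirroring how a Kubernetes Service selector matches pod labels.
func selects(selector, labels map[string]string) bool {
	for k, v := range selector {
		got, ok := labels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical labels; the Operator's real label keys may differ.
	currentPod := map[string]string{"role": "replicant", "pod-template-hash": "aaa"}
	updatePod := map[string]string{"role": "replicant", "pod-template-hash": "bbb"}

	// Assumed old behaviour: selector pinned to the "update" set.
	pinned := map[string]string{"role": "replicant", "pod-template-hash": "bbb"}
	// New behaviour: any replicant is eligible.
	shared := map[string]string{"role": "replicant"}

	fmt.Println(selects(pinned, currentPod), selects(pinned, updatePod)) // prints "false true"
	fmt.Println(selects(shared, currentPod), selects(shared, updatePod)) // prints "true true"
}
```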
@keynslug keynslug changed the title from "Wip/emqx 14820/replicants deployment" to "feat: implement deployment-style rollout for replicant sets" Apr 15, 2026
codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 82.52033% with 86 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.36%. Comparing base (1f32248) to head (523175f).
⚠️ Report is 15 commits behind head on main-3.x.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/controller/sync_replicant_sets.go | 80.44% | 45 Missing and 16 partials ⚠️ |
| api/v3alpha1/emqx_types_spec.go | 57.14% | 8 Missing and 4 partials ⚠️ |
| internal/controller/sync_core_set.go | 66.66% | 4 Missing and 3 partials ⚠️ |
| internal/controller/load_state.go | 92.72% | 2 Missing and 2 partials ⚠️ |
| internal/controller/util.go | 92.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           main-3.x    #1184      +/-   ##
============================================
+ Coverage     74.17%   75.36%   +1.18%     
============================================
  Files            48       48              
  Lines          3667     4006     +339     
============================================
+ Hits           2720     3019     +299     
- Misses          801      832      +31     
- Partials        146      155       +9     

@keynslug keynslug marked this pull request as ready for review April 15, 2026 12:59
@keynslug keynslug merged commit 951db30 into main-3.x Apr 15, 2026
14 checks passed
@keynslug keynslug deleted the wip/EMQX-14820/replicants-deployment branch April 15, 2026 15:15