forwardport - Fix drift correction regression when correctDrift.force is false (#4879) #4939

Merged
thardeck merged 1 commit into rancher:main from 0xavi0:main-forwartport-drift-correction on Apr 14, 2026

@0xavi0 0xavi0 commented Apr 7, 2026

The Helm v4 migration set ServerSideApply to "auto" for non-force rollbacks. This caused Helm to auto-detect and reuse the apply method from the original release, which is SSA. Server-Side Apply tracks field ownership per field manager, so rollback only reverts fields owned by Helm's manager — manual changes owned by a different manager are silently ignored.

Setting ServerSideApply to "false" for all rollbacks forces client-side three-way merge, which compares the full resource state and patches all drifted fields regardless of ownership. This restores the Helm v3 behavior.
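
The difference between the two rollback modes can be illustrated with a toy model. This is only a sketch of the behaviour described above, not Helm's or Kubernetes' actual implementation: the `field` type, the manager names `helm` and `kubectl-edit`, and both `rollback*` functions are hypothetical, and the three-way merge is reduced to a plain live-versus-desired comparison.

```go
package main

import "fmt"

// field is a toy stand-in for one field of a live object: its current value
// and the field manager that last wrote it.
type field struct {
	value   string
	manager string
}

// helmManager is a hypothetical manager name used only in this sketch.
const helmManager = "helm"

// rollbackSSA models the SSA behaviour described above: a field is reset
// only if Helm's manager still owns it.
func rollbackSSA(live map[string]field, desired map[string]string) {
	for name, want := range desired {
		if live[name].manager == helmManager {
			live[name] = field{value: want, manager: helmManager}
		}
	}
}

// rollbackClientSide models the client-side merge, reduced to its effect:
// every field whose live value differs from the desired one is patched,
// regardless of owner.
func rollbackClientSide(live map[string]field, desired map[string]string) {
	for name, want := range desired {
		if live[name].value != want {
			live[name] = field{value: want, manager: helmManager}
		}
	}
}

// driftedState returns a live object in which a second manager changed replicas.
func driftedState() map[string]field {
	return map[string]field{
		"image":    {value: "nginx:1.25", manager: helmManager},
		"replicas": {value: "3", manager: "kubectl-edit"},
	}
}

func main() {
	desired := map[string]string{"image": "nginx:1.25", "replicas": "1"}

	ssa := driftedState()
	rollbackSSA(ssa, desired)
	fmt.Println("SSA rollback, replicas:", ssa["replicas"].value)

	csa := driftedState()
	rollbackClientSide(csa, desired)
	fmt.Println("client-side rollback, replicas:", csa["replicas"].value)
}
```

In this model the SSA path leaves `replicas` at the manually edited value because another manager owns it, while the client-side path resets it to the desired value.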

Refers to: #4938
Forwardport #4879


Additional Information

Checklist

- [ ] I have updated the documentation via a pull request in the fleet-product-docs repository.

…cher#4879)

* Fix drift correction regression when correctDrift.force is false

The Helm v4 migration set ServerSideApply to "auto" for non-force
rollbacks. This caused Helm to auto-detect and reuse the apply method
from the original release, which is SSA. Server-Side Apply tracks field
ownership per field manager, so rollback only reverts fields owned by
Helm's manager — manual changes owned by a different manager are
silently ignored.

Setting ServerSideApply to "false" for all rollbacks forces client-side
three-way merge, which compares the full resource state and patches all
drifted fields regardless of ownership. This restores the Helm v3
behavior.

Refers to: rancher#4878

---------

Signed-off-by: Xavi Garcia <xavi.garcia@suse.com>
@0xavi0 0xavi0 added this to the v2.15.0 milestone Apr 7, 2026
@0xavi0 0xavi0 self-assigned this Apr 7, 2026
@0xavi0 0xavi0 added this to Fleet Apr 7, 2026
@0xavi0 0xavi0 added the kind/bug label Apr 7, 2026
@0xavi0 0xavi0 moved this to 👀 In review in Fleet Apr 7, 2026
@0xavi0 0xavi0 marked this pull request as ready for review April 7, 2026 08:42
@0xavi0 0xavi0 requested a review from a team as a code owner April 7, 2026 08:42
Copilot AI review requested due to automatic review settings April 7, 2026 08:42

Copilot AI left a comment

Pull request overview

This PR fixes a drift-correction regression introduced during the Helm v4 migration by forcing rollbacks to always use client-side three-way merge (disabling Server-Side Apply), ensuring rollback reverts drifted fields regardless of SSA field ownership.

Changes:

  • Always set Helm rollback ServerSideApply to "false" so non-force drift correction behaves like Helm v3 (no SSA field-ownership limitation).
  • Expand/adjust drift-correction integration tests to cover additional drift scenarios (labels, replicas, external deletions, keepFailHistory, multi-resource drift, paused/off-schedule behavior, and a documented Helm limitation).
  • Update test assets to explicitly set Service type: ClusterIP to support immutable-field drift scenarios.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/helmdeployer/rollback.go | Forces client-side apply for all rollbacks to restore drift correction behavior when correctDrift.force is false. |
| integrationtests/agent/bundle_deployment_drift_test.go | Adds/updates integration coverage for multiple drift scenarios and rollback-history behavior. |
| integrationtests/agent/assets/deployment-v1.yaml | Sets Service type explicitly for drift tests involving Service type changes. |

Comment on lines +326 to +327
err = k8sClient.Get(ctx, nsn, &cm)
Expect(apierrors.IsNotFound(err)).To(BeTrue())

Copilot AI Apr 7, 2026

The test assumes the resource is gone immediately after Delete(), but Kubernetes deletion is asynchronous (and may be delayed by finalizers). This can make the NotFound assertion flaky; consider wrapping the Get/IsNotFound check in Eventually() to wait for actual removal.

Suggested change
err = k8sClient.Get(ctx, nsn, &cm)
Expect(apierrors.IsNotFound(err)).To(BeTrue())
Eventually(func(g Gomega) {
    err := k8sClient.Get(ctx, nsn, &cm)
    g.Expect(apierrors.IsNotFound(err)).To(BeTrue())
}).Should(Succeed())

Comment on lines +360 to +361
err = k8sClient.Get(ctx, nsn, &corev1.Service{})
Expect(apierrors.IsNotFound(err)).To(BeTrue())

Copilot AI Apr 7, 2026

The test assumes the resource is gone immediately after Delete(), but Kubernetes deletion is asynchronous (and may be delayed by finalizers). This can make the NotFound assertion flaky; consider wrapping the Get/IsNotFound check in Eventually() to wait for actual removal.

Suggested change
err = k8sClient.Get(ctx, nsn, &corev1.Service{})
Expect(apierrors.IsNotFound(err)).To(BeTrue())
Eventually(func() error {
    err := k8sClient.Get(ctx, nsn, &corev1.Service{})
    if apierrors.IsNotFound(err) {
        return nil
    }
    if err != nil {
        return err
    }
    return fmt.Errorf("service %s still exists", nsn.String())
}, time.Minute, time.Second).Should(Succeed())

Comment on lines +467 to +468
err = k8sClient.Get(ctx, nsn, &cm)
Expect(apierrors.IsNotFound(err)).To(BeTrue())

Copilot AI Apr 7, 2026

The test assumes the resource is gone immediately after Delete(), but Kubernetes deletion is asynchronous (and may be delayed by finalizers). This can make the NotFound assertion flaky; consider wrapping the Get/IsNotFound check in Eventually() to wait for actual removal.

Suggested change
err = k8sClient.Get(ctx, nsn, &cm)
Expect(apierrors.IsNotFound(err)).To(BeTrue())
Eventually(func() bool {
    err := k8sClient.Get(ctx, nsn, &cm)
    return apierrors.IsNotFound(err)
}).Should(BeTrue())

@thardeck thardeck merged commit 11f8cbe into rancher:main Apr 14, 2026
26 checks passed
@github-project-automation github-project-automation Bot moved this from 👀 In review to ✅ Done in Fleet Apr 14, 2026