fix: specify which object failed in override error message #686

Merged
Yetkin Timocin (ytimocin) merged 2 commits into kubefleet-dev:main from ytimocin:fix/override-error-specifies-failing-object on May 13, 2026

Yetkin Timocin (ytimocin), Collaborator, commented May 4, 2026

Description of your changes

When an override snapshot fails to apply on a selected resource, the binding's Overridden=False condition message now identifies which override snapshot and which target object failed, instead of surfacing only the underlying patch error.

Before:

Failed to apply the override rules on the resources: failed to process the request due to a client error: add operation does not apply: doc is missing key: /invalid

After:

Failed to apply the override rules on the resources: ClusterResourceOverrideSnapshot "cro-2" failed to apply on Namespace "app": add operation does not apply: doc is missing key: /invalid

Implementation: the wrap is done once at the outer applyOverrides call site so the existing controller-level message trim at controller.go:188-194 works correctly without doubled sentinel prefixes. applyOverrideRules and applyJSONPatchOverride no longer call controller.NewUserError themselves — they return raw errors, and applyOverrides does the sentinel-tagging once with the snapshot + target context. errors.Is(err, controller.ErrUserError) still holds end-to-end.

Boy Scout cleanups limited to the files touched by this fix:

  • override.go: isClusterScopeResource → isClusterScopedResource; capitalised log messages; cluster.ObjectMeta.Labels → cluster.Labels; doc comments on three unexported functions; redundant // do nothing dropped.
  • override_test.go: unmarshl → unmarshal (3 sites); clusterole → clusterRole; corrected the wrong function name in a test failure diagnostic (TestReplaceClusterLabelKeyVariables was logging applyJSONPatchOverride()); expected/expectErr → want/wantErr per project style; test renamed TestApplyOverrides_namespacedScopeResource → TestApplyOverrides_namespaceScopedResource; explanatory comment added for the inverted-looking IsClusterScopedResource: false FakeManager flag.
  • controller_integration_test.go: double-space typos in By(...) strings.

I have:

  • Associated this change with a known KubeFleet Issue (Bug, Feature, etc).
  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

  • make test passes (pkg/controllers/workgenerator runs 86 Ginkgo specs plus unit tests; pkg/utils/condition and pkg/controllers/placement unaffected and green).
  • Unit coverage:
    • TestApplyOverrides_clusterScopedResource and TestApplyOverrides_namespaceScopedResource gained a wantErrSubstr field. Four failure cases now assert on the new message format (snapshot identity + target identity), including two cases that previously had no wantErrSubstr and would have surfaced an empty snapshot name post-change.
    • New case for the IsClusterMatched error path through applyOverrideRules (an invalid MatchExpressions Operator makes metav1.LabelSelectorAsSelector fail). This path was previously unexercised in override_test.go.
  • Integration coverage:
    • The existing Bound ClusterResourceBinding … invalid override context now asserts via MatchRegexp on the new message structure.
    • New sibling context Bound ClusterResourceBinding … invalid resource override mirrors the CRO failure path for ResourceOverrideSnapshot. New fixture invalidResourceOverrideSnapshot (ro-2) added in suite_test.go.
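
For readers unfamiliar with the `wantErrSubstr` pattern mentioned above, here is a rough sketch of how such assertions might look. It is not the test code from this PR: `applyFake`, `testCase`, `checkCase`, and `messagePattern` are hypothetical names, and the real tests use `testing.T` and Gomega's `MatchRegexp` rather than the plain helpers shown here.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// applyFake is a hypothetical stand-in for the function under test.
// When fail is true it returns an error in the new message format.
func applyFake(snapshot string, fail bool) error {
	if !fail {
		return nil
	}
	return fmt.Errorf("ClusterResourceOverrideSnapshot %q failed to apply on Namespace %q: %w",
		snapshot, "app", errors.New("doc is missing key: /invalid"))
}

// testCase mirrors the want/wantErr naming style, plus wantErrSubstr.
type testCase struct {
	name          string
	snapshot      string
	fail          bool
	wantErr       bool
	wantErrSubstr string
}

// checkCase returns a description of the mismatch, or "" when the case passes.
func checkCase(tc testCase) string {
	err := applyFake(tc.snapshot, tc.fail)
	if (err != nil) != tc.wantErr {
		return fmt.Sprintf("%s: error = %v, wantErr %v", tc.name, err, tc.wantErr)
	}
	if err != nil && !strings.Contains(err.Error(), tc.wantErrSubstr) {
		return fmt.Sprintf("%s: error %q does not contain %q", tc.name, err, tc.wantErrSubstr)
	}
	return ""
}

// messagePattern is an assumed regular expression for the new message structure,
// similar in spirit to what an integration test could assert with MatchRegexp.
var messagePattern = regexp.MustCompile(`ClusterResourceOverrideSnapshot "[^"]+" failed to apply on \w+ "[^"]+": .+`)

func main() {
	cases := []testCase{
		{name: "valid override", snapshot: "cro-1", fail: false, wantErr: false},
		{name: "invalid path", snapshot: "cro-2", fail: true, wantErr: true,
			wantErrSubstr: `ClusterResourceOverrideSnapshot "cro-2" failed to apply on Namespace "app"`},
	}
	for _, tc := range cases {
		if msg := checkCase(tc); msg != "" {
			fmt.Println("FAIL:", msg)
		}
	}
	fmt.Println(messagePattern.MatchString(applyFake("cro-2", true).Error()))
}
```

Asserting on a substring that includes both the snapshot name and the target keeps the tests sensitive to an empty snapshot name without pinning the full underlying patch error.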

Special notes for your reviewer

  • Single-call wrap is intentional. applyOverrideRules and applyJSONPatchOverride no longer call controller.NewUserError. Re-wrapping an already-tagged error created a doubled "failed to process the request due to a client error: " prefix that the trim at controller.go:188-194 couldn't strip cleanly. Doing the wrap once at applyOverrides keeps the %w: <suffix> shape the trim expects.
  • Trim untouched. The pre-existing trim at controller.go:188-194 and its //TODO: check if it's user error and set a different failed reason are unchanged; cleaning that up is a separate refactor.
  • Scope discipline. Two follow-on improvements were considered and explicitly deferred to keep this PR focused:
    • Firing a Kubernetes Event on Overridden=False so kubectl describe binding shows the same enriched message.
    • Enriching the CRP/RP Overridden=False rollup condition message with the failing cluster names instead of just a count.

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Signed-off-by: Yetkin Timocin <ytimocin@microsoft.com>
Yetkin Timocin (ytimocin) force-pushed the fix/override-error-specifies-failing-object branch from e827232 to b6d1622 on May 12, 2026 22:37
Yetkin Timocin (ytimocin) added a commit to ytimocin/kubefleet that referenced this pull request May 12, 2026
Address review feedback on PR kubefleet-dev#686:

- formatOverrideTarget: fall back to "Unknown" when the unstructured
  target has an empty Kind, so a malformed snapshot doesn't render as
  `"" "name"` in user-facing error messages.
- applyOverrideRules: drop the NewUnexpectedBehaviorError log wrapper on
  the IsClusterMatched failure path. The outer applyOverrides wrap
  already tags this as ErrUserError, so classifying the log line as
  "unexpected" contradicted the user-error classification.
- Add TestFormatOverrideTarget covering namespaced, cluster-scoped, and
  the new empty-Kind fallback (both scopes).
- Trim the verbose doc and test comments introduced by this PR.

Signed-off-by: Yetkin Timocin <ytimocin@microsoft.com>
Address review feedback on PR kubefleet-dev#686:

- formatOverrideTarget: fall back to "Unknown" when the unstructured
  target has an empty Kind, so a malformed snapshot doesn't render as
  `"" "name"` in user-facing error messages.
- applyOverrideRules: drop the NewUnexpectedBehaviorError log wrapper on
  the IsClusterMatched failure path. The outer applyOverrides wrap
  already tags this as ErrUserError, so classifying the log line as
  "unexpected" contradicted the user-error classification.
- applyOverrides: switch the outer wrap from the inline
  fmt.Errorf("%w: ...", controller.ErrUserError, ...) form to
  controller.NewUserError(fmt.Errorf(...)) to match the form used by
  every other call site in the codebase. Behaviorally identical
  (same Error() string, same errors.Is, same trim output at
  workgenerator/controller.go:188-194).
- Add TestFormatOverrideTarget covering namespaced, cluster-scoped, and
  the new empty-Kind fallback (both scopes).
- Trim the verbose doc and test comments introduced by this PR.

Signed-off-by: Yetkin Timocin <ytimocin@microsoft.com>
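
The empty-Kind fallback described in the commit message above could look roughly like this. The real formatOverrideTarget takes an unstructured target and its exact output for namespaced objects is not shown in this PR, so `formatTarget`, its string parameters, and the `namespace/name` rendering are assumptions made for illustration.

```go
package main

import "fmt"

// formatTarget is a hypothetical simplification of formatOverrideTarget.
// It renders `Kind "name"` for cluster-scoped targets and, as an assumption,
// `Kind "namespace/name"` for namespaced ones. An empty kind falls back to
// "Unknown" so the message never starts with `"" "name"`.
func formatTarget(kind, namespace, name string) string {
	if kind == "" {
		kind = "Unknown"
	}
	if namespace == "" {
		return fmt.Sprintf("%s %q", kind, name)
	}
	return fmt.Sprintf("%s %q", kind, namespace+"/"+name)
}

func main() {
	fmt.Println(formatTarget("Namespace", "", "app"))
	fmt.Println(formatTarget("Deployment", "app", "web"))
	fmt.Println(formatTarget("", "", "app"))
	fmt.Println(formatTarget("", "app", "web"))
}
```

The four calls in `main` correspond to the cases the commit message lists for TestFormatOverrideTarget: namespaced, cluster-scoped, and the empty-Kind fallback in both scopes.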
Yetkin Timocin (ytimocin) merged commit 63da14c into kubefleet-dev:main on May 13, 2026
16 checks passed