Skip to content

[main] Add support for a FieldSelector in the ResourceSet#742

Merged
mallardduck merged 3 commits intorancher:mainfrom
thatmidwesterncoder:add_field_selector_support
Apr 1, 2025
Merged

[main] Add support for a FieldSelector in the ResourceSet#742
mallardduck merged 3 commits intorancher:mainfrom
thatmidwesterncoder:add_field_selector_support

Conversation

@thatmidwesterncoder
Copy link
Contributor

Issue: rancher/rancher#48558

Problem:

Certain very important secrets are not being backed up by the backup-restore operator when the cluster-name + pool-name gets to be longer than 63 characters, this is because they fail to match the v2prov regex.

Solution:

Add support for selecting resources via fieldSelector in order to be able to effectively filter on specific secret types that we care about. This will be useful for other "custom" secret types as well as being able to narrow things that have a certain label + fieldSelector.

The solution is implemented by just tacking on a fieldSelector to the initial API listing if it is present. By default it selects everything mimicking old behavior (exactly like the label selector api).

Alternatives:

  • I did go down the route of limiting the secret name length here but that seems to introduce a lot of regression risk as well as feeling really hacky
  • Changing rancher to use annotations/labels etc rather than just string names, but once again this would have high regression risk since we would effectively be completely changing how rancher looks up secrets pertaining to a cluster/pool for machine-state/machine-plan/cluster-state secrets

Testing:

I added e2e integration tests for this which fetch the tarball and extract it into memory - and then validate that the custom-resource-set was used and backed up a single secret (and not the other one)

@thatmidwesterncoder thatmidwesterncoder requested a review from a team as a code owner March 24, 2025 19:29
@snasovich snasovich requested review from a team and jakefhyde March 25, 2025 21:11

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a problem for this PR, but what do we do about clusters in a different namespace than fleet-default? cc @rancher/observation-backup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah great question - those weren't being backed up before I would assume as well right? at least thats how this current ResourceSet reads.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed a problem and there's an open issue for it here: #574

A workaround is to use the "resources.cattle.io/backup=true" label to make sure the secret is included in the backup.

@thatmidwesterncoder thatmidwesterncoder force-pushed the add_field_selector_support branch from 4fee098 to 5fd4256 Compare March 27, 2025 16:55
jakefhyde
jakefhyde previously approved these changes Mar 27, 2025
@thatmidwesterncoder thatmidwesterncoder force-pushed the add_field_selector_support branch from 5fd4256 to 3ea4287 Compare March 27, 2025 18:51
@mallardduck mallardduck requested a review from jbiers March 28, 2025 13:23
@mallardduck
Copy link
Member

mallardduck commented Mar 28, 2025

Looks like @jakefhyde gave it an approval so I've assigned @jbiers for O&B side review before merge. Jake / @thatmidwesterncoder , is this something that is expected to backport to 2.11 (or other versions)? If yes, is it considered a critical issue that should be targeting 2.11.1? Forget if we covered that some where else but just want to verify the scope of this for backport considerations.

(Essentially, because BRO hasn't branched after 2.11.0 release currently - if we want it in 2.11.1 we should merge first and then branch. If we only want this in 2.12.0 then we should branch first then merge this.)

@thatmidwesterncoder
Copy link
Contributor Author

@mallardduck yes - we want it for 2.11.1 so we'll merge -> branch, we may want it in the 2.10 release line as well so I'll open a backport PR there after it gets merged.

Copy link
Contributor

@jbiers jbiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nitpick just to keep consistency in the codebase, once fix will be ready for approval

Copy link
Contributor

@jbiers jbiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

It's best if we also get this issue tested by QA on our side as well. Should we have a separate issue or can I add rancher/rancher#48558 to my team's dashboard too? @thatmidwesterncoder

@mallardduck
Copy link
Member

I think we may need 2 issues to track as with one, only one of the QA teams can mark it as done. I believe when the first team does that it would (likely) be hidden for the following team. Since their filters probably auto mark things as done if the issue is closed too.

@thatmidwesterncoder
Copy link
Contributor Author

That sounds accurate. I assume our QAs would like to close this out too e.g. validating that the provisioning cluster objects are all present after restore.

@jbiers
Copy link
Contributor

jbiers commented Mar 31, 2025

Created rancher/rancher#49647 to track the issue on our side

@mallardduck mallardduck merged commit 6575667 into rancher:main Apr 1, 2025
7 checks passed
@mallardduck
Copy link
Member

mallardduck commented Apr 1, 2025

Hopefully didn't pull the gun too early @thatmidwesterncoder / @jakefhyde ...this came up in our team sync up and with 2.11.1 looming we want to get this to QA ASAP. So I'm also preparing an RC and then making rancher/charts PR. Once merged into charts and ready for testing I'll update each respective PR with notes on what version to test with.

PS I also did the branching thing...so this change will be ready for 2.11.1 and 2.12.0; so we may need to consider back ports for 2.10 and 2.9 (maybe) now that this is moving forward.

@thatmidwesterncoder thatmidwesterncoder deleted the add_field_selector_support branch April 1, 2025 18:13
thatmidwesterncoder added a commit to thatmidwesterncoder/backup-restore-operator that referenced this pull request Apr 1, 2025
* add support for a FieldSelector in the ResourceSet

* add cluster-state, machine-state, machine-plan to v2prov resource sets

* add integration test for custom resource set including field selectors
mallardduck pushed a commit that referenced this pull request Apr 2, 2025
* add support for a FieldSelector in the ResourceSet

* add cluster-state, machine-state, machine-plan to v2prov resource sets

* add integration test for custom resource set including field selectors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants