Skip to content

Conversation

@xiaoweim
Copy link
Collaborator

Change description

Fixes #

Tests you have done

  • Run make ready-pr to ensure this PR is ready for review.
  • Perform necessary E2E testing for changed resources.

@xiaoweim
Copy link
Collaborator Author

/assign @justinsb

@justinsb
Copy link
Collaborator

/approve
/lgtm

/hold for @yuwenma to approve

@xiaoweim
Copy link
Collaborator Author

/assign @yuwenma

@justinsb
Copy link
Collaborator

justinsb commented May 15, 2025

A few things @cheftako pointed out to me:

  1. We don't want a "gap" of versions, we should have at least each minor version in the window available here (c.f. https://github.com/GoogleCloudPlatform/k8s-config-connector/tree/2a5c056279d917232bc9deaa1f0ce02da025bb5f/operator/autopilot-channels/packages/configconnector ) Whether we want to support minor patch versions IDK, I would guess "any versions we shipped" is going to be the easiest policy

  2. We should be careful about whether downgrade is safe. You're not introducing this here so we can likely treat this as a separate issue. Walter was worried about ClusterRoles / Roles; I think I'm more worried about CRD removal or field removal (via downgrade). We're introducing more testing into the operator so I think we can deal with this there ( tests: add golden tests for operator #4528 )

@yuwenma
Copy link
Collaborator

yuwenma commented May 15, 2025

Could you share more context about this change? From the discussion, my read as we want to keep all the passed releases until the latest one in GKE add-on, which means the manifests contain 1.126.0, 1.127.0, 1.128.0, 1.129.2, 1.130.0, 1.131.0.

@xiaoweim xiaoweim force-pushed the restore_1.126.0_release branch from da6aaf1 to 1785b13 Compare May 15, 2025 20:08
@google-oss-prow google-oss-prow bot removed the lgtm label May 15, 2025
@xiaoweim
Copy link
Collaborator Author

I just updated the PR to not leave a gap in between older versions and restored all the shipped versions up until 1.126.0

Copy link
Collaborator

@jingyih jingyih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we could only add back the per-namespace-components.yaml files from the previous verions?

It looks like we are only using version from CCC to load per namespace components. [1]

This way it is more clear that the rollback does not include CRDs and ClusterRoels.

[1]

files, err := p.repo.LoadNamespacedComponents(ctx, componentName, version)

@xiaoweim
Copy link
Collaborator Author

I am wondering if we could only add back the per-namespace-components.yaml files from the previous verions?

It looks like we are only using version from CCC to load per namespace components. [1]

This way it is more clear that the rollback does not include CRDs and ClusterRoels.

[1]

files, err := p.repo.LoadNamespacedComponents(ctx, componentName, version)

I think that is a good idea. Let me update the PR!

@xiaoweim xiaoweim force-pushed the restore_1.126.0_release branch from 1785b13 to 9b9ec90 Compare May 20, 2025 20:28
Copy link
Collaborator

@jingyih jingyih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized we also have 1.129 and 1.130 folders that include everything. For consistency, we should either remove the CRDs and ClusterRoles from those folders or add them back for every version. Keeping only the per-namespace file clarifies what happens during a downgrade, but it may confuse users browsing the release folders as they'll see only one file. I’ll defer to @yuwenma on what we want to include in each version.

@yuwenma
Copy link
Collaborator

yuwenma commented May 22, 2025

My understanding is that we need to contain both 0-cnrm-system.yaml and per-namespace-components.yaml for each namespace version. Because user can choose a stale release as their very first installation in which case the namespace and RBAC will be missing. For example, user choose to use the 1.132 bundle to specifically install 1.126 (via CCC spec.version), there could be a chance that only the ./packages/1.126 are applied, in which case the 0-cnrm-system.yaml will be missing. Or does the CCC/CC specifically install this file in some other steps?

@yuwenma
Copy link
Collaborator

yuwenma commented May 22, 2025

/assign @cheftako

@xiaoweim
Copy link
Collaborator Author

My understanding is that we need to contain both 0-cnrm-system.yaml and per-namespace-components.yaml for each namespace version. Because user can choose a stale release as their very first installation in which case the namespace and RBAC will be missing. For example, user choose to use the 1.132 bundle to specifically install 1.126 (via CCC spec.version), there could be a chance that only the ./packages/1.126 are applied, in which case the 0-cnrm-system.yaml will be missing. Or does the CCC/CC specifically install this file in some other steps?

My understanding is that when we have specify the version field in the CCC, we load the version specific manifest but only looking for packages/${version}/namespaced/per-namespace-components.yaml https://github.com/GoogleCloudPlatform/k8s-config-connector/blob/master/operator/pkg/manifest/repo.go#L115.

I did not quite get where the user chooses to use 1.132.0 bundle to install 1.126.0. IIUC if the user install KCC 1.132 version first, and then update the CCC Version field to 1.126.0, their cluster should already have CRD and RBAC permissions installed and only loads the /per-namespace-components.yaml manifest.

@xiaoweim xiaoweim force-pushed the restore_1.126.0_release branch from 9b9ec90 to 2cfb35c Compare May 22, 2025 23:27
@xiaoweim
Copy link
Collaborator Author

I have just updated the PR to include the current CRD and RBAC into previous KCC release versions with old operators.

@yuwenma
Copy link
Collaborator

yuwenma commented May 27, 2025

/approve

@google-oss-prow
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, yuwenma

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yuwenma
Copy link
Collaborator

yuwenma commented May 27, 2025

One follow-up suggestion about fixing the propose-tag PR, non blocker to this change. Thanks and good job @xiaoweim

@xiaoweim
Copy link
Collaborator Author

One follow-up suggestion about fixing the propose-tag PR, non blocker to this change. Thanks and good job @xiaoweim

I have created #4646 to update the propose-tag script.

@cheftako
Copy link
Collaborator

/lgtm
/hold while Xiaowei checks the CRD files are consistent. (Github makes validating these difficult) Can you just confirm there are not diffs between the copies?

@google-oss-prow google-oss-prow bot added the lgtm label May 30, 2025
@cheftako
Copy link
Collaborator

cheftako commented Jun 2, 2025

/unhold after checking with Xiaowei

@cheftako
Copy link
Collaborator

cheftako commented Jun 2, 2025

/remove hold

@xiaoweim
Copy link
Collaborator Author

xiaoweim commented Jun 2, 2025

/unhold

@google-oss-prow google-oss-prow bot merged commit 3c3527b into GoogleCloudPlatform:master Jun 2, 2025
44 of 46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants