@pavansokkenagaraj pavansokkenagaraj commented Mar 25, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

CAPA EKS reconciler errors when an EKS cluster is deployed with:

  • endpointPrivateAccess: true
  • endpointPublicAccess: false

...and later, the AWSManagedControlPlane (AWSMCP) resource is updated to change publicCIDRs from a list of IPs to an empty list.

This results in a Reconciler error:

271] [capa-controller-manager-694c8f6879-wxg8q] 1 controller.go:326] "msg"="Reconciler error" "error"="failed to reconcile control plane for AWSManagedControlPlane cluster-67bdf4503897e994b608c9f3/alias-eks-privatetest-cp: failed reconciling cluster config: failed to update EKS cluster: InvalidParameterException: Cluster is already at the desired configuration with endpointPrivateAccess: true , endpointPublicAccess: false, and Public Endpoint Restrictions: [42.35.163.177/32, 34.23.247.65/32, 98.11.13.11/32, 52.6.49.73/32, 94.80.29.17/32, 13.52.68.26/32, 34.158.209.13/32, 34.22.106.120/32]\n{\n RespMetadata: {\n StatusCode: 400,\n RequestID: \"10163f97-d89b-44c1-bee1-75a3c476b980\"\n },\n ClusterName: \"alias-eks-privatetest\",\n Message_: \"Cluster is already at the desired configuration with endpointPrivateAccess: true , endpointPublicAccess: false, and Public Endpoint Restrictions: [42.35.163.177/32, 34.23.247.65/32, 98.11.13.11/32, 52.6.49.73/32, 94.80.29.17/32, 13.52.68.26/32, 34.158.209.13/32, 34.22.106.120/32]\"\n}" "AWSManagedControlPlane"={"name":"alias-eks-privatetest-cp","namespace":"cluster-67bdf4503897e994b608c9f3"} "controller"="awsmanagedcontrolplane" "controllerGroup"="controlplane.cluster.x-k8s.io" "controllerKind"="AWSManagedControlPlane" "name"="alias-eks-privatetest-cp" "namespace"="cluster-67bdf4503897e994b608c9f3" "reconcileID"="81216a1c-47f8-4059-b3b4-4f9664c3806f"
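
The error above suggests that EKS treats a change to the public CIDR list as a no-op while endpointPublicAccess is false, and so rejects the update call. The PR's actual diff is not shown in this conversation. What follows is only a minimal sketch of one possible guard, assuming that a CIDR-only difference can be ignored while public access stays disabled. The type `endpointAccess` and the functions `sameCIDRs` and `needsEndpointUpdate` are hypothetical names for illustration, not CAPA or AWS SDK identifiers.

```go
package main

import (
	"fmt"
	"sort"
)

// endpointAccess is a hypothetical, simplified stand-in for the endpoint
// settings found on the AWSManagedControlPlane spec and on the live EKS cluster.
type endpointAccess struct {
	Private     bool
	Public      bool
	PublicCIDRs []string
}

// sameCIDRs reports whether two CIDR lists hold the same entries, ignoring order.
func sameCIDRs(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	// Sort copies so the callers' slices are left untouched.
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

// needsEndpointUpdate decides whether an update call is worth sending.
// Assumption (not confirmed by the PR): while public access is disabled,
// a difference in PublicCIDRs alone should not trigger an update.
func needsEndpointUpdate(current, desired endpointAccess) bool {
	if current.Private != desired.Private || current.Public != desired.Public {
		return true
	}
	if !desired.Public {
		return false
	}
	return !sameCIDRs(current.PublicCIDRs, desired.PublicCIDRs)
}

func main() {
	current := endpointAccess{
		Private:     true,
		Public:      false,
		PublicCIDRs: []string{"42.35.163.177/32", "34.23.247.65/32"},
	}
	desired := endpointAccess{Private: true, Public: false} // publicCIDRs emptied

	fmt.Println(needsEndpointUpdate(current, desired)) // prints false

	desired.Public = true
	fmt.Println(needsEndpointUpdate(current, desired)) // prints true
}
```

With a guard of this shape, the scenario described above (private access on, public access off, publicCIDRs changed to an empty list) would skip the update call instead of surfacing the InvalidParameterException.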

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes kubernetes-sigs#5441

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:


mjlshen and others added 30 commits March 5, 2025 18:22
🌱 Fix test version string in order to use manifests from source files
📖 Clarify that the ROSA provider is currently for ROSA HCP clusters
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.33.0 to 0.36.0.
- [Commits](golang/net@v0.33.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
…ot/go_modules/hack/tools/golang.org/x/net-0.36.0

🌱 Bump golang.org/x/net from 0.33.0 to 0.36.0 in /hack/tools
…wsmachines

✨ Add AWSMachines to back the EC2 instances in AWSMachinePools and AWSManagedMachinePools
…936-upstream

✨ Add support for public-only networking
…ease28

🌱 chore: update metadata for v2.8.x release series
Sets paused condition on AWSMachine

Sets paused on AWSCluster

Sets paused condition on AWSManagedMachinePool

Sets paused condition for ROSAMachinePool

Sets paused condition for ROSAControlPlane

Sets paused condition on AWSManagedControlPlane

Sets paused condition on EKSConfig

Adds paused helper functions

This change adds the paused helper utilities from upstream cluster api.
It modifies them to not require v1beta2conditions.

This is so we can use similar code until the conditions changes are out
of beta.
…ndition

✨ Set Paused condition on reconciled resources status upon reconciliation being paused
Start with "unmanaged", or non-hosted control planes.

Other controllers that can be optional, such as the EKS, ROSA, and
MachinePool ones, are currently managed with feature flags. When they
graduate, they should be controlled by the `--disable-controllers` flag.
Updates AWSManagedCluster with Paused Condition

This change:

 - Updates the API for AWSManagedCluster to include a conditions field.
 - Sets `Paused` in the conditions if the controller is paused.

Updates ROSACluster with Paused Condition

This change:
  - Updates the API for ROSACluster to include a conditions field.
  - Sets `Paused` in the conditions if the controller is paused.

Updates generated API types
✨ Support for BootstrapSelfManagedAddons flag for EKS cluster creation
…dcluster-paused

✨ Updates AWSManagedCluster, ROSACluster with Paused Condition
While kubernetes-sigs#5394 and kubernetes-sigs#5383 added support for patching a cluster/status in the
cluster.x-k8s.io API group, neither added the patch permission for the
associated controllers.

This commit adds RBAC support for patching cluster/status

Signed-off-by: Nolan Brubaker <[email protected]>
🐛 Allow controllers to patch clusters/status
When ensuring the paused condition for AWSCluster, we were accidentally
passing in the CAPI Cluster as the object instead of the AWSCluster.

This caused a delay in reconciliation as the wrong object was being
patched. It also meant we added an additional permission that we didn't
need.

Signed-off-by: Richard Case <[email protected]>
+ rename 2025-01-07-aws-self-managed-feature-gates.md to be consistent
  with the rest
…aused_fixed

🐛 fix: ensure patching correct object for paused
richardcase and others added 17 commits March 21, 2025 12:19
The EFS e2e test was breaking for 2 reasons:

1. Running out of disk space on the control plane nodes.
They only had 8 GB, so this has been increased to 16 GB.
2. The workload being deployed to test EFS was using CentOS, which has been
discontinued for a long time now. So changed to use Ubuntu.

Also small updates to logging for the ELB test.

Signed-off-by: Richard Case <[email protected]>
AWSCluster was not reconciling when starting after an upgrade. It had
old logic to compare versions and not do anything. We want to reconcile
even if there are no changes to the AWSCluster as the ELB logic has
changed. Also, there may be other changes like this in future.

Change the SetupWithManager logic to be more like the standard we see
with other infrastructure providers.

Signed-off-by: Richard Case <[email protected]>
Signed-off-by: Nolan Brubaker <[email protected]>
…test

🐛 fix: efs & elb upgrade e2e tests
…b-image-go-1.23

🌱 cloudbuild: bump gcb image to get go 1.23
Signed-off-by: Nolan Brubaker <[email protected]>
Bumps [github.com/golang-jwt/jwt/v4](https://github.com/golang-jwt/jwt) from 4.5.1 to 4.5.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](golang-jwt/jwt@v4.5.1...v4.5.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps [github.com/golang-jwt/jwt/v4](https://github.com/golang-jwt/jwt) from 4.5.1 to 4.5.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](golang-jwt/jwt@v4.5.1...v4.5.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
…-fixes

📖 Document latest release obstacles
…ot/go_modules/github.com/golang-jwt/jwt/v4-4.5.2

🌱 Bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2
…ot/go_modules/hack/tools/github.com/golang-jwt/jwt/v4-4.5.2

🌱 Bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2 in /hack/tools
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](golang-jwt/jwt@v5.2.1...v5.2.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
…ot/go_modules/hack/tools/github.com/golang-jwt/jwt/v5-5.2.2

🌱 Bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 in /hack/tools
@spectro-prow

@pavansokkenagaraj: The label(s) kind/bug cannot be applied, because the repository doesn't have them


@spectro-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pavansokkenagaraj
To complete the pull request process, please assign after the PR has been reviewed.
You can assign the PR to them by writing /assign in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Development

Successfully merging this pull request may close these issues.

Reconciler error when updating AWSMCP publicCIDRs to empty list with endpointPrivateAccess: true and endpointPublicAccess: false