should fail authentication when KSA is not bound to IAM service account#1356
should fail authentication when KSA is not bound to IAM service account#1356tanuja-sunda73 wants to merge 3 commits into
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tanuja-sunda73 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @tanuja-sunda73. Thanks for your PR. I'm waiting for a GoogleCloudPlatform member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Code Review
This pull request introduces a new E2E test suite for Workload Identity Federation and updates the GCSFuseCSITestDriver to support WIF-specific IAM member formatting. Review feedback highlights the need to maintain the iamPropagationWaitTime constant rather than using hardcoded values, suggests using idiomatic Gomega functions like Eventually and Consistently for test polling, and recommends removing the unused addWorkloadIdentityBinding function.
| e2eframework.Logf("Waiting 5 minutes for Billing IAM policy propagation to prevent 403 flakiness...") | ||
| time.Sleep(5 * time.Minute) |
There was a problem hiding this comment.
The removal of the iamPropagationWaitTime constant and replacing it with a hardcoded 5 * time.Minute value reduces maintainability. Furthermore, the reduction from 10 minutes to 5 minutes is not explained and could lead to flakiness if IAM propagation takes longer than expected. It is better to retain the constant and use it here and in other similar locations (e.g., line 172).
| for i := 0; i < 60; i++ { | ||
| pod, err := f.ClientSet.CoreV1().Pods(f.Namespace.Name).Get(ctx, podName, metav1.GetOptions{}) | ||
| gomega.Expect(err).ToNot(gomega.HaveOccurred()) | ||
| for _, cs := range pod.Status.ContainerStatuses { | ||
| if cs.State.Waiting != nil && cs.State.Waiting.Reason == "CreateContainerError" { | ||
| ginkgo.By(fmt.Sprintf("CreateContainerError on %s: %s", cs.Name, cs.State.Waiting.Message)) | ||
| } | ||
| } | ||
|
|
||
| events, err := f.ClientSet.CoreV1().Events(f.Namespace.Name).List(ctx, | ||
| metav1.ListOptions{ | ||
| FieldSelector: fmt.Sprintf( | ||
| "involvedObject.name=%s", | ||
| podName, | ||
| ), | ||
| }, | ||
| ) | ||
|
|
||
| gomega.Expect(err).ToNot(gomega.HaveOccurred()) | ||
| lastEvents = events.Items | ||
| for _, e := range events.Items { | ||
|
|
||
| ginkgo.By(fmt.Sprintf("Event [%s]: %s", e.Reason, e.Message)) | ||
|
|
||
| if strings.Contains(e.Message, "PermissionDenied") && (strings.Contains(e.Message, "storage.objects.list") || strings.Contains(e.Message, "storageLayout") || strings.Contains(e.Message, "failed to get GCS bucket")) { | ||
| foundAuthFailure = true | ||
| foundWrongReason = false | ||
| ginkgo.By("Confirmed PermissionDenied auth failure") | ||
| break | ||
| } | ||
|
|
||
| if !foundAuthFailure && (strings.Contains(e.Message, "transport endpoint is not connected") || strings.Contains(e.Message, "failed to generate container") || strings.Contains(e.Message, "failed to stat")) { | ||
| foundWrongReason = true | ||
| } | ||
| } | ||
|
|
||
| if foundAuthFailure { | ||
| break | ||
| } | ||
| time.Sleep(5 * time.Second) | ||
| } |
| deadline := time.Now().Add(60 * time.Second) | ||
| for time.Now().Before(deadline) { | ||
| pod, err := f.ClientSet.CoreV1().Pods(f.Namespace.Name).Get(ctx, podName, metav1.GetOptions{}) | ||
| gomega.Expect(err).ToNot(gomega.HaveOccurred()) | ||
| if pod.Status.Phase == corev1.PodRunning { | ||
| podReachedRunning = true | ||
| ginkgo.By("ERROR: Pod unexpectedly reached Running") | ||
| break | ||
| } | ||
| if pod.Status.Phase == corev1.PodFailed { | ||
| ginkgo.By("Pod reached Failed phase as expected") | ||
| break | ||
| } | ||
| time.Sleep(3 * time.Second) | ||
| } |
There was a problem hiding this comment.
Use one of the polling functions rather than a sleep loop -- see examples elsewhere in this test suite.
|
Manual testing for this testcase SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS Ran 1 of 478 Specs in 71.026 seconds Ginkgo ran 1 suite in 1m17.051477382s On GKE:- SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS Ran 1 of 479 Specs in 71.576 seconds Ginkgo ran 1 suite in 1m19.448944168s |
|
/ok-to-test |
|
solved the conflicts and reran the tests E2E Test Suite [Driver: gcsfuse.csi.storage.gke.io] [Testpattern: CSI Ephemeral-volume (default fs)] workload-identity-federation should fail authentication when KSA is not bound to IAM service account STEP: Creating a kubernetes client @ 05/25/26 04:43:19.946 SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS Ran 1 of 481 Specs in 70.842 seconds Ginkgo ran 1 suite in 2m15.825898823s GKE:- SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS Ran 1 of 482 Specs in 75.581 seconds Ginkgo ran 1 suite in 3m29.564968948s |
b9c510d to
aec634d
Compare
|
Tests look good. You need to get the CLA worked out though. |
aec634d to
c58f397
Compare
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds a WIF E2E test that is - should fail authentication when KSA is not bound to IAM service account. Supports both OSS (external WIF via OIDC pool/provider) and GKE (native Workload Identity) clusters via IS_OSS.
Which issue(s) this PR fixes:
N/A