Add e2e tests for WIF IAM authorization failures on OSS and GKE clusters#1333
Add e2e tests for WIF IAM authorization failures on OSS and GKE clusters#1333su-sudhir wants to merge 8 commits into
Conversation
|
@su-sudhir: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: su-sudhir The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @su-sudhir. Thanks for your PR. I'm waiting for a GoogleCloudPlatform member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Code Review
This pull request introduces a new E2E test suite for Workload Identity Federation (WIF) and refactors IAM policy handling to support both GKE and OSS environments. The review identifies critical compilation issues due to missing helper functions and constants. It also recommends refactoring duplicated logic for project ID extraction and service account naming, using shared constants for WIF pool IDs, and replacing fixed sleeps with polling for IAM propagation. Additionally, the reviewer suggests using the GCS Go client library instead of the gcloud CLI for bucket operations to improve portability and error handling.
| setupOSSWIFPrincipal := func(ksaName, poolID, providerID, configMapName string) string { | ||
| projectID := os.Getenv(utils.ProjectEnvVar) | ||
| gomega.Expect(projectID).NotTo(gomega.BeEmpty(), fmt.Sprintf("%s environment variable must be set", utils.ProjectEnvVar)) | ||
|
|
||
| ginkgo.By("Getting GCP project number") | ||
| projectNumber := getProjectNumber(projectID) | ||
| gomega.Expect(projectNumber).NotTo(gomega.BeEmpty(), "failed to get project number") | ||
|
|
||
| ginkgo.By(fmt.Sprintf("Creating workload identity pool: %s", poolID)) | ||
| createWorkloadIdentityPool(projectID, poolID) | ||
|
|
||
| ginkgo.By("Discovering cluster OIDC issuer from cluster service account token") | ||
| clusterIssuer := getOSSClusterOIDCIssuer(ctx, f) | ||
| gomega.Expect(clusterIssuer).NotTo(gomega.BeEmpty(), "failed to discover cluster OIDC issuer") | ||
|
|
||
| ginkgo.By(fmt.Sprintf("Creating workload identity provider: %s", providerID)) | ||
| createWorkloadIdentityProvider(projectID, poolID, providerID, clusterIssuer) | ||
|
|
||
| ginkgo.By("Generating credential configuration") | ||
| credentialConfig := generateCredentialConfig(projectNumber, poolID, providerID) | ||
|
|
||
| ginkgo.By(fmt.Sprintf("Creating Kubernetes service account: %s", ksaName)) | ||
| createServiceAccount(ctx, f, ksaName) | ||
| ginkgo.DeferCleanup(func() { deleteServiceAccount(ctx, f, ksaName) }) | ||
|
|
||
| ginkgo.By(fmt.Sprintf("Creating credential ConfigMap: %s", configMapName)) | ||
| createCredentialConfigMap(ctx, f, configMapName, credentialConfig) | ||
| ginkgo.DeferCleanup(func() { deleteConfigMap(ctx, f, configMapName) }) | ||
|
|
||
| return fmt.Sprintf( | ||
| "principal://iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s/subject/system:serviceaccount:%s:%s", | ||
| projectNumber, poolID, f.Namespace.Name, ksaName, | ||
| ) | ||
| } |
There was a problem hiding this comment.
The setupOSSWIFPrincipal function relies on several helper functions that are not defined in this file or imported from other packages, which will cause compilation errors. Specifically, the following functions are missing: getProjectNumber, createWorkloadIdentityPool, createWorkloadIdentityProvider, generateCredentialConfig, createServiceAccount, deleteServiceAccount, createCredentialConfigMap, and deleteConfigMap. Additionally, grantBucketAccess and revokeBucketAccess (used in the It blocks) are also missing from the PR.
There was a problem hiding this comment.
All the listed functions (getProjectNumber, createWorkloadIdentityPool, createWorkloadIdentityProvider, generateCredentialConfig, createServiceAccount, deleteServiceAccount, createCredentialConfigMap, deleteConfigMap, grantBucketAccess, revokeBucketAccess) are defined in gcsfuse_oidc_auth.go, which is in the same testsuites package. Since Go resolves symbols across all files in a package, these are fully accessible without any additional imports.
| if out, err := exec.Command("gcloud", "storage", "buckets", "create", "gs://"+altBucket, "--project="+projectID).CombinedOutput(); err != nil { | ||
| klog.Warningf("Failed to create alternate bucket %s: %v, output: %s", altBucket, err, string(out)) | ||
| } |
There was a problem hiding this comment.
Using exec.Command("gcloud", ...) to manage GCS buckets is discouraged in E2E tests as it introduces a dependency on the gcloud CLI and environment configuration. It is better to use the GCS Go client library (via the storageService abstraction) for better portability and consistency. Furthermore, if bucket creation fails, the test should fail immediately using framework.Failf instead of just logging a warning, as subsequent steps will likely fail with confusing errors.
| if out, err := exec.Command("gcloud", "storage", "buckets", "create", "gs://"+altBucket, "--project="+projectID).CombinedOutput(); err != nil { | |
| klog.Warningf("Failed to create alternate bucket %s: %v, output: %s", altBucket, err, string(out)) | |
| } | |
| if out, err := exec.Command("gcloud", "storage", "buckets", "create", "gs://"+altBucket, "--project="+projectID).CombinedOutput(); err != nil { | |
| framework.Failf("Failed to create alternate bucket %s: %v, output: %s", altBucket, err, string(out)) | |
| } |
There was a problem hiding this comment.
Already addressed in a prior commit — the alternate bucket creation now uses the GCS Go client (gostorage.NewClient) and framework.Failf on failure, exactly as suggested.
| lines := strings.Split(strings.TrimSpace(rawProjectID), "\n") | ||
| projectID := lines[len(lines)-1] | ||
|
|
||
| gomega.Expect(strings.Contains(projectID, "Your active configuration")).To(gomega.BeFalse(), | ||
| fmt.Sprintf("invalid projectID detected: %q", projectID)) |
There was a problem hiding this comment.
The logic to extract the project ID while handling potential Cloud Shell warnings is duplicated here and on lines 306-307. This should be refactored into a shared helper function to improve maintainability. Per repository guidelines, if the logic relies heavily on methods within a specific package, extract it to a file within that same package to minimize new dependencies.
References
- Avoid duplicating code by extracting common logic into helper functions to improve maintainability and reduce redundancy.
- When refactoring common test setup logic, place the generic function in a location that minimizes new dependencies. If the logic relies heavily on methods within a specific package (e.g., testsuites), extract it to a file within that same package.
| } | ||
| if os.Getenv(utils.IsOSSEnvVar) == "true" { | ||
| projectNumber := os.Getenv(utils.ProjectNumberEnvVar) | ||
| return fmt.Sprintf("principal://iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/gcs-fuse-oidc-pool/subject/system:serviceaccount:%s:%s", |
There was a problem hiding this comment.
The WIF pool ID gcs-fuse-oidc-pool is hardcoded here and also defined as a constant in workload_identity_federation.go. To ensure consistency and ease of maintenance, this value should be defined as a single shared constant in a utility package.
References
- Avoid duplicating code by extracting common logic into helper functions to improve maintainability and reduce redundancy.
There was a problem hiding this comment.
Fixed — removed wifWorkloadIdentityPoolID and wifWorkloadIdentityProviderID from workload_identity_federation.go and replaced all usages with oidcWorkloadIdentityPoolID and oidcWorkloadIdentityProviderID already defined in gcsfuse_oidc_auth.go (same package).
|
/ok-to-test |
|
/retest |
|
Run tests manually On OSS On GKE |
|
/assign @mattcary |
… helper, and deduplicate pool ID constant
…itizeProjectID, return credentialConfig from setupOSSWIFPrincipal
…onstants from same package
2c7ca30 to
9887e3d
Compare
|
Resolved the merge conflicts and tested it Logs |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds three negative e2e test cases to the
workload-identity-federationtest suite to validate IAM-level authorization failures for Workload
Identity Federation on both OSS (self-managed) and GKE clusters:
No storage role: A WIF principal with no IAM role on the bucket.
When the sidecar bucket access check is enabled, the mount is denied
at setup time with a
PermissionDeniederror. Without the check, thepod runs but gcsfuse logs a permission denied error on first GCS access.
Read-only role, write fails: A WIF principal granted
roles/storage.objectVieweron the bucket. Read operations (e.g.ls)succeed; write operations are rejected by GCS with a permission error.
Role on a different bucket: A WIF principal granted
roles/storage.objectUseron an alternate bucket, not the one beingmounted. GCS denies access to the mounted bucket.
All three tests run on both OSS clusters (via external WIF pool/provider
and credential ConfigMap) and GKE clusters (via native Workload Identity
with GSA impersonation), selected at runtime via the
IS_OSSenvironmentvariable.
Which issue(s) this PR fixes:
N/A
Does this PR introduce a user-facing change?: