🌟 Implement WIF for Container Registry Scanning#1471
Conversation
- Add tests for GKE, EKS, and AKS Workload Identity configurations in `resources_test.go`. - Update AKS and EKS manifests to include Workload Identity configurations. - Modify Terraform scripts to grant necessary permissions for Workload Identity. - Create new manifests for private workload deployments and associated scripts. - Implement verification scripts for WIF container registry scanning. - Enhance existing scripts to support private image deployments and verifications.
…iner image scanning
@check-spelling-bot Report🔴 Please reviewSee the 📂 files view, the 📜action log, or 📝 job summary for details.Unrecognized words (3)acr These words are not needed and should be removedACRTo accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands... in a clone of the git@github.com:mondoohq/mondoo-operator.git repository curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/c635c2f3f714eec2fcf27b643a1919b9a811ef2e/apply.pl' |
perl - 'https://github.com/mondoohq/mondoo-operator/actions/runs/24466961745/attempts/1' &&
git commit -m 'Update check-spelling metadata'Available 📚 dictionaries could cover words (expected and unrecognized) not in the 📘 dictionaryThis includes both expected items (35) from .github/actions/spelling/expect.txt and unrecognized words (3)
Consider adding them (in with:
extra_dictionaries: |
cspell:python/src/python/python-lib.txt
cspell:elixir/dict/elixir.txt
cspell:node/dict/node.txt
cspell:php/dict/php.txt
cspell:java/src/java.txtTo stop checking additional dictionaries, add (in check_extra_dictionaries: ""Warnings
|
| Count | |
|---|---|
| 1 |
See
If the flagged items are false positives
If items relate to a ...
-
binary file (or some other file you wouldn't want to check at all).
Please add a file path to the
excludes.txtfile matching the containing file.File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.
^refers to the file's path from the root of the repository, so^README\.md$would exclude README.md (on whichever branch you're using). -
well-formed pattern.
If you can write a pattern that would match it,
try adding it to thepatterns.txtfile.Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.
Note that patterns can't match multiline strings.
There was a problem hiding this comment.
Container registry WIF scanning forces users to provide unused cluster-specific fields and may fail at runtime due to missing field validation.
Additional findings (file/line not in diff):
- 🟡
api/v1alpha2/mondooauditconfig_types.go:298— ReusingWorkloadIdentityConfig(and its sub-structs likeAKSWorkloadIdentity) for container registry WIF forces users to provide CRD-required fields that are irrelevant for registry auth (clusterName,resourceGroup,subscriptionIdfor AKS;clusterName,clusterLocationfor GKE;clusterNamefor EKS). The CRD validation will reject a CR that omits those fields even though they're unused.
Consider either:
- Making the cluster-specific fields optional (changing
+kubebuilder:validation:Requiredto+optionaland addingomitempty) — this is the minimal change but weakens validation for the k8s-scan WIF path, or - Creating separate, slimmer config types for container registry WIF that only contain the fields actually needed (e.g.,
ContainerRegistryAKSConfigwith onlyclientId,tenantId,loginServer).
Option 2 is cleaner long-term. At minimum, the docs/user-manual examples should show the required dummy values so users aren't surprised by validation errors.
| func validateContainerRegistryWIF(wif *v1alpha2.WorkloadIdentityConfig) error { | ||
| if wif == nil { | ||
| return nil | ||
| } | ||
|
|
||
| switch wif.Provider { | ||
| case v1alpha2.CloudProviderGKE: | ||
| if wif.GKE == nil { | ||
| return fmt.Errorf("containers.workloadIdentity: gke config required when provider is gke") | ||
| } | ||
| case v1alpha2.CloudProviderEKS: | ||
| if wif.EKS == nil { | ||
| return fmt.Errorf("containers.workloadIdentity: eks config required when provider is eks") | ||
| } | ||
| case v1alpha2.CloudProviderAKS: | ||
| if wif.AKS == nil { | ||
| return fmt.Errorf("containers.workloadIdentity: aks config required when provider is aks") | ||
| } | ||
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
🟡 warning — validateContainerRegistryWIF only checks that the provider-specific struct is non-nil, but does not validate the fields actually needed for registry auth. For example, an AKS config with an empty LoginServer, or an EKS config with an empty Region, will pass validation but the init container will fail at runtime with a confusing error.
Add field-level checks for registry-specific fields:
case v1alpha2.CloudProviderAKS:
if wif.AKS == nil {
return fmt.Errorf("...")
}
if wif.AKS.LoginServer == "" {
return fmt.Errorf("containers.workloadIdentity: aks.loginServer is required for container registry WIF")
}
case v1alpha2.CloudProviderEKS:
if wif.EKS == nil {
return fmt.Errorf("...")
}
if wif.EKS.Region == "" {
return fmt.Errorf("containers.workloadIdentity: eks.region is required for container registry WIF")
}
case v1alpha2.CloudProviderGKE:
if wif.GKE == nil {
return fmt.Errorf("...")
}
if wif.GKE.GoogleServiceAccount == "" {
return fmt.Errorf("containers.workloadIdentity: gke.googleServiceAccount is required for container registry WIF")
}|
|
||
| // Add init container for registry credential generation | ||
| podSpec.InitContainers = append(podSpec.InitContainers, registryWIFInitContainer(wif)) | ||
|
|
There was a problem hiding this comment.
🟡 warning — For AKS, the azure.workload.identity/use label is added to ls (the label map), but ls was defined earlier and is used for both the CronJob metadata labels and the pod template labels. Mutating it here also changes the CronJob's own labels, not just the pod template labels. This is likely harmless but unintentional — the AKS WIF webhook only needs this label on the pod, not the CronJob object itself.
Consider setting it directly on the pod template labels instead:
podSpec.Labels["azure.workload.identity/use"] = "true"Or make a copy of ls for the pod template before mutating.
| workloadIdentity: | ||
| provider: gke | ||
| gke: | ||
| projectId: my-gcp-project | ||
| googleServiceAccount: scanner@my-gcp-project.iam.gserviceaccount.com | ||
| ``` | ||
|
|
There was a problem hiding this comment.
🔵 suggestion — The GKE example omits the clusterName and clusterLocation fields that are marked as required in the CRD schema. Users following this example will get a validation error when applying the CR. Either add placeholder values with a comment explaining they're unused, or fix the CRD to make those fields optional (see other finding).
| containers: | ||
| enable: true | ||
| workloadIdentity: | ||
| provider: eks | ||
| eks: | ||
| region: us-west-2 | ||
| roleArn: arn:aws:iam::123456789012:role/MondooRegistryReader |
There was a problem hiding this comment.
🔵 suggestion — Same issue as the GKE example: the EKS example omits the required clusterName field. Users will get a CRD validation error.
| ```yaml | ||
| spec: | ||
| containers: | ||
| enable: true | ||
| workloadIdentity: | ||
| provider: aks | ||
| aks: | ||
| clientId: abcdef12-3456-7890-abcd-ef1234567890 |
There was a problem hiding this comment.
🔵 suggestion — Same issue: the AKS example omits required fields clusterName, resourceGroup, and subscriptionId. Users will get a CRD validation error.
Summary
Adds Workload Identity Federation (WIF) support for container image scanning. The operator can now authenticate to private container registries (GCR, Artifact Registry, ECR, ACR) using cloud-native WIF credentials instead of static pull secrets.
How it works
When
containers.workloadIdentityis configured, the operator:Cloud-specific init containers
gcloud auth print-access-tokento generate auth for all Artifact Registry regional endpoints and GCR hostsaws ecr get-login-passwordto authenticate against ECRaz acr login --expose-tokento authenticate against ACRRBAC
The WIF ServiceAccount needs the same cluster read permissions as the default scanner SA. The Helm chart accepts
k8SResourcesScanning.extraClusterRoleBindingSubjectsto bind additional SAs to the scanning ClusterRole.E2E
Config example
Test plan
run-wif-external-cluster.sh gkerun-wif-external-cluster.sh eksrun-wif-external-cluster.sh aksgo test ./controllers/container_image/...