enhancements/authentication: Pod Security Admission Enforcement #1747


Open — wants to merge 13 commits into master from ibihim/psa-enforcement

Conversation


@ibihim ibihim commented Jan 30, 2025

What

A proposal for how to safely start enforcing Pod Security Admission.

Why

It isn't as trivial as we hoped for in 4.11 :)

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 30, 2025
@openshift-ci openshift-ci bot requested a review from jan--f January 30, 2025 17:29

openshift-ci bot commented Jan 30, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED


@openshift-ci openshift-ci bot requested a review from jwmatthews January 30, 2025 17:29
@ibihim ibihim force-pushed the ibihim/psa-enforcement branch from 12f022d to f95e7a9 Compare February 3, 2025 15:09
@ibihim ibihim changed the title [WIP] enhancements/authentication: Pod Security Admission Enforcement Config enhancements/authentication: Pod Security Admission Enforcement Config Feb 3, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 3, 2025
@ibihim ibihim changed the title enhancements/authentication: Pod Security Admission Enforcement Config enhancements/authentication: Pod Security Admission Enforcement Feb 3, 2025
Comment on lines 213 to 218
// TargetModeConditional indicates that the user is willing to let the cluster
// automatically enforce a stricter enforcement once there are no violating Namespaces.
// If violations exist, the cluster stays in its previous state until those are resolved.
// This allows a gradual move towards label and global config enforcement without
// immediately breaking workloads that are not yet compliant.
TargetModeConditional PSATargetMode = "Conditional"
Contributor:

Is there a particular reason why we keep the cluster in the previous state? Is there another middle ground we can use?

The automatic enforcement thing is something that gives me pause and could be prone to some confusion. What if we had something like:

  • Privileged - no pod security restrictions
  • RestrictedExceptViolating - strictest possible enforcement, automatically exempting namespaces that would violate
  • Restricted - strictest possible configuration. Does not rollout if there are violating namespaces.

Also, do we already have an API mapping to the PodSecurity Admission Controller configuration? If not, would having an API surface that maps to that help us come up with something that isn't an all-or-nothing type transition?
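As a reference point for this discussion: the upstream PodSecurity admission plugin is configured through an AdmissionConfiguration file. The sketch below uses illustrative values and a hypothetical exempted namespace; it is not what OpenShift ships.

```yaml
# Sketch of the upstream PodSecurity admission plugin configuration.
# Defaults and the exempted namespace are illustrative values only.
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"       # global default applied to namespaces without an enforce label
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces:
      - "my-violating-namespace"  # hypothetical namespace exempted from enforcement
```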

Contributor Author (@ibihim, Feb 4, 2025):

It is an interesting idea, but how would such a thing work?

I mean, it would be pointless if violating namespaces were added to such a list dynamically, wouldn't it?
So would it be a list that is set initially and can only shrink? What if there are, like, 100 namespaces out of 102 in there? Does it signal to the customer that they are "enforcing restricted" (a false sense of security)?

Contributor Author (@ibihim, Feb 4, 2025):

Giving the customer the capability to exclude Namespaces upfront in a central place (through an API that maps to the PodSecurity configuration) would be an interesting concept, but it would mean that the kube-apiserver needs to roll out a new revision. It would be easier to label the particular namespace by hand as privileged (see the example below).
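For the by-hand route, a single command suffices; the namespace name here is just an example:

```sh
# Opt a specific namespace out of restricted enforcement by hand
# ("legacy-app" is an illustrative namespace name).
oc label namespace legacy-app pod-security.kubernetes.io/enforce=privileged
```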

Contributor:

The mapping to the PodSecurity Admission configuration is linked in the enhancement:

https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/cmd/render/render.go#L350-L358.

Ah, it looks like we don't allow for any configuration from the customer side to control the granularity of this beyond labeling the namespaces. I do wonder if it would make sense to provide the customer more nuanced control through an API that maps to this with some of our own defaulting logic? Why don't we do this today?

It is an interesting idea, but how would such a thing work?

I haven't gone into detailed thinking of how, but my general line of thinking is: instead of all-or-nothing, something like RestrictedExceptViolating could set the default enforcement to restricted, with some kind of configuration to not enforce that mode on Namespaces that would violate it. I have to spend some more time thinking on what approach would be best, but I think that could be:

  • configures the exemptions.namespaces in the PodSecurity admission plugin configuration with violating namespaces
  • controller adds label to the violating namespaces to specify that the namespace should be evaluated with the privileged mode.

I mean it would be pointless, if violating namespaces would be added dynamically to such a list, wouldn't it?
So would it be a list that is set initially and could only be reduced? What if there are like 100 namespaces out of 102 in there?

From the motivation section of this enhancement, it sounded like we are pretty confident that the number of violating namespaces is generally pretty low. The scenario you point out is definitely something that could happen, but how likely is it?

Does it signal to the customer, that they are "enforcing restricted" (false sense of security)?

Maybe, but I would expect that we should still have some mechanism in place to bring violating namespaces to the attention of cluster administrators. Is it better to be "secure by default, except for these cases we have made you aware of" or "not secure at all"?

Contributor Author:

I think we discussed this in one of the recent Arch Calls, right?

It is nearly impossible for automated auditing tools to recognize, which would make us incompatible with some auditing tools. And that is an important thing for us to avoid.

The OCP Arch Call - 4th of February 2025

Contributor:

I think that was specifically related to the exemptions piece. I believe a labeling/annotation approach would be auditable because it would be making an update call to the kube-apiserver to update the namespace labels/annotations.

Contributor Author:

We could modify the enhancement such that the PSA label syncer enforces labels where it works and skips Namespaces where it does not.

Currently, I propose that if not all of those Namespaces would work, we don't proceed to label any of them and wait for those violations to be resolved.

Contributor @everettraven left a comment:

Did another pass focusing on the new API. Left a couple more ideas for how we may be able to get away from an "automated" transition thing.

The automated transitioning piece is still giving me pause, and if we can come up with a solution that achieves the same goal with a fully declarative approach, it might result in a better user and maintainer experience.

In addition to adjusting how the `OpenShiftPodSecurityAdmission` `FeatureGate` behaves, administrators need visibility and control throughout this transition.
A new API is necessary to provide this flexibility.

#### New API
Contributor:

@sjenning Asked in slack which group this API would be added to. Would be good to include that information here.

Contributor Author:

According to the very first pitch in the Arch Call and @JoelSpeed's comment, I would put it into config/v1alpha1. I think this is the right package for configs that should be modifiable by users.

// to "Restricted" on the kube-apiserver.
// This represents full enforcement, where both Namespace labels and the global config
// enforce Pod Security Admission restrictions.
EnforcementModeFull PSAEnforcementMode = "FullEnforcement"
Contributor:

Is this just Restricted?

Contributor Author:

It is the "LabelEnforcement" mode, which tells the PSA label syncer to set the enforce label AND on top of that it modifies the PodSecurity configuration of the kube apiserver to restricted by default.

Should I revisit the comment above to make it more explicit?

Comment on lines 209 to 210
// in a fully privileged (pre-rollout) state, ignoring any label enforcement
// or global config changes.
Contributor:

Would this mode actually ignore label enforcement?

If I am following https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-admission-controller/#configure-the-admission-controller correctly, it seems like the defaults only apply when there are no labels. Is Privileged effectively the same as disabling the PodSecurity admission plugin?
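To illustrate the upstream precedence rules (standard PSA behaviour, not something this enhancement changes): the global defaults from the admission configuration only apply to labels that are absent on a namespace, so an explicit label always wins.

```yaml
# Example namespace (name is illustrative): the explicit enforce/warn labels override
# the cluster-wide defaults; any label that is absent falls back to the global default.
apiVersion: v1
kind: Namespace
metadata:
  name: example-app
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: baseline
```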

Contributor Author:

If TargetModePrivileged is set, the intent (as I tried to articulate it) is that we effectively disable any restrictions, which means the cluster should behave as it did before.

In detail, we could argue about how exactly to do it. I would just revert everything, like removing the enforce label set by the PSA label syncer and setting the global config back to privileged.
Logically (based on the name) it would be more correct to set the enforce label to privileged. But as mentioned above, I would like to keep the permutations of possible states as minimal as possible.

Contributor:

I'm trying to understand the expected behavior from a user perspective. If I, as a user, were to specify Privileged as the target mode, what happens? If in this state I add a PSa label to a namespace to enforce a different mode like restricted or baseline for that namespace, will it be enforced?

I think there is a distinct difference between "I don't want Pod Security at all" and "I want the most privileged Pod Security default, but still respect anything that would override the default" and I'm trying to make sure I understand which one of those this value is supposed to represent.

Contributor Author (@ibihim, Feb 13, 2025):

This value currently represents (currently as in: in this enhancement) the intent of "I don't want Pod Security at all".
From a user perspective it might be more logical to have it actually mean "I want the most privileged Pod Security default"...

... until something goes wrong and they want "the previous state, before it started to fall apart".

I am fine both ways.

Contributor:

For the "I don't want Pod Security at all", maybe something like None is more reflective of that and in that case should disable the Pod Security admission plugin all together?

TargetModeConditional PSATargetMode = "Conditional"

// TargetModeRestricted indicates that the user wants the strictest possible
// enforcement, causing the cluster to ignore any existing violations and
Contributor:

Instead of ignoring existing violations, should the cluster operator go degraded with any existing violations being the reason?

Maybe a knob for users to control this would be useful here to support use cases of:

  • Forcefully enable restricted PSA configuration
  • Enable restricted PSA configuration only if no violations (I realize this is what the Conditional enum type is for, but maybe that shouldn't be a separate option?)

Contributor:

Note: I elaborated a bit more on my thinking here in #1747 (comment)

// PSATargetMode reflects the user’s chosen (“target”) enforcement level.
type PSATargetMode string

const (
Contributor:

Should there be an option that is reflective of the LabelEnforcement enforcement mode that would be specified in the status?

Contributor Author:

We could, but I mentioned it in the Non-Goals.

The reason is that I want to keep the permutations of possible states as minimal as possible.

Contributor:

I don't know that this question relates to the non goals you outlined. What target mode(s) might result in the LabelEnforcement enforcement mode being populated in the status?

Contributor Author:

It is part of this non-goal.

I am not eager to maintain too many states. Either you are privileged, restricted, or progressing forward through "LabelOnly" to "FullEnforcement".

Comment on lines 242 to 253
type PSAEnforcementConfigSpec struct {
// targetMode is the user-selected Pod Security Admission enforcement level.
// Valid values are:
// - "Privileged": ensures the cluster runs with no restrictions
// - "Conditional": defers the decision to cluster-based evaluation
// - "Restricted": enforces the strictest Pod Security admission
//
// If this field is not set, it defaults to "Conditional".
//
// +kubebuilder:default=Conditional
TargetMode PSATargetMode `json:"targetMode"`
}
Contributor (@everettraven, Feb 4, 2025):

Continuing from my comment on the Restricted target mode, maybe instead of only being able to specify a single enum value this should be a discriminated union that enables more flexibility in configuration?

To further show my line of thinking:

To represent the Privileged option a user would still do:

targetMode: Privileged

To represent a forceful enablement of Restricted:

targetMode: Restricted
restricted:
  violationPolicy: Ignore
  # room for future configuration options

To represent an enablement of Restricted only if no violations (would go to degraded on failure to enable):

targetMode: Restricted
restricted:
  violationPolicy: Block

To facilitate the transition case that the proposed Conditional targetMode seems to be targeting, maybe the default can be something like:

targetMode: Restricted
restricted:
  violationPolicy: LabelMinimallyRestrictive

Where any violating namespaces encountered will be labeled with the appropriately minimally restrictive PSS mode (which might be privileged), prior to attempting to set the default enforced by the PodSecurity admission controller to Restricted.
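A rough Go sketch of that discriminated-union shape, purely to make the idea concrete; all type and field names are hypothetical and not part of the enhancement:

```go
// Hypothetical sketch of a discriminated union for the target mode; names are illustrative.
type PSATargetMode string

const (
	TargetModePrivileged PSATargetMode = "Privileged"
	TargetModeRestricted PSATargetMode = "Restricted"
)

// ViolationPolicy controls how existing violations are handled when targetMode is Restricted.
type ViolationPolicy string

const (
	// Ignore: enforce restricted regardless of existing violations.
	ViolationPolicyIgnore ViolationPolicy = "Ignore"
	// Block: do not roll out enforcement; go Degraded while violations exist.
	ViolationPolicyBlock ViolationPolicy = "Block"
	// LabelMinimallyRestrictive: label violating namespaces with the least restrictive
	// sufficient PSS before switching the global default to restricted.
	ViolationPolicyLabelMinimallyRestrictive ViolationPolicy = "LabelMinimallyRestrictive"
)

type PSAEnforcementConfigSpec struct {
	// targetMode is the union discriminator.
	// +unionDiscriminator
	TargetMode PSATargetMode `json:"targetMode"`

	// restricted only applies when targetMode is "Restricted".
	// +optional
	Restricted *RestrictedConfig `json:"restricted,omitempty"`
}

type RestrictedConfig struct {
	ViolationPolicy ViolationPolicy `json:"violationPolicy"`
}
```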

Contributor Author (@ibihim, Feb 13, 2025):

It is slightly more expressive. Maybe something like:

targetMode:
  level: restricted
  violationPolicy: Block

or

targetMode: restricted
violationPolicy: Block

The default could be Block then, which would also work for targetMode: privileged 😄

With labelMinimallyRestrictive... the label syncer isn't able to determine this yet, and it changes its viewpoint. Currently it determines the future outlook based on the SA-based RBAC. It would then need another level of concern: testing individual workloads against up to two levels (do you satisfy restricted, do you satisfy privileged?).

@ibihim ibihim force-pushed the ibihim/psa-enforcement branch from 5d1a078 to 930c8a4 Compare February 17, 2025 10:44
Contributor @JoelSpeed left a comment:

A lot of my questions probably result from a lack of knowledge about PSA and how it actually works / what configurations exist today. Where can I learn more about the various options for its configuration so that I can better review this EP?

This enhancement introduces a **new cluster-scoped API** and changes to the relevant controllers to rollout [Pod Security Admission (PSA)](https://kubernetes.io/docs/concepts/security/pod-security-admission/) enforcement [in OpenShift](https://www.redhat.com/en/blog/pod-security-admission-in-openshift-4.11).
Enforcement means that the `PodSecurityAdmissionLabelSynchronizationController` sets the `pod-security.kubernetes.io/enforce` label on Namespaces, and the PodSecurityAdmission plugin enforces the `Restricted` [Pod Security Standard (PSS)](https://kubernetes.io/docs/concepts/security/pod-security-standards/) globally on Namespaces without any label.

The new API allows users to either enforce the `Restricted` PSS or maintain `Privileged` PSS for several releases. Eventually, all clusters will be required to use `Restricted` PSS.
Contributor:

Eventually, all clusters will be required to use Restricted PSS.

How and why? There will be no get out of jail card at all eventually? Why is this not a cluster admins choice?

Contributor Author (@ibihim, Feb 20, 2025):

Secure by default? 😄

We would like to nudge our users to move to the most secure setup at some point. @deads2k suggested that there could be a period where we tolerate being Privileged and start blocking upgrades after x releases.

There is no reason to run with a Privileged global setup. A user can set namespaces individually to match the required PSS, if not properly labeled by the PodSecurityAdmissionLabelSynchronizationController (which should only happen in the case that the SCCs are user-based).

Contributor Author:

I will add a section into Motivation.

Contributor:

While I understand the motivation of "secure by default", I don't think we should totally prevent a cluster admin from choosing the security profile they desire for their cluster.

I think it is totally reasonable for us to have an opinionated and strongly encouraged default.

Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads.

To ensure a safe transition, this proposal suggests that if a potential workload failure is detected in release `n`, the operator moves into `Upgradeable=false`.
The user would need to either resolve the potential failures or set the enforcing mode to `Privileged` for now in order to be able to upgrade.
Contributor:

Or perhaps they label the violating namespaces as privileged and allow the rest of the cluster to be restricted?

Contributor Author:

Yes, I will add this option.

If a user encounters `status.violatingNamespaces` it is expected to:

- resolve the violating Namespaces in order to be able to `Upgrade` or
- set the `spec.enforcementMode=Privileged` and solve the violating Namespaces later.
Contributor:

When a user defers this, how do we know that they will fix it later? What guidance can we provide them?

Contributor Author:

Yes, we need to provide guidance to the customer; the question is what the most effective way would be. Blog entries? This enhancement?

Contributor:

I would seek the guidance of the appropriate technical writer. I could imagine a combination of release notes and product docs/KCS

#### PodSecurityAdmissionLabelSynchronizationController

The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`.
If `spec.enforcementMode` is `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label.
Contributor:

This label goes on each namespace? Or?

Contributor Author:

On each namespace, except if it is:

  • a run-level 0 namespace (default, kube-system, ...),
  • an openshift-prefixed namespace, or
  • a namespace that has disabled the PSA label syncer (see the example below).
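The opt-out in the last bullet is the documented label; the namespace name below is illustrative:

```sh
# Tell the PSA label syncer to leave this namespace alone ("my-namespace" is an example).
oc label namespace my-namespace security.openshift.io/scc.podSecurityLabelSync=false
```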

Contributor Author:

I will list those Namespaces.

Contributor:

Clarifying this behaviour briefly in the EP would be useful for readers IMO


### Fresh Installs

Needs to be evaluated. The System Administrator needs to pre-configure the new API’s `spec.enforcementMode`, choosing whether the cluster will be `Privileged` or `Restricted` during a fresh install.
Contributor:

Why not have the installer generate the correct (Restricted) configuration to be installed? Then most people won't care, but if someone really wanted to, they could pause at the manifests stage and edit the resource to Privileged instead

Contributor Author (@ibihim, Feb 20, 2025):

I think, if the default will be Restricted, it would default to enforcement, right? The question that is left open for me is: if a customer wants to default to Privileged on fresh installs, is there an option to do so? Is there some documentation for that? Or is it common to let the SysAdmin change that config post-installation when setting up a cluster?

Contributor:

They have this ability already, generally. Let's say we did nothing here: the cluster installs, the resource gets generated, and the enforcement field is empty. They can now go set that to privileged and continue with their cluster until we force them backwards.

Or, we can enable them to change this at install time. Which comes in two forms.

We add an explicit option to the install config, and plumb the value through.

Or, we generate the manifests and document that a modified manifest at the manifests stage could be used to change the behaviour for fresh clusters.

The first of the install time options requires the second to be implemented anyway.

Setting it from the installer in this manner will then allow us to distinguish between clusters that were installed with PSA restricted vs. those that upgraded and had no opinion. I suspect there is some value in that 🤔
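For the manifests-stage route, the flow would be roughly the usual one; the manifest filename is hypothetical since the resource does not exist yet:

```sh
# Generate the install-time manifests, tweak the generated PSA resource, then install.
openshift-install create manifests --dir my-cluster
# edit my-cluster/manifests/<generated-psa-enforcement-config>.yaml   (hypothetical filename)
openshift-install create cluster --dir my-cluster
```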


While the root causes need to be identified in some cases, the result of identifying a violating Namespace is understood.

#### New SCC Annotation: `security.openshift.io/ValidatedSCCSubjectType`
Contributor Author:

TODO: fix casing to match change

- List which namespaces aren't managed by the PSA label syncer
- Explain why we want to force restricted PSA enforcement eventually.
- Add a guide on how to handle violations.
@ibihim

ibihim commented Feb 24, 2025

A lot of my questions probably result from a lack of knowledge about PSA and how it actually works / what configurations exist today. Where can I learn more about the various options for its configuration so that I can better review this EP?

@JoelSpeed, I tried to add as many links as possible, especially at the beginning, but it doesn't seem to be obvious, as you are not the first to ask 😄

Upstream resources:

Downstream resources:

  • Initial blog post from the authors of the PSA label syncer controller: blog post.
  • Integrating PSA into OpenShift: enhancement.
  • The enhancement for the PSA label syncer: enhancement.
  • OpenShift Docs for the user perspective: docs.
  • Some other Red Hat labeled blog post wrt PSA and SCCs: blog.

- Are prefixed with `openshift`,
- Have the label `security.openshift.io/scc.podSecurityLabelSync=false`.
- Have the `pod-security.kubernetes.io/enforce` label set manually.
- Are not a run-level zero Namespace:


Suggested change
- Are not a run-level zero Namespace:
- Are a run-level zero Namespace:

Member:

Actually the label syncer doesn't ignore run-level zero (or one) namespaces explicitly; it just happens that the ones listed here are run-level 0.

Having said that, admission plugins (and also PSa) are not enabled for runlevel 0/1 namespaces, so any labels will be ineffective.

Contributor Author:

I will correct the whole list to reflect this: https://github.com/openshift/enhancements/pull/1747/files#r1971301549


To ensure a safe transition, this proposal suggests that if a potential workload failure is detected in release `n`, the operator moves into `Upgradeable=false`.
The user would need to either resolve the potential failures, set a higher PSS label for the affected Namespace, or set the enforcing mode to `Privileged` for now in order to be able to upgrade.
`Privileged` will keep the cluster in the previous, non-enforcing state.
Contributor:

Might be worth elaborating on what the "non enforcing state" entails. Does this mean the default global enforcement is privileged but cluster admins can make certain namespaces more restrictive? Or does it mean that there is no pod security enforcement at all (i.e PodSecurity admission plugin disabled)?

Comment on lines 48 to 50
The temporary `Privileged` mode (opt-out from PSA enforcement) exists solely to facilitate a smooth transition. Once a vast majority of clusters have adapted their workloads to operate under `Restricted` PSS, maintaining the option to run in `Privileged` mode would undermine these security objectives.

OpenShift strives to offer the highest security standards. Enforcing PSS ensures that OpenShift is at least as secure as upstream Kubernetes and that OpenShift complies with upstream's security best practices.
Contributor:

Doesn't upstream Kubernetes ship with privileged global enforcement by default? Even though it is strongly recommended to set a more restrictive policy, ultimately it's up to the cluster administrator.

Are there any use cases that customers may have that could warrant an option to set the cluster to a less restrictive global state than Restricted? Maybe there is a middle ground that allows cluster administrators to do this, but they explicitly acknowledge that they are moving to an unsupported state?

Something else to think about purely from an API change perspective - removing an allowed value for a field is going to be a breaking change to users and will likely require an explicit admin action when we want to remove it as an allowed value.


By adding these annotations, the [`PodSecurityReadinessController`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/pkg/operator/podsecurityreadinesscontroller/podsecurityreadinesscontroller.go) can more accurately identify potentially failing Namespaces and understand their root causes:

- With the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation, it can evaluate Namespaces that lack the `pod-security.kubernetes.io/enforce` label but have user-overridden warn or audit labels.
Contributor:

IIUC, the label syncer will ignore updating namespaces with user-overridden labels. Will we keep that behavior moving forward? If a namespace has user-overridden warn and/or audit labels, will we still set the enforce label?

Contributor:

And how does it detect that a namespace is user-overridden vs. out of date from something it already synced?

@xingxingxia:

@everettraven @JoelSpeed, JFYI, https://issues.redhat.com/browse/AUTH-413 implemented the handling of user-overridden PSS labels via managedFields. If a user overrides one label, e.g. warn, then the remaining audit and enforce labels are still auto-synced:

$ oc create ns xxia-test
$ oc get ns xxia-test -o yaml | grep -E '(warn|audit|enforce):'
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
$ oc label --overwrite ns xxia-test pod-security.kubernetes.io/warn=baseline
$ oc adm policy add-scc-to-user privileged -z default -n xxia-test
$ oc get ns xxia-test -o yaml | grep -E '(warn|audit|enforce):'
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/warn: baseline

Contributor Author:

The logic looks something like this:

  • If there is at least one label that we potentially own, consider this Namespace.

The label syncer owns a label:

  • If it is not kube-system, kube-public, default or openshift.
  • If its name isn't prefixed with openshift-.
  • If security.openshift.io/scc.podSecurityLabelSync=false isn't set.
  • If at least one PSA label (including enforce) isn't set by the customer.
  • If a customer sets all PSA labels, but sets security.openshift.io/scc.podSecurityLabelSync=true.

How do we identify the owner of the field? As mentioned by @xingxingxia, we watch the managed fields:

oc get ns $NAMESPACE -oyaml --show-managed-fields

Ownership is defined here:

  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:pod-security.kubernetes.io/audit: {}
          f:pod-security.kubernetes.io/audit-version: {}
          f:pod-security.kubernetes.io/warn: {}
          f:pod-security.kubernetes.io/warn-version: {}
    manager: pod-security-admission-label-synchronization-controller
    operation: Apply
    time: "2025-02-24T13:50:40Z"

Contributor Author (@ibihim, Feb 26, 2025):

There is a section, where this is being discussed. I will update that.

Contributor:

Interesting! So, in theory, we could go down the route of labelling namespaces automatically as privileged and then later updating them to restricted, to avoid the race condition talked about elsewhere.

Though the conversation on the Arch Call led me to believe that maybe that wasn't necessary?

Contributor Author:

Yes, we could label namespaces automatically with privileged by the label syncer. The whole cluster is already set to privileged in the PodSecurity Configuration given to the kube-apiserver.

The Arch Call, as far as I recall, suggested that before enabling the global enforcement (through the PodSecurity configuration given to the kube-apiserver) on namespaces without a pod-security.kubernetes.io/enforce label, we could poll.

We poll for namespaces without pod-security.kubernetes.io/enforce and verify that they would work with the Restricted PSS; if not, we block the rollout of that configuration until they do.
According to a quick calculation by @deads2k, this should only take a couple of minutes.
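A quick way to approximate that poll from the CLI (an illustrative check, not the controller's actual implementation):

```sh
# List namespaces that still have no enforce label set.
oc get namespaces -l '!pod-security.kubernetes.io/enforce' -o name
```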

Contributor:

I'm ok with that approach, let's go that way.

### New API

This API is used to support users to enforce PSA.
As this is a transitory process, this API will lose its usefulness once PSA enforcement isn't optional anymore.
Contributor:

Similar to previous comments I've made, I think it might be worth keeping this API around to allow cluster admins flexibility in how they configure the PodSecurity admission plugin. I think it is fine for us to have opinionated and encouraged defaults.

Have we thought through different use cases that admins may have and why they may or may not want to be able to configure the global enforcement mode?

Contributor (@everettraven, Feb 25, 2025):

Additionally, I think the introduction of an API means we have to support it. If we don't want to support a new API long term, is there an alternative approach that does not use an API?

Contributor:

David's suggestion is to use LegacyConfig, which would mean that the field is introduced and deprecated immediately, and can then go through the deprecation cycle.

Contributor Author (@ibihim, Feb 26, 2025):

I can't imagine a use case where someone wouldn't want the highest possible security setting on their production cluster.
If I imagine having a test cluster, this could be annoying to deal with once the legacy mode is gone. But it isn't like I couldn't add a mechanism to auto-label namespaces with another PSS.


#### PodSecurityAdmissionLabelSynchronizationController

The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`.
Contributor:

Suggested change
The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`.
The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `spec.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`.

?

Contributor Author (@ibihim, Feb 26, 2025):

Well, I would assume, based on the current state of the enhancement, that spec indicates the intent and status is the actual state, aka the green-lighting of the intent. So the PSA label syncer would not care what the intent is, but what the PodSecurityReadinessController signals in the status.

In the Upgradeable=false scenario, this shouldn't be the case, as the cluster wouldn't upgrade if the spec doesn't reflect the achievable state; and in the auto-legacy-mode-switching scenario, the "status" would be the spec and vice versa...

So I think you are correct.


The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`.
If `spec.enforcementMode` is `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label on Namespaces that it manages.
Otherwise, it will refrain from setting that label and remove any enforce labels it owns, if present.
Contributor:

Are there any other changes that need to be made here?

How should the label syncer handle user-overridden audit/warn labels when deciding whether or not to set the enforce label?

Does any evaluation on the existing workloads in the namespace need to take place to determine if setting the enforce label will break workloads? If setting the enforce label will break workloads in the namespace, how should the controller react?

Contributor Author:

How should the label syncer handle user-overridden audit/warn labels when deciding whether or not to set the enforce label?

User labels are ignored, see here: https://github.com/openshift/enhancements/pull/1747/files#r1971301549

Does any evaluation on the existing workloads in the namespace need to take place to determine if setting the enforce label will break workloads? If setting the enforce label will break workloads in the namespace, how should the controller react?

It is not the concern of this controller to identify whether the workload would break or not. Its concern is to decide what label it thinks is appropriate and set it.
It is the concern of the PodSecurityReadinessController to identify if it is safe to enforce PSA.

ValidatedSCCSubjectTypeAnnotation = "security.openshift.io/validated-scc-subject-type"
```

This annotation will be set by the [`SecurityContextConstraint` admission](https://github.com/openshift/apiserver-library-go/blob/60118cff59e5d64b12e36e754de35b900e443b44/pkg/securitycontextconstraints/sccadmission/admission.go#L138) plugin.
Member:

Also explain that this will be a pod annotation.
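To make the "pod annotation" point concrete, it would appear roughly as below; the value shown is an assumption for illustration only, not defined by the enhancement.

```yaml
# Sketch of the annotation on an admitted Pod; the value "serviceaccount" is assumed
# here to indicate the SCC was matched via the pod's service account rather than a user.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  annotations:
    security.openshift.io/validated-scc-subject-type: serviceaccount
```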


#### Set PSS Annotation: `security.openshift.io/MinimallySufficientPodSecurityStandard`

The PSA label syncer must set the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation.
Member:

Also explain that this will be a namespace annotation.
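Likewise, as a namespace annotation it would look roughly like this; the value is an illustrative Pod Security Standard, not prescribed by the enhancement.

```yaml
# Sketch of the annotation on a Namespace; "restricted" is an example value for the
# minimally sufficient Pod Security Standard computed by the label syncer.
apiVersion: v1
kind: Namespace
metadata:
  name: example-app
  annotations:
    security.openshift.io/MinimallySufficientPodSecurityStandard: restricted
```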

Over the last few releases, the number of clusters with violating workloads has dropped significantly.
Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads.

To ensure a safe transition, this proposal suggests that if a potential workload failure is detected in release `n`, the operator moves into `Upgradeable=false`.
Member:

Which operator? (this only gets clarified later on in the enhancement, maybe worth mentioning here)


// PSAEnforcementConfig is a config that supports the user in the PSA enforcement transition.
// The spec struct enables a user to stop the PSA enforcement, if necessary.
// The status struct supports the user in identifying obstacles in PSA enforcement.
type PSAEnforcementConfig struct {
Contributor:

LegacyConfig

Contributor Author:

I completely understand that a better name for PSAEnforcementModePrivileged would be PSALegacyMode or something like that. But the whole config?

Let me make an assumption: we call it LegacyConfig because it supports a legacy state of the cluster? If we kept the name LegacyConfig, would it imply that other things might be added that are not part of PSA?

Contributor:

I believe the suggestion was to create a type LegacyConfig, and then we could, over time, add various sections that represent previous legacy configuration, so PSA would be a subsection of the spec on this new type in the config API group.

Contributor:

Are we still looking for this to be a top-level LegacyConfig API that is a catch all for scenarios like this or are we leaning back towards a PSA specific API?

@ibihim

ibihim commented Feb 26, 2025

In the Arch Call, the proposal was made that we do not use Upgradeable=false and instead set spec.enforcementMode=Privileged automatically for clusters that don't qualify.


openshift-ci bot commented Feb 26, 2025

@ibihim: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | 714dda5 | link | true | /test markdownlint |


@ibihim

ibihim commented Feb 27, 2025

I have a discussion topic with regards to PSA State Transitions:

Scenario 1:

API is set to Restricted, but a violation appears temporarily and then disappears (e.g., from a periodic job). Should the controller:

  • Immediately switch to Legacy and remain there permanently?
  • Switch to Legacy temporarily and automatically return to Restricted after violations clear?

Scenario 2:
API is in Legacy mode (either user-set or automatically due to violations) and all violations disappear. Should the controller:

  • Automatically transition to Restricted mode?
  • Stay in Legacy mode until manual user intervention?

Concerns:

  • Prevent some awkward oscillation.
  • Builds that run periodically and are violating.

Thoughts

  • It would be safer to lock in on Legacy, but it would cause a lot of false positives.
  • We could add a "cooling off" period, that violation blips are gone after 2 weeks.

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 10, 2025
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 17, 2025