From 8f690cd8dad5e23a7e48ef34b1b7b87312a02b30 Mon Sep 17 00:00:00 2001 From: Alessio Biancalana Date: Thu, 20 Nov 2025 14:41:10 +0100 Subject: [PATCH 1/4] docs: add RFC about CRD naming and policy lifecycle --- docs/rfc/0004-crds-policy-lifecycle.md | 319 +++++++++++++++++++++++++ 1 file changed, 319 insertions(+) create mode 100644 docs/rfc/0004-crds-policy-lifecycle.md diff --git a/docs/rfc/0004-crds-policy-lifecycle.md b/docs/rfc/0004-crds-policy-lifecycle.md new file mode 100644 index 00000000..6e310fe2 --- /dev/null +++ b/docs/rfc/0004-crds-policy-lifecycle.md @@ -0,0 +1,319 @@ +| | | +| :----------- | :---------------------------------------------------- | +| Feature Name | CRD revisit and user workflow | +| Start Date | 20 Nov 2025 | +| Category | CRDs | +| RFC PR | https://github.com/neuvector/runtime-enforcer/pull/45 | +| State | **ACCEPTED** | + +# Summary + +[summary]: #summary + +First RFC, which doubles as a template. + +# Motivation + +[motivation]: #motivation + +Before implementing a runtime enforcement workflow, in this post-POC phase we want to reach consensus on two topics: + +- Kubernetes' CRD names and specifications +- The user journey and workflow, especially when not by a UI of some sort + +## Examples / User Stories + +[examples]: #examples + +The following user stories are to be intended as examples: + +- As a user I want to configure a security policy for a given workload +- As a user I want the processes that run into my workloads to be learned automatically and be proposed to me +- As a user I want to inherit the security policy for my workload from a pre-existing template +- As a user I want to promote a policy proposal to an actual deployed security policy + +# Detailed design + +[design]: #detailed-design + +## CRDs Overview +This is a quick overview of all the CRDs we’re going to define. Each one of them is going to be described in depth in the next sections. 
+ +| CRD Current Name | CRD New Name | Description | +| --- | --- | --- | +| WorkloadSecurityPolicyProposal | WorkloadPolicyProposal | Proposed policies learned from workload behavior. Now includes per-container rules. | +| WorkloadSecurityPolicy | WorkloadPolicy | Defines the enforcement policy (monitor/protect) for a workload, grouping per-container rules or image references. | +| ClusterWorkloadSecurityPolicy | (Removed) | Replaced by ImagePolicy for cluster-wide reusable profiles. | +| (New) | ImagePolicy | Defines reusable runtime rules (templates) based on container image, used for policy templating. | + + +Changes from the previous version: +- The WorkloadSecurityPolicy was renamed into WorkloadPolicy +- The WorkloadSecurityPolicyTuning was deleted and replaced by the status in the WorkloadPolicy resource. + +## Learning Phase + +During learning mode, we create WorkloadPolicyProposal resources. These resources are structured in this way: + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: WorkloadPolicyProposal +metadata: + name: workloadpolicyproposal-sample + ownerReferences: + - apiVersion: apps/v1 + kind: Deployment + name: pgsql-8646457455 + uid: 39a32022-4c8f-424e-a8b6-3c92af3acb2e +spec: + selector: + matchLabels: + app: postgres + rulesByContainer: + "db-migration": # rules applied to the container named "db-migration" + executables: + allowed: + - /bin/bash + - /usr/bin/psql + "postgres": # rules applied to the container named "postgres" + executables: + allowed: + - /usr/bin/psql + "otel-collector": # rules applied to the container named "otel-collector" + executables: + allowed: + - /usr/bin/otel-collector +``` + +Changes compared to the current implementation: +- The rules section has been replaced by rulesByContainer. This new field holds a map with the name of the containers as key, and the list of the container rules as value. 
+ +Notes on the behavior: + +- The WorkloadPolicyProposal has an ownerReference that ties it back to the workload resource for which the behaviour was observed. +- When the observed workload is deleted, the associated WorkloadPolicyProposal is deleted as well. +- When we switch from a proposal to a real policy we delete the proposal and don’t recreate it again +- In case of workload rollout, the WorkloadPolicyProposal continues to learn like nothing happened. + +## The WorkloadPolicy resource +Policies are defined using the WorkloadPolicy resource. This is how this resource looks: + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: WorkloadPolicy +metadata: + name: postgres-policy + namespace: default +spec: + mode: monitor # monitor | protect + selector: + matchLabels: + app: postgres + rulesByContainer: + postgres: + rules: + executables: + allowed: + - /usr/bin/psql + otel-collector: + rules: + executables: + allowed: + - /usr/bin/otel-collector + db-migration: + rules: + executables: + allowed: + - /bin/bash + - /usr/bin/psql +``` + +Changes compared to the current implementation: + +- The rules section has been replaced by rulesByContainer. This new field holds a map with the name of the containers as key, and the list of the container rules as value. + +Notes on the behavior: + +- When the enforced workload is deleted, the WorkloadPolicy is still alive; it should be deleted manually +- In case of workload rollout, the WorkloadPolicy remains unchanged. If it causes issues with the rollout, the user is in charge of rolling back to the previous version or destroying the policy + +## Binding a WorkloadPolicy +A workload is protected by a WorkloadPolicy through a podSelector, like in the current approach. +As proposed in the previous version, we suggest the usage of a unique label security.rancher.io/policy, but we don’t enforce it by default since putting it in the spec.template would cause a rollout. 
+ +So the difference with the previous version is that we simply leave users to choose their preferred approach. Having a dedicated label is still suggested, but not compulsory. + +- Basic user -> use default k8s workload selectors -> everything works out of the box, no rollout required. +- Advanced user (real production scenario) -> enforce a unique label on workloads and use this label as a selector -> a rollout could be required if the workload was initially created without the label + +Now that the label is no longer compulsory, we cannot rely on it to understand if a workload is covered or not; we should fall back to a kubectl plugin that scrapes the resources and helps the user to understand the situation (potential conflict, partial workload coverage,...). + +Users can still rely on the unique label if they choose to use it, and so simple kubectl commands. Our kubectl plugin should be generic and also cover cases where the label is not used. + +## Using the ImagePolicy to inherit rules from pre-made templates + +Pods are made by containers, each one of them running a container image. The same container image can be reused by multiple Pods, but its runtime behavior is mostly the same. + +Most of the time, a Redis/Tomcat/NodeJS container image is always going to behave in the same way. There could be some exceptions, we must take that scenario into account. + +SUSE is already distributing maintained container images through AppCo. It would make sense to tie our profiles to the container images, rather than thinking about the concept of “workload”. 
+ +Let's define an ImagePolicy: + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: ImagePolicy +metadata: + name: otel-collector +spec: + image: # optional - inspired by SBOMScanner's imageMetadata + registry: "registry.suse.com" + repository: "otel-collector" + tag: "v1.0.0" + digest: "sha256:1234567890" + rules: + executables: + allowed: + - /usr/bin/otel-collector +``` + +Then it can be consumed by a WorkloadPolicy in this way: + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: WorkloadSecurityPolicy +metadata: + name: postgres-policy + namespace: default +spec: + mode: monitor # monitor | protect + containerRules: + postgres: + rules: + executables: + allowed: + - /usr/bin/psql + otel-collector: + imagePolicyRef: otel-collector # name of the ImagePolicy + db-migration: + rules: + executables: + allowed: + - /bin/bash + - /usr/bin/psql +``` + +When defining the rules of a container, the user can either define a list of explicit rules or can make a reference to an existing ImagePolicy by using the `imagePolicyRef` attribute. In its first implementation it will not be possible to define both `rules` and `imagePolicyRef` for the same container. + +To avoid uncertainty we must: + +- Introduce a ValidatingWebhook that ensures all the ImagePolicy objects referenced by WorkloadSecurityPolicy exist. The webhook would process CREATE and UPDATE events. +- Add a finalizer to each ImagePolicy, the deletion of an ImagePolicy resource must be allowed only when no WorkloadSecurityPolicy is referencing it. + +ImagePolicy resources aren't namespaced; they are cluster-wide available resources that can be referenced by any other resource. + +## Handling Violations in Monitor/Protect Mode + +When a WorkloadPolicy is in monitor or protect mode, the runtime enforcer generates violation notifications (aka processes that are not on the allow list). The difference is that in monitor mode, the violations are still allowed, while in protect mode, they are blocked. 
+ +A notification is sent to the Security Hub in the form of an OpenTelemetry event. + +In this version we are going to create a new CRD related to the tuning aspects of a WorkloadPolicy, that holds the violation data for the policy while the policy is set in **monitor** mode. + +When the policy is in protect mode, the only way of getting a notification about attempted violations will be OpenTelemetry events. + +At the moment, the idea is to use the tuning CRD in order to save space on the WorkloadPolicy one. + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: WorkloadPolicyTuning +metadata: + name: postgres-policy + namespace: default +spec: + # ... +status: + violations: + lastObservedTimestamp: "2025-11-14T17:40:00Z" + totalViolations: 42 + latestEvents: + - containerName: postgres + executable: /usr/bin/wget + timestamp: "2025-11-14T17:39:50Z" + - containerName: db-migration + executable: /bin/sh + timestamp: "2025-11-14T17:39:55Z" +``` + +The design is not definitive, but the idea is: + +- Users without the UI will simply update the tuning resource manually if they want to tolerate some violations +- The rancher extension will use this status to run a kubectl patch with the desired changes based on the user input. 
+ +An alternative design with a map of unique violations could be the following: + +```yaml +status: + violations: + lastObservedTimestamp: "2025-11-14T17:40:00Z" + totalViolations: 42 + containerViolations: + postgres: + "/usr/bin/wget": + count: 15 + lastObservedMode: protect + lastObservedTimestamp: "2025-11-14T17:39:50Z" + "/usr/local/bin/curl": + count: 1 + lastObservedMode: monitor + lastObservedTimestamp: "2025-11-14T17:40:00Z" + db-migration: + "/bin/sh": + count: 27 + lastObservedMode: monitor + lastObservedTimestamp: "2025-11-14T17:39:55Z" +``` + +At this stage we don't want to commit on the name of the WorkloadPolicyTuning resource as we might come up with a better name later, and we will for sure revisit at least the naming of the resource. We decided to defer that to a dedicated RFC when we get to implement tuning for policies. + +# Drawbacks + +[drawbacks]: #drawbacks + +We didn't observe any particular drawback in the workflow. Anyway, there are considerations to make: + +- Having rules specified by container will allow us for more granularity and will allow us to support more scenarios (init-containers, sidecars), on the other hand it will have a performance impact that we'll have to measure and document. + +# Alternatives + +[alternatives]: #alternatives + +We considered a bunch of alternatives. For example putting the ImagePolicy and the WorkloadPolicy together: + +```yaml +apiVersion: security.rancher.io/v1alpha1 +kind: WorkloadSecurityPolicy +metadata: + name: database +spec: + mode: monitor # monitor/protect + selector: + matchLabels: + app: postgres + policies: + # ImagePolicy profile to apply to the the container named "db-migration" + "db-migration": psql-init + "postgres": psql + "otel-collector": otel-sidecar +``` + +But it didn't work out because this way it becomes very hard to achieve the granularity we wanted, even for a first POC that could resist to time. 
+ +We also tried experimenting with applying annotations to pods referencing directly the ImagePolicy, but didn't lead us to any good-enough conclusion. + +# Unresolved questions + +[unresolved]: #unresolved-questions + +- How do we name the policy tuning CRD? + From 216dd12aca29137d62d23644eb24cc3cf3df35e8 Mon Sep 17 00:00:00 2001 From: Alessio Biancalana Date: Fri, 21 Nov 2025 15:33:47 +0100 Subject: [PATCH 2/4] fixup! docs: add RFC about CRD naming and policy lifecycle --- docs/rfc/0004-crds-policy-lifecycle.md | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/docs/rfc/0004-crds-policy-lifecycle.md b/docs/rfc/0004-crds-policy-lifecycle.md index 6e310fe2..6a9c99ce 100644 --- a/docs/rfc/0004-crds-policy-lifecycle.md +++ b/docs/rfc/0004-crds-policy-lifecycle.md @@ -49,7 +49,6 @@ This is a quick overview of all the CRDs we’re going to define. Each one of th Changes from the previous version: - The WorkloadSecurityPolicy was renamed into WorkloadPolicy -- The WorkloadSecurityPolicyTuning was deleted and replaced by the status in the WorkloadPolicy resource. ## Learning Phase @@ -59,16 +58,13 @@ During learning mode, we create WorkloadPolicyProposal resources. These resource apiVersion: security.rancher.io/v1alpha1 kind: WorkloadPolicyProposal metadata: - name: workloadpolicyproposal-sample + name: deploy-pgsql-8646457455 # - ownerReferences: - apiVersion: apps/v1 kind: Deployment name: pgsql-8646457455 uid: 39a32022-4c8f-424e-a8b6-3c92af3acb2e spec: - selector: - matchLabels: - app: postgres rulesByContainer: "db-migration": # rules applied to the container named "db-migration" executables: @@ -102,13 +98,10 @@ Policies are defined using the WorkloadPolicy resource. 
This is how this resourc apiVersion: security.rancher.io/v1alpha1 kind: WorkloadPolicy metadata: - name: postgres-policy + name: deploy-pgsql-8646457455 namespace: default spec: mode: monitor # monitor | protect - selector: - matchLabels: - app: postgres rulesByContainer: postgres: rules: @@ -138,15 +131,12 @@ Notes on the behavior: - In case of workload rollout, the WorkloadPolicy remains unchanged. If it causes issues with the rollout, the user is in charge of rolling back to the previous version or destroying the policy ## Binding a WorkloadPolicy -A workload is protected by a WorkloadPolicy through a podSelector, like in the current approach. -As proposed in the previous version, we suggest the usage of a unique label security.rancher.io/policy, but we don’t enforce it by default since putting it in the spec.template would cause a rollout. - -So the difference with the previous version is that we simply leave users to choose their preferred approach. Having a dedicated label is still suggested, but not compulsory. +A workload is protected by a WorkloadPolicy through a podSelector. We suggest the usage of a unique label security.rancher.io/policy, but we don’t enforce it by default since putting it in the spec.template would cause a rollout. - Basic user -> use default k8s workload selectors -> everything works out of the box, no rollout required. - Advanced user (real production scenario) -> enforce a unique label on workloads and use this label as a selector -> a rollout could be required if the workload was initially created without the label -Now that the label is no longer compulsory, we cannot rely on it to understand if a workload is covered or not; we should fall back to a kubectl plugin that scrapes the resources and helps the user to understand the situation (potential conflict, partial workload coverage,...). 
+Since the label is not compulsory, we cannot rely on it to understand if a workload is covered or not; we should use a kubectl plugin that scrapes the resources and helps the user to understand the situation (potential conflict, partial workload coverage,...). Users can still rely on the unique label if they choose to use it, and so simple kubectl commands. Our kubectl plugin should be generic and also cover cases where the label is not used. @@ -156,7 +146,7 @@ Pods are made by containers, each one of them running a container image. The sam Most of the time, a Redis/Tomcat/NodeJS container image is always going to behave in the same way. There could be some exceptions, we must take that scenario into account. -SUSE is already distributing maintained container images through AppCo. It would make sense to tie our profiles to the container images, rather than thinking about the concept of “workload”. +Vendors alreadu distribute maintained container images through their platforms. It would make sense to tie our profiles to the container images, rather than thinking about the concept of “workload”. Let's define an ImagePolicy: @@ -187,14 +177,16 @@ metadata: namespace: default spec: mode: monitor # monitor | protect - containerRules: + rulesByContainer: postgres: rules: executables: allowed: - /usr/bin/psql otel-collector: - imagePolicyRef: otel-collector # name of the ImagePolicy + rules: + executables: + imagePolicyRef: otel-collector # name of the ImagePolicy db-migration: rules: executables: From 66030c254c79d1d27c4adb6082020273f9cfea9b Mon Sep 17 00:00:00 2001 From: Alessio Biancalana Date: Mon, 24 Nov 2025 15:20:43 +0100 Subject: [PATCH 3/4] fixup! 
docs: add RFC about CRD naming and policy lifecycle --- docs/rfc/0004-crds-policy-lifecycle.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/docs/rfc/0004-crds-policy-lifecycle.md b/docs/rfc/0004-crds-policy-lifecycle.md index 6a9c99ce..38fb992f 100644 --- a/docs/rfc/0004-crds-policy-lifecycle.md +++ b/docs/rfc/0004-crds-policy-lifecycle.md @@ -10,7 +10,7 @@ [summary]: #summary -First RFC, which doubles as a template. +This RFC tries to summarize the disccusion happened to far about the policy lifecycle, and tries to also stabilize CRDs in terms of lifecycle, names, and possible interactions. # Motivation @@ -39,13 +39,12 @@ The following user stories are to be intended as examples: ## CRDs Overview This is a quick overview of all the CRDs we’re going to define. Each one of them is going to be described in depth in the next sections. -| CRD Current Name | CRD New Name | Description | -| --- | --- | --- | -| WorkloadSecurityPolicyProposal | WorkloadPolicyProposal | Proposed policies learned from workload behavior. Now includes per-container rules. | -| WorkloadSecurityPolicy | WorkloadPolicy | Defines the enforcement policy (monitor/protect) for a workload, grouping per-container rules or image references. | -| ClusterWorkloadSecurityPolicy | (Removed) | Replaced by ImagePolicy for cluster-wide reusable profiles. | -| (New) | ImagePolicy | Defines reusable runtime rules (templates) based on container image, used for policy templating. | - +| CRD Current Name | CRD New Name | Description | +| ------------------------------ | ---------------------- | ------------------------------------------------------------------------------------------------------------------ | +| WorkloadSecurityPolicyProposal | WorkloadPolicyProposal | Proposed policies learned from workload behavior. Now includes per-container rules. 
| +| WorkloadSecurityPolicy | WorkloadPolicy | Defines the enforcement policy (monitor/protect) for a workload, grouping per-container rules or image references. | +| ClusterWorkloadSecurityPolicy | (Removed) | Replaced by ImagePolicy for cluster-wide reusable profiles. | +| (New) | ImagePolicy | Defines reusable runtime rules (templates) based on container image, used for policy templating. | Changes from the previous version: - The WorkloadSecurityPolicy was renamed into WorkloadPolicy @@ -146,7 +145,7 @@ Pods are made by containers, each one of them running a container image. The sam Most of the time, a Redis/Tomcat/NodeJS container image is always going to behave in the same way. There could be some exceptions, we must take that scenario into account. -Vendors alreadu distribute maintained container images through their platforms. It would make sense to tie our profiles to the container images, rather than thinking about the concept of “workload”. +Vendors already distribute maintained container images through their platforms. It would make sense to tie our profiles to the container images, rather than thinking about the concept of “workload”. Let's define an ImagePolicy: From 5b4c2f5c8cc1eada73fd951fba1d3cc2555423ab Mon Sep 17 00:00:00 2001 From: Alessio Biancalana Date: Tue, 25 Nov 2025 17:33:18 +0100 Subject: [PATCH 4/4] fixup! docs: add RFC about CRD naming and policy lifecycle --- docs/rfc/0004-crds-policy-lifecycle.md | 63 ++++++++++++++++++++------ 1 file changed, 50 insertions(+), 13 deletions(-) diff --git a/docs/rfc/0004-crds-policy-lifecycle.md b/docs/rfc/0004-crds-policy-lifecycle.md index 38fb992f..4b14e351 100644 --- a/docs/rfc/0004-crds-policy-lifecycle.md +++ b/docs/rfc/0004-crds-policy-lifecycle.md @@ -12,6 +12,8 @@ This RFC tries to summarize the disccusion happened to far about the policy lifecycle, and tries to also stabilize CRDs in terms of lifecycle, names, and possible interactions. 
+This RFC supersedes [RFC-001](https://github.com/neuvector/runtime-enforcer/blob/main/docs/rfc/0001-workloadgroup.md). + # Motivation [motivation]: #motivation @@ -45,9 +47,12 @@ This is a quick overview of all the CRDs we’re going to define. Each one of th | WorkloadSecurityPolicy | WorkloadPolicy | Defines the enforcement policy (monitor/protect) for a workload, grouping per-container rules or image references. | | ClusterWorkloadSecurityPolicy | (Removed) | Replaced by ImagePolicy for cluster-wide reusable profiles. | | (New) | ImagePolicy | Defines reusable runtime rules (templates) based on container image, used for policy templating. | +| (New) | ContainerPolicy | Defines rules that will be used to handle sidecar containers at a cluster level. | Changes from the previous version: -- The WorkloadSecurityPolicy was renamed into WorkloadPolicy +- The WorkloadSecurityPolicy was renamed into `WorkloadPolicy` +- We have new CRDs for the `ImagePolicy` and the `ContainerPolicy` +- The `ClusterWorkloadSecurityPolicy` has been removed ## Learning Phase @@ -57,11 +62,11 @@ During learning mode, we create WorkloadPolicyProposal resources. These resource apiVersion: security.rancher.io/v1alpha1 kind: WorkloadPolicyProposal metadata: - name: deploy-pgsql-8646457455 # - + name: statefulsets-pgsql # - ownerReferences: - - apiVersion: apps/v1 - kind: Deployment - name: pgsql-8646457455 + - apiVersion: v1 + kind: StatefulSet + name: pgsql uid: 39a32022-4c8f-424e-a8b6-3c92af3acb2e spec: rulesByContainer: @@ -85,9 +90,9 @@ Changes compared to the current implementation: Notes on the behavior: -- The WorkloadPolicyProposal has an ownerReference that ties it back to the workload resource for which the behaviour was observed. +- The WorkloadPolicyProposal has an `ownerReference` that ties it back to the workload resource for which the behaviour was observed. - When the observed workload is deleted, the associated WorkloadPolicyProposal is deleted as well. 
-- When we switch from a proposal to a real policy we delete the proposal and don’t recreate it again +- When we switch from a `WorkloadPolicyProposal` to an actual `WorkloadPolicy` we delete the `WorkloadPolicyProposal` and don’t recreate it again - In case of workload rollout, the WorkloadPolicyProposal continues to learn like nothing happened. ## The WorkloadPolicy resource @@ -97,7 +102,7 @@ Policies are defined using the WorkloadPolicy resource. This is how this resourc apiVersion: security.rancher.io/v1alpha1 kind: WorkloadPolicy metadata: - name: deploy-pgsql-8646457455 + name: statefulsets-pgsql namespace: default spec: mode: monitor # monitor | protect @@ -122,22 +127,24 @@ spec: Changes compared to the current implementation: -- The rules section has been replaced by rulesByContainer. This new field holds a map with the name of the containers as key, and the list of the container rules as value. +- The rules section has been replaced by `rulesByContainer`. This new field holds a map with the name of the containers as key, and the list of the container rules as value. +- The `WorkloadPolicy` does not have the label selector field to identify the pods to protect. Notes on the behavior: - When the enforced workload is deleted, the WorkloadPolicy is still alive; it should be deleted manually +- We will implement a mutating admission controller that prevents users from deleting a `WorkloadPolicy` while it is referenced by any workload/pod - In case of workload rollout, the WorkloadPolicy remains unchanged. If it causes issues with the rollout, the user is in charge of rolling back to the previous version or destroying the policy ## Binding a WorkloadPolicy -A workload is protected by a WorkloadPolicy through a podSelector. We suggest the usage of a unique label security.rancher.io/policy, but we don’t enforce it by default since putting it in the spec.template would cause a rollout. 
+A workload is protected by a WorkloadPolicy through the usage of a unique label `security.rancher.io/policy: `. + +When the label is applied, a rollout could be triggered as follows: - Basic user -> use default k8s workload selectors -> everything works out of the box, no rollout required. - Advanced user (real production scenario) -> enforce a unique label on workloads and use this label as a selector -> a rollout could be required if the workload was initially created without the label -Since the label is not compulsory, we cannot rely on it to understand if a workload is covered or not; we should use a kubectl plugin that scrapes the resources and helps the user to understand the situation (potential conflict, partial workload coverage,...). - -Users can still rely on the unique label if they choose to use it, and so simple kubectl commands. Our kubectl plugin should be generic and also cover cases where the label is not used. +Since the label is mandatory, we can rely on it to understand if a workload is covered by a policy or not. ## Using the ImagePolicy to inherit rules from pre-made templates @@ -267,6 +274,36 @@ status: At this stage we don't want to commit on the name of the WorkloadPolicyTuning resource as we might come up with a better name later, and we will for sure revisit at least the naming of the resource. We decided to defer that to a dedicated RFC when we get to implement tuning for policies. +## Tetragon integration strategy + +The current integration strategy between our policy CRDs and tetragon’s `TracingPolicyNamespaced` stays the same. + +Let’s go through all the possible cases, considering the current architecture of Tetragon. + +The user creates a `WorkloadPolicy` named `pgsql` inside of the infra namespace. + +Our controller will examine the policy and, for each container rule, it will create a tetragon `TracingPolicyNamespaced` inside of the infra namespace. 
+ +The tetragon policy will identify the containers by using two pieces of information: + +- Identify the pod by using the `security.rancher.io/policy: ` label. In this case, `security.rancher.io/policy:pgsql`. +- Identify the container by using the name of the container mentioned inside of the `.spec.rulesByContainer.[]` + +Depending on the mode of the WorkloadPolicy, we will reconcile a different type of tetragon policy, like we’re currently doing. At this point, the job of our controller is done. + +The Tetragon policy will stay “dormant” until a user assigns the special `security.rancher.io/policy: ` to their workload. + +We’re currently discussing with Tetragon maintainers to revisit how policies can be defined, to make them more “workload centric”. The work with upstream began before we did this refinement of our CRDs. Nevertheless, the proposal we made upstream remains valid also with this new set of CRDs and workflow. + +## Transitions + +These are the transitions that a policy will go through: + +- Learn -> Monitor: given a `WorkloadPolicyProposal` that correctly learned a workload's behavior, the user applies a label to mark it as ready to be deployed. The `WorkloadPolicyProposal` gets deleted and a `WorkloadPolicy` with the corresponding behavior and the mode set to `monitor` gets created. +- Monitor -> Protect: given a `WorkloadPolicy` with `mode: monitor`, the user just modifies the resource setting `mode: protect`. +- Protect -> Monitor: given a `WorkloadPolicy` with `mode: protect`, the user just modifies the resource setting `mode: monitor`. +- Protect -> Learn: given a `WorkloadPolicy` with `mode: protect`, it will be sufficient to delete it. Subsequently, a `WorkloadPolicyProposal` will be created from scratch. + # Drawbacks [drawbacks]: #drawbacks