PoC: Templated Policies for Reduced Memory and eBPF Program Count #4279
Conversation
Signed-off-by: Andrea Terzolo <[email protected]>
if (!policy_filter_check(config->policy_id))
	return 0;
// todo: this should replace the policy filter check above
if (config->cgroup_filter && !get_policy_from_cgroup())
This sounds like duplicate logic of the policy_filter_check() above. Instead of hardcoding a policy value inside the policy config and using it as a key for a global BPF_MAP_TYPE_HASH_OF_MAPS, we could just use a simple BPF_MAP_TYPE_HASH and configure at runtime the policy_id associated with the cgroups.
If we deploy a k8s-aware policy (without bindings) and assign it policy_id=4:
kind: TracingPolicy
metadata:
  name: "lseek-podfilter"
spec:
  podSelector:
    matchLabels:
      app: "lseek-test"
  kprobes:
  - call: "sys_lseek"
    syscall: true
    args:
    - index: 0
      type: "int"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Equal"
        values:
        - "-1"
      matchActions:
      - action: Sigkill

We can translate the pod selectors into a simple hash map identical to the one introduced in this patch:
cgroup_id1 -> 4
cgroup_id2 -> 4
So instead of starting from the policy_id to get a hash map like
cgroup_id1 -> 1
cgroup_id2 -> 1
we can immediately look up the hash map with the cgroup id and recover the policy_id from there.
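A minimal sketch of the map I have in mind, with illustrative names rather than existing Tetragon definitions:

/* Hypothetical map: keyed directly by the cgroup id, configured at runtime
 * with the policy_id assigned to each workload. */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 32768);
	__type(key, __u64);   /* cgroup id */
	__type(value, __u32); /* policy_id */
} cgroup_policy_map SEC(".maps");

static __always_inline __u32 policy_id_for_cgroup(__u64 cgroupid)
{
	__u32 *policy_id = bpf_map_lookup_elem(&cgroup_policy_map, &cgroupid);

	return policy_id ? *policy_id : 0; /* 0 = no policy bound to this cgroup */
}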
WDYT?
DEFINE_ARRAY_OF_STRING_MAPS(10)
#endif

#define POLICY_STR_OUTER_MAX_ENTRIES 1
I saw some discussion here (#1408) about having one single map instead of 11 different maps. We can also accept the performance loss and use just one shared map to simplify things.
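For reference, a hedged sketch of what a single shared map could look like, assuming every string key is padded up to one maximum size (the sizes and names below are made up, not the definitions discussed in #1408):

/* Hypothetical single string map: one fixed maximum key size instead of
 * several maps bucketed by string length. The cost is hashing and storing
 * the full padded key for every entry. */
#define SHARED_STRING_KEY_MAX 512

struct shared_string_key {
	__u8 data[SHARED_STRING_KEY_MAX]; /* string padded with zeros up to the max */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, struct shared_string_key);
	__type(value, __u8);
} shared_string_map SEC(".maps");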
PolicyID     uint32 `align:"policy_id"`
Flags        uint32 `align:"flags"`
Pad          uint32 `align:"pad"`
CgroupFilter uint32 `align:"cgroup_filter"`
Just a hack to enable/disable the new logic.
| "github.com/cilium/tetragon/pkg/labels" | ||
| ) | ||
|
|
||
| type policy interface { |
Today, I created an interface to reuse as much code as I could from the existing policy filter. If we unify the logic on the eBPF side (https://github.com/cilium/tetragon/pull/4279/files#r2481227336), we can probably use just one policy type without an interface.
const templateValue = "*"

func checkTemplateValueIsValid(arg *v1alpha1.ArgSelector, op uint32, ty uint32) error {
Explicitly highlight the use cases supported
pi.policyStringHash = make([]*program.Map, 0, numSubMaps)

for stringMapIndex := range numSubMaps {
	policyStrMap := program.MapBuilderPolicy(fmt.Sprintf("%s_%d", sensors.PolicyStringHashMapPrefix, stringMapIndex), prog)
Today I used MapBuilderPolicy to simplify the code, but if in the future we want to support multiple bindings per policy, this cannot be a map shared by all selectors of the policy.
Thanks! I've raised a point in the original issue, and I'm not sure if it's addressed here. What happens if the same workload is matched by multiple templates? I'm guessing the answer is somewhere, and I'm probably missing it.
See this CFP: cilium/design-cfps#80
This PR introduces a Proof of Concept to address the issues discussed in #4191.
This approach attempts to solve the two main problems described in the issue: the large number of eBPF programs loaded in the kernel and the memory consumed when deploying many near-identical policies.
The primary use case is deploying a distinct policy for each K8s workload where the sensors and filters are identical, but the specific values being enforced (e.g., a list of binaries) differ for each workload.
Warning
The poc/ directory in this branch contains sample YAML files and a README.md to help test and understand this approach.

Ideal Design Explanation
Let's start from the ideal solution we have in mind, and then see how it translates into the POC.
The proposed solution is based on two core concepts: "Templates" and "Bindings".
Template
A "template" is a
TracingPolicythat specifies variables which can be populated at runtime, rather than being hardcoded at load time. Selectors within the policy reference these variables by name.When a template policy is deployed, it loads the necessary eBPF programs and maps, but it has no runtime effect because it lacks concrete values for its comparisons.
Binding
A "binding" is a new resource (e.g.,
TracingPolicyBinding) that provides concrete values for a template's variables and applies them to specific workloads.The policy logic becomes active only when a
TracingPolicyBindingis deployed. This action populates the template's eBPF maps with the specified values for the cgroups matching the podSelector.POC Implementation
To minimize changes for this POC, we reuse the existing TracingPolicy resource and its OptionSpec to simulate both templates and bindings.

Template: A template is defined as a TracingPolicy using these options:
Binding: A binding is also a TracingPolicy (which would ideally be a TracingPolicyBinding) that references the template and provides values. This POC currently supports only one binding.
Details
- When a template TracingPolicy is deployed, the eBPF programs and maps are loaded.
- A new BPF_MAP_TYPE_HASH, cg_to_policy_map, is introduced. It stores a mapping from cgroupid -> policy_id. This allows us to look up a policy ID from a cgroupid, which is the reverse of the current policy_filter_cgroup_maps (a BPF_MAP_TYPE_HASH_OF_MAPS).
- When a binding is deployed, it is assigned a policy_id. For each cgroup matching its podSelector, an entry (cgroupid -> policy_id) is added to the cg_to_policy_map.
- New BPF_MAP_TYPE_HASH_OF_MAPS are used: pol_str_maps_*. This implementation is very specific to string/charbuf/filename types and the eq/neq operators, but the concept can be extended to other types/operators; more on this later.
- The outer maps are keyed by policy_id (obtained from cg_to_policy_map), and the inner maps hold the binding's values, similar to the existing string_maps_*.
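A hedged sketch of the per-event lookup flow described above; cg_to_policy_map and pol_str_maps_* are the maps from this POC, while the helper name, the outer-map index, and the exact types are illustrative assumptions:

/* Illustrative flow only: cgroup id -> policy_id -> inner map with the binding's values. */
static __always_inline void *binding_string_map(__u64 cgroupid)
{
	__u32 *policy_id;

	/* 1. Plain hash lookup: which policy_id is bound to this cgroup? */
	policy_id = bpf_map_lookup_elem(&cg_to_policy_map, &cgroupid);
	if (!policy_id)
		return NULL; /* no binding for this workload */

	/* 2. Hash-of-maps lookup: the inner map holding this binding's values. */
	return bpf_map_lookup_elem(&pol_str_maps_0, policy_id);
}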
Note
A cgroup_id can only be associated with one policy_id (binding) at a time. A new binding for the same cgroup should either be rejected or overwrite the existing one. For example, binding cgroup1 to both policy_1 (values: /bin/ls) and policy_5 (values: /bin/cat) simultaneously is not logical.
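For illustration, "reject" versus "overwrite" could map to the update flags of bpf_map_update_elem on cg_to_policy_map; the snippet below is a libbpf-style user-space sketch (the agent side is actually Go, and cg_map_fd/bind_cgroup are made-up names):

#include <bpf/bpf.h>
#include <linux/types.h>
#include <stdbool.h>

/* Bind a cgroup to a policy. With BPF_NOEXIST the kernel refuses the update
 * if the cgroup already has an entry, which rejects a second binding;
 * BPF_ANY silently overwrites the previous one. */
static int bind_cgroup(int cg_map_fd, __u64 cgroup_id, __u32 policy_id, bool overwrite)
{
	return bpf_map_update_elem(cg_map_fd, &cgroup_id, &policy_id,
				   overwrite ? BPF_ANY : BPF_NOEXIST);
}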
Current Limitations & Hacks
To use the pol_str_maps_* instead of a hardcoded value, we set vallen=8 in the selector_arg_filter. I have to admit I haven't verified this approach much, since I think it is not a sustainable solution; it just works for the POC.

Summary & Goals
This design provides a path toward achieving the two goals of the issue:
- A single set of eBPF programs can serve n policies (e.g., 512-1024 or more), as they all reference the same template. This drastically reduces the number of eBPF programs loaded in the kernel.
- The per-policy memory cost is limited to the entries added to the cg_to_policy_map and the pol_str_maps_* (likely a few KB per policy, assuming non-massive value lists).