-
Notifications
You must be signed in to change notification settings - Fork 474
tetragon: Add support to retrieve environment variables #4184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
0fbb328 to
9dd22a4
Compare
e35bac7 to
310e337
Compare
f906c12 to
6a4eec3
Compare
6a4eec3 to
8b4eaf6
Compare
FedeDP
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments but the PR is egregious :) Great job!
As the comment suggested the filename could be sent through data event, let's read the filename instead. Signed-off-by: Jiri Olsa <[email protected]>
We can return just proc object and error without the bool state. Making nopMsgProcess part of execParse init code so there's no longer need for the function. Signed-off-by: Jiri Olsa <[email protected]>
We are about to add another extra info to the exec event (environment variables), so instead of relying on separators lets add size for each data and allow the user space easier parsing. Signed-off-by: Jiri Olsa <[email protected]>
Let's share the functionality in single function. Signed-off-by: Jiri Olsa <[email protected]>
Adding test for execParse with api.EventErrorFilename and make sure we get proper results. Signed-off-by: Jiri Olsa <[email protected]>
Add test for execParse with just filename and make sure we get proper results. Signed-off-by: Jiri Olsa <[email protected]>
Adding support to read environment variables. It's guarded with tg_conf_map::env_vars_enabled variable and same as for arguments we either use space in the exec event or send it through separate data event. Signed-off-by: Mikita Iwanowski <[email protected]> Signed-off-by: Jiri Olsa <[email protected]>
It will store environment variables for the process. Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
8b4eaf6 to
f4de6a0
Compare
Adding parsing of the environment variables data from the exec event and the --enable-process-environment-variables option to enable this feature. Signed-off-by: Mikita Iwanowski <[email protected]> Signed-off-by: Jiri Olsa <[email protected]>
Add test for execParse with filename, args, cwd, envs and make sure we get proper results. Signed-off-by: Jiri Olsa <[email protected]>
Add support to redact data from environment variables. Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Adding docs for environment variables redaction. Signed-off-by: Jiri Olsa <[email protected]>
f4de6a0 to
b546118
Compare
kkourt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Please find some comments below.
| var desc dataapi.DataEventDesc | ||
|
|
||
| dr := bytes.NewReader(args) | ||
| if exec.SizePath != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there legitimate cases where this can be zero? (similarly for the other added sizes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if probe read fails on the envs data, then EVENT_ERROR_ARGS should be set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we continue reading everything else if I understand correctly from the code?
| } | ||
|
|
||
| #ifdef __LARGE_BPF_PROG | ||
| FUNC_INLINE __u32 read_envs(void *ctx, struct msg_execve_event *event) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work correctly for execs happening from inside the kernel (e.g, call_usrmodehelper and friends).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, environment is stored the same way, so we get it.. like:
{"process_exec":{"process":{"exec_id":"cWVtdS0xOjUyNzE3MDc3ODA3MjoxMjYxOA==","pid":12618,"uid":0,"cwd":"/","binary":"/usr/lib/systemd/systemd-coredump","arguments":"12617 1000 1000 11 1763675427 102400000000 qemu-1 1 3","flags":"execve rootcwd clone","start_time":"2025-11-20T21:50:27.329868374Z","auid":4294967295,"parent_exec_id":"cWVtdS0xOjI5NzAwMDAwMDA6Mg==","tid":12618,"in_init_tree":false,"environment_variables":"HOME=/ TERM=linux PATH=/sbin:/usr/sbin:/bin:/usr/bin"},"parent":{"exec_id":"cWVtdS0xOjI5NzAwMDAwMDA6Mg==","pid":2,"uid":0,"binary":"[kthreadd]","flags":"procFS","start_time":"2025-11-20T21:41:43.129089721Z","auid":4294967295,"parent_exec_id":"cWVtdS0xOjE6MA==","tid":2,"in_init_tree":false}},"node_name":"qemu-1","time":"2025-11-20T21:50:27.329867762Z"}
| // tool like nsenter or kubectl exec. | ||
| google.protobuf.BoolValue in_init_tree = 20; | ||
| // Environment variables passed to the binary at execution. | ||
| string environment_variables = 21; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Should we parse this in user-space and offer an array to the user?
E.g.,
message EnvEntry {
string Key = 1;
string Value = 2;
}
And return:
repeated EnvEntry environment_variables = 21
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure.. as you already mentioned catching this impacts performance already,
also it's sort of similar to args string which we handle the same way together with the redaction support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but splitting a string in userspace (I'm guessing based on a ) character shouldn't have a big impact on performance, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm wondering too if it's not a nice UX improvement that is pretty cheap considering everything that we already do in this gRPC conversion. No strong feelings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will add that
| flags.Bool(KeyEnableProcessEnvironmentVariables, false, "Include environment variables in process_exec events. Disabled by default. Note that this option can significantly increase the size of the events and may impact performance") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a note about sensitive information in the flag.
E.g., "Include environment variables in process_exec events. Disabled by default. Note that this option can significantly increase the size of the events and may impact performance, as well as capture sensitive information such as passwords in the events"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"as well as capture sensitive information such as passwords in the events" might be super verbose but we can also mention the feature we have for redacting such content here.
mtardy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, here are a few comments, I think using size based parsing is cleaner indeed and easier to understand than searching for those NULL bytes.
| return size; | ||
| } | ||
|
|
||
| #ifdef __LARGE_BPF_PROG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are guarding this with __LARGE_BPF_PROG anyway, why can't we stop using tetragon_conf and use proper runtime constants https://ebpf-go.dev/concepts/global-variables/#runtime-constants ? (so that we benefit from dead code elimination etc.)
I think we wanted to switch existing code so that would be cool to not add more code using the legacy style if we can? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably good idea :) will check, thanks
| // reuse p->args first string that contains the filename, this can't be | ||
| // above 256 in size (otherwise the complete will be send via data msg) | ||
| // which is okay because we need the 256 first bytes. | ||
| curr->bin.path_length = probe_read_str(curr->bin.path, BINARY_PATH_MAX_LEN, &p->args); | ||
| char *filename = (char *)ctx + (_(ctx->__data_loc_filename) & 0xFFFF); | ||
|
|
||
| curr->bin.path_length = probe_read_str(curr->bin.path, BINARY_PATH_MAX_LEN, (void *)filename); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I remember seeing this change in another patch set and couldn't get why doing that. So here I understand that it simplifies your changes from the next commit!
| // tool like nsenter or kubectl exec. | ||
| google.protobuf.BoolValue in_init_tree = 20; | ||
| // Environment variables passed to the binary at execution. | ||
| string environment_variables = 21; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm wondering too if it's not a nice UX improvement that is pretty cheap considering everything that we already do in this gRPC conversion. No strong feelings.
storing environment variables into process event data, enabled with
--enable-process-environment-variablesoption,and we allow to redact info from the variables
reworked a bit the exec data parsing and added the environment variables parsing contributed by @slntopp,
I left the SOB info in some changes, @slntopp please shout if you want to make some changes