@holyspectral
Contributor

@holyspectral holyspectral commented Oct 23, 2025

Fixes #4204

Description

As mentioned in #4204, this PR lets multiple kprobe sensors share their fmod_ret programs as long as they attach to the same function, so that more than 38 overrides can be placed on LSM functions.

(Update Dec. 11, 2025) Changes since the draft PR:

  • Refactored and moved most logic to pkg/sensors/program/
  • Introduced an override ID in both userspace and ebpf to make sure that an override action is only handled by correct override programs.
  • Fixed an issue in override_tasks map's lifecycle.

The changes:

  1. Override programs are now maintained separately from the kprobe sensors. Specifically:
  • override_tasks becomes a shared global map.
  • override programs (both kprobe and fmod_ret) are created and maintained separately. So multiple kprobe sensors can share the same override program.
  2. The pinned maps and programs are located as follows:
/bpffs/tetragon/__override__
/bpffs/tetragon/__override__/kprobe
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/prog_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/prog_override
/bpffs/tetragon/__override__/override_tasks
/bpffs/tetragon/__override__/fmod_ret
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec/prog
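
To illustrate the sharing described above, here is a minimal Go sketch of a per-function registry with reference counting. The names `registry`, `overrideProg`, `acquire` and `release` are hypothetical and not taken from this PR; the actual code in pkg/sensors/program/ may be structured quite differently.

```go
package main

import (
	"fmt"
	"sync"
)

// overrideProg is a hypothetical stand-in for one loaded override program.
type overrideProg struct {
	attachFunc string
	refs       int
}

// registry hands out a single override program per attach function.
type registry struct {
	mu    sync.Mutex
	progs map[string]*overrideProg
}

func newRegistry() *registry {
	return &registry{progs: make(map[string]*overrideProg)}
}

// acquire returns the shared program for attachFunc, creating it on first use.
func (r *registry) acquire(attachFunc string) *overrideProg {
	r.mu.Lock()
	defer r.mu.Unlock()
	p, ok := r.progs[attachFunc]
	if !ok {
		p = &overrideProg{attachFunc: attachFunc}
		r.progs[attachFunc] = p
	}
	p.refs++
	return p
}

// release drops one reference and reports whether the program was removed.
func (r *registry) release(attachFunc string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	p, ok := r.progs[attachFunc]
	if !ok {
		return false
	}
	p.refs--
	if p.refs > 0 {
		return false
	}
	delete(r.progs, attachFunc)
	return true
}

func main() {
	r := newRegistry()
	a := r.acquire("security_bprm_creds_for_exec") // first sensor
	b := r.acquire("security_bprm_creds_for_exec") // second sensor, same function
	fmt.Println("shared:", a == b)
	fmt.Println("removed after first release:", r.release("security_bprm_creds_for_exec"))
	fmt.Println("removed after second release:", r.release("security_bprm_creds_for_exec"))
}
```

With a scheme like this, the number of fmod_ret programs attached to a function stays at one regardless of how many sensors request an override on it.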

Changelog

@holyspectral
Contributor Author

@olsajiri may I have your feedback on this one? Thanks a lot.

@holyspectral holyspectral changed the title from "feat(sensors): support 38+ override on lsm funcs" to "wip: feat(sensors): support 38+ override on lsm funcs" Oct 23, 2025
@mtardy mtardy requested a review from olsajiri October 24, 2025 08:44
@mtardy mtardy added the release-note/minor label ("This PR introduces a minor user-visible change") Oct 24, 2025
Contributor

@olsajiri olsajiri left a comment

looks good, left few comments, thanks

if fmodret, ok = fmodretMap[attachFunc]; !ok {

fmodret = program.Builder(
path.Join(option.Config.HubbleLib, loadProgName),
Contributor

since we already add a new program for this, could we move the override programs into a separate object?
that might speed up the load

Contributor Author

Sure. I think it makes a lot of sense to do this. Thanks!

"github.com/cilium/tetragon/pkg/sensors/program"
)

var fmodretMap map[string]*program.Program
Contributor

I was thinking about something similar but a bit more generic, to also cover standard kprobe override programs. There's no limit on attached kprobe programs, but they would at least benefit from having just a single copy of the program, which reduces the footprint

Contributor Author

I think that's a good idea. Let me look into this.

Contributor Author

One issue I noticed is that, if I'm not mistaken, for kprobe override programs with multi-kprobe enabled, all kprobe attach points actually share the same override_maps. Do you think we should change that and use a per-hookpoint map instead?

Contributor

hum not sure what you mean..

Contributor Author

Ah, I wasn't aware that the creation and loading of maps and programs in multi kprobe flows are based on the data type of load.LoaderData. That makes sense now. Thanks a lot!

fmodret.PinPath = "fmod_ret/" + attachFunc

fmodretmap := program.MapBuilder("override_tasks", fmodret)
fmodretmap.PinPath = path.Join("fmod_ret/", attachFunc, "override_tasks")
Contributor

i'd rather place it in some visibly special place like we do for base sensor, perhaps something like:

__override__/fmodret/security_xxx
__override__/fmodret/security_yyy
               ...
__override__/kprobe/ksys_read
__override__/kprobe/ksys_write
               ...
__override__/override_tasks
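
A layout like this could be produced by a small path helper. The sketch below is only an illustration: `overridePinPath` and `overrideTasksPinPath` are hypothetical names, and the `kprobe` / `fmod_ret` directory names follow the pinned paths listed in the PR description.

```go
package main

import (
	"fmt"
	"path"
)

// overrideRoot is the visibly special directory suggested in the review.
const overrideRoot = "__override__"

// overridePinPath returns the pin directory of one override program,
// relative to the Tetragon bpffs directory. kind is "kprobe" or "fmod_ret".
func overridePinPath(kind, attachFunc string) string {
	return path.Join(overrideRoot, kind, attachFunc)
}

// overrideTasksPinPath returns the pin path of the shared override_tasks map.
func overrideTasksPinPath() string {
	return path.Join(overrideRoot, "override_tasks")
}

func main() {
	fmt.Println(overridePinPath("kprobe", "__x64_sys_execve"))
	fmt.Println(overridePinPath("fmod_ret", "security_bprm_creds_for_exec"))
	fmt.Println(overrideTasksPinPath())
}
```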

logger.GetLogger().Info("loading generic fmod ret program", "prog", load)

unload := func() {
deleteFmodRetProg(load.Attach)
Contributor

could we use program.Program.unloaderOverride or add something in there so it gets removed during program's unload?

Contributor Author

Ah I didn't notice this function exists when I wrote the code. From its name it's definitely more suitable. Let me look into this.
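
For readers unfamiliar with the idea, an unload hook of this kind can be sketched as a callback-based unloader. This is a simplified illustration, not the Tetragon implementation: `callbackUnloader` is a hypothetical type, and the meaning of the boolean argument (whether to unpin) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// callbackUnloader is a hypothetical unloader that runs a user-supplied
// function when the program it belongs to is unloaded.
type callbackUnloader struct {
	UnloadFunc func(unpin bool) error
}

// Unload runs the callback and rejects an unloader without one.
func (u *callbackUnloader) Unload(unpin bool) error {
	if u.UnloadFunc == nil {
		return errors.New("no unload function set")
	}
	return u.UnloadFunc(unpin)
}

func main() {
	// shared models the table of shared override programs.
	shared := map[string]bool{"security_bprm_creds_for_exec": true}

	u := &callbackUnloader{UnloadFunc: func(unpin bool) error {
		// Drop the shared entry as part of the program's own unload.
		delete(shared, "security_bprm_creds_for_exec")
		return nil
	}}

	err := u.Unload(true)
	fmt.Println("error:", err, "remaining:", len(shared))
}
```

Tying the cleanup to the program's unload path means the shared entry cannot outlive the program it describes.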

@holyspectral
Contributor Author

Changes since v1:

  1. Allow kprobe to use shared override programs.
  2. Instead of having one override_tasks map for each hook point, in v2 all override programs share the same override_tasks map. This helps support the multi-kprobe scenario.
  3. The pinned paths of eBPF programs and maps will be as below:
/bpffs/tetragon/__override__
/bpffs/tetragon/__override__/kprobe
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/prog_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/prog_override
/bpffs/tetragon/__override__/override_tasks
/bpffs/tetragon/__override__/fmod_ret
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec/prog
  4. Make sure that unused override programs are cleaned up via unloaderOverride().
  5. Move bpf functions regarding overrides into another object file.

@holyspectral
Contributor Author

I'm still keeping this as a draft because of an item that I'd like to discuss first. Say we have a scenario like the one below (keep in mind that in v2 we use a shared override_tasks map):

  • We have two tracing policies A & B.
    • PolicyA is hooked on sys_execve and policyB is on sys_symlinkat.
    • They both have their own override action in matchActions.
  • A process triggers the override action of policy A via sys_execve, and has an item inserted into the override_tasks map.
    • override_tasks[<pid_tgid>] = -EPERM
  • Before the process reaches the override program attached to sys_execve, a policy change on A happens on another core and the override program on sys_execve is removed.
    • the content of the map stays the same, override_tasks[<pid_tgid>] = -EPERM
    • but the override program on sys_execve is removed.
  • When the program triggers policy B via sys_symlinkat later, because override_tasks[<pid_tgid>] = -EPERM, the action will be denied unexpectedly, even if nothing is matched.
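
The scenario can be modelled in plain Go. This is a userspace simulation only, with a Go map standing in for the BPF map; `sensorHit` and `overrideRun` are hypothetical names, and the assumption that an override program consumes any entry it finds for the current task reflects the shared-map behaviour described above.

```go
package main

import "fmt"

const eperm = 1 // errno value of EPERM on Linux

// overrideTasks models the shared map: pid_tgid -> error to inject.
type overrideTasks map[uint64]int32

// sensorHit models a sensor program matching an override action.
func sensorHit(m overrideTasks, pidTgid uint64, errno int32) {
	m[pidTgid] = -errno
}

// overrideRun models an override program firing for a task: it consumes
// the pending entry, if any, and returns the error to inject (0 = none).
func overrideRun(m overrideTasks, pidTgid uint64) int32 {
	v, ok := m[pidTgid]
	if !ok {
		return 0
	}
	delete(m, pidTgid)
	return v
}

func main() {
	m := overrideTasks{}
	const task = uint64(4242)<<32 | 4242

	// Policy A matches on sys_execve and queues -EPERM for the task.
	sensorHit(m, task, eperm)

	// Policy A's override program is removed at this point, so nothing
	// consumes the entry on sys_execve.

	// Later the task reaches policy B's override program on sys_symlinkat,
	// although policy B's sensor matched nothing.
	fmt.Println("injected on sys_symlinkat:", overrideRun(m, task))
}
```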

I can think of a few directions:

  1. Maybe this is not a real issue, the window is too short, or Tetragon has already handled this, so we don't have to care about it.

  2. Change override_tasks to be a BPF_MAP_TYPE_HASH_OF_MAPS and associate each inner map with one hook point, so we can clean up the contents of the override_tasks map when we remove an override program. For example,

    override_tasks: {
        "syscall:sys_execve": <inner map like the current override_tasks map>,
        "syscall:sys_symlinkat": <inner map like the current override_tasks map>,
        "fmod_ret:security_bprm_creds_for_exec": <inner map like the current override_tasks map>,
    }
  3. Don't delete an override program immediately when its policy is removed. This gives the override program some time to remove its items.

@olsajiri do you think this is a real issue that we should address? I'd love to know your thoughts on this.
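
Direction 2 can be pictured with nested Go maps. This is a userspace model under the assumption that removing a hook drops its whole inner map; `hookTasks` and its methods are hypothetical names and do not use any BPF API.

```go
package main

import "fmt"

// hookTasks models a map of maps: hook name -> (pid_tgid -> error).
type hookTasks map[string]map[uint64]int32

// set queues an error for the task on one specific hook.
func (h hookTasks) set(hook string, pidTgid uint64, errno int32) {
	inner, ok := h[hook]
	if !ok {
		inner = make(map[uint64]int32)
		h[hook] = inner
	}
	inner[pidTgid] = errno
}

// take consumes the pending error for the task on this hook only.
func (h hookTasks) take(hook string, pidTgid uint64) (int32, bool) {
	inner := h[hook]
	v, ok := inner[pidTgid]
	if ok {
		delete(inner, pidTgid)
	}
	return v, ok
}

// removeHook drops the hook's inner map together with all pending entries.
func (h hookTasks) removeHook(hook string) {
	delete(h, hook)
}

func main() {
	h := hookTasks{}
	h.set("syscall:sys_execve", 4242, -1)

	// The override program on sys_execve goes away with its inner map.
	h.removeHook("syscall:sys_execve")

	_, pending := h.take("syscall:sys_symlinkat", 4242)
	fmt.Println("pending on sys_symlinkat:", pending)
}
```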

@olsajiri
Contributor

yeah, that seems like a problem. At the moment override_tasks is a per-program map, so each override program has its own copy. I'd suggest having one stage of this change keep that behavior, and adding the switch to a single map on top of that

perhaps we could have the policy id as part of the override_tasks value and have the sensor unload clean up its records before it unloads the override program.. something like what you suggest in 2), but I'm not sure what the benefit of the inner map is

I'll check on your change in more detail but on first glance please try to split the change into more logical changes/commits, it's easier to review, thanks
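
The idea of tagging each record with an id can be sketched as follows. This is a userspace model with hypothetical names (`entry`, `tryOverride`, `cleanup`); leaving a mismatching entry in place for its owner is an assumption of the sketch, not something the thread specifies.

```go
package main

import "fmt"

// entry is a hypothetical override_tasks value that also records which
// override program (or policy) the pending error is meant for.
type entry struct {
	overrideID uint32
	errno      int32
}

type taskMap map[uint64]entry

// tryOverride returns the error to inject and consumes the entry only if
// it was queued for this override program.
func tryOverride(m taskMap, pidTgid uint64, overrideID uint32) (int32, bool) {
	e, ok := m[pidTgid]
	if !ok || e.overrideID != overrideID {
		return 0, false
	}
	delete(m, pidTgid)
	return e.errno, true
}

// cleanup removes every entry queued for overrideID and returns the count.
// It models what an unload path could do before removing the program.
func cleanup(m taskMap, overrideID uint32) int {
	n := 0
	for k, e := range m {
		if e.overrideID == overrideID {
			delete(m, k)
			n++
		}
	}
	return n
}

func main() {
	m := taskMap{100: {overrideID: 1, errno: -1}}

	// Program 2 ignores the entry that was queued for program 1.
	_, hit := tryOverride(m, 100, 2)
	fmt.Println("program 2 injected an error:", hit)

	// Unloading program 1 removes its stale records first.
	removed := cleanup(m, 1)
	fmt.Println("records removed:", removed, "left:", len(m))
}
```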

@holyspectral
Contributor Author

holyspectral commented Nov 19, 2025

> perhaps we could have policy id as part of the override_task value and have sensor unload to cleanup its records before it unloads the override program.. something like you suggest in 2) but not sure what's benefit of inner map

Thanks. I think that makes sense. Let me see if I can add an extra field to the key or value of override_tasks map.

> I'll check on your change in more detail but on first glance please try to split the change into more logical changes/commits, it's easier to review, thanks

I've split them into multiple commits. Please feel free to let me know if anything is not clear!

Contributor

@olsajiri olsajiri left a comment

thanks, left few comments, please split the change into more commits in next version

if !ok {
return fmt.Errorf("progName %s not in collection spec programs: %+v", progName, coll.Programs)
}
progSpec.AttachTo = load.Attach
Contributor

is this needed for kprobe?

Contributor

same question with new version..

}
if unloadFunc != nil {
load.unloaderOverride = &unloader.CustomUnloader{
UnloadFunc: unloadFunc,
Contributor

could we just set load.unloaderOverride in getOverrideProg ? then we could leave LoadFmodRetProgram as is

@holyspectral holyspectral marked this pull request as ready for review December 11, 2025 16:43
@holyspectral holyspectral requested a review from a team as a code owner December 11, 2025 16:43
@holyspectral holyspectral requested a review from tixxdz December 11, 2025 16:43
@holyspectral
Contributor Author

Hi @olsajiri, thanks for your patience. I've pushed v3 to this PR and marked it as ready for review. The changes since last time:

  • Refactored and moved most logic to pkg/sensors/program/
  • Introduced an override ID in both userspace and ebpf to make sure that an override action is only handled by correct override programs.
  • Fixed an issue in override_tasks map's lifecycle.

I'd appreciate any feedback on this.

@holyspectral holyspectral changed the title from "wip: feat(sensors): support 38+ override on lsm funcs" to "feat(sensors): support 38+ override on lsm funcs" Dec 11, 2025
@olsajiri
Contributor

olsajiri commented Dec 15, 2025

@holyspectral I'm checking on that, would you mind rebasing it? it'd be easier for me to run it, thanks

Move bpf functions regarding overrides into bpf_generic_override.o

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

Provide a CustomUnloader, so custom actions can be performed
when an eBPF program is unloaded and unloaderOverride is called.

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

1. Allow policies to share override programs as long as they attach to
the same function.  This includes both kprobe-based and fmod_ret-based
programs.

2. The pinned paths of eBPF programs and maps are moved to the
locations below:

/bpffs/tetragon/__override__
/bpffs/tetragon/__override__/kprobe
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_symlinkat/prog_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/link_override
/bpffs/tetragon/__override__/kprobe/__x64_sys_execve/prog_override
/bpffs/tetragon/__override__/override_tasks
/bpffs/tetragon/__override__/fmod_ret
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec
/bpffs/tetragon/__override__/fmod_ret/security_bprm_creds_for_exec/prog

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

For multi kprobe, the id is sent as part of the attach cookie, so each kprobe
program can have its own id.
For single kprobe, the id is retrieved from EventConfig.

Signed-off-by: Sam Wang (holyspectral) <[email protected]>
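
The cookie encoding can be sketched as below, assuming the layout given in the code comment in this PR (bits 0-3 index, bits 4-11 override id, remaining bits reserved). The helper names `packCookie`, `cookieIndex` and `cookieOverrideID` are hypothetical and only illustrate the bit arithmetic.

```go
package main

import "fmt"

const (
	indexBits = 4
	indexMask = 1<<indexBits - 1 // 0x0f, bits 0-3
	idMask    = 0xff             // 8 bits, stored in bits 4-11
)

// packCookie combines the program index and the override id into one cookie.
func packCookie(index, overrideID uint64) uint64 {
	return index&indexMask | (overrideID&idMask)<<indexBits
}

// cookieIndex extracts the index, mirroring the 0x0f mask on the BPF side.
func cookieIndex(cookie uint64) uint32 {
	return uint32(cookie & indexMask)
}

// cookieOverrideID extracts the override id from bits 4-11.
func cookieOverrideID(cookie uint64) uint32 {
	return uint32(cookie >> indexBits & idMask)
}

func main() {
	c := packCookie(3, 42)
	fmt.Printf("cookie=%#x index=%d override_id=%d\n", c, cookieIndex(c), cookieOverrideID(c))
}
```

Note that a 4-bit index field caps the index at 15 under this layout; a larger index would be silently truncated by the mask.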

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

In the previous version, override_tasks was a shared map created via
MapBuilder.  This caused a problem during cleanup because these maps
can't track reference counts across policies.

In this commit, the override_tasks map is changed to a global map.

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

Signed-off-by: Sam Wang (holyspectral) <[email protected]>

@holyspectral
Contributor Author

holyspectral commented Dec 16, 2025

Looks like there are issues from vmtests on older kernels. I will take a look.

Contributor

@olsajiri olsajiri left a comment

the first few changes look good but then I fail to see the reason for some of the following changes, I left some comments, please check

I think the last 3 commits are fixes of your previous commits and should be squashed where they belong

if !ok {
return fmt.Errorf("progName %s not in collection spec programs: %+v", progName, coll.Programs)
}
progSpec.AttachTo = load.Attach
Contributor

same question with new version..

}

-func LoadFmodRetProgram(bpfDir string, load *Program, maps []*Map, progName string, verbose int) error {
+func LoadKProbeOverrideProgram(bpfDir string, load *Program, maps []*Map, progName string, verbose int, unloadFunc func(bool) error) error {
Contributor

why is this added when it's completely removed in one of the following commit?
also the commit log mentions only unloader, not this function

progs, maps = createKProbeOverrideProgramFromEntry(load, gk.funcName, progs, maps)
}
} else {
overrideTasksMap := program.MapBuilderProgram("override_tasks", load)
Contributor

seems wrong.. if we have load.override we want to pin the override_tasks map, right? you're doing it for the !load.override case

}
gk.data = &genericKprobeData{}

progs, maps = createKProbeOverrideProgramFromEntry(load, gk.funcName, progs, maps)
Contributor

should we have the load.OverrideFmodRet check in here and call createFmodRetOverrideProgramFromEntry?

return loadGenericFmodRetProgram(args.BPFDir, args.Load, args.Maps, args.Verbose)
}

func (k *kprobeOverrideProgram) LoadProbe(args sensors.LoadProbeArgs) error {
Contributor

so how do we know this gets called before the main program load to ensure it's called in the right order?


if overrideProgMap == nil {
-overrideProgMap = make(map[string]*Program)
+overrideProgMap = make(map[string]*genericOverride)
Contributor

this needs more explanation of why it is needed.. please use the commit log message to describe what you are doing and why.. for now and for the future ;-)

FUNC_INLINE __u32 get_index(void *ctx)
{
-return (__u32)get_attach_cookie(ctx);
+return (__u32)(get_attach_cookie(ctx) & 0x0f);
Contributor

that's real scary ;-)

again, please explain in the commit log message what you are doing and why.. I fail to see why this is needed

: "+r"(x)); \
})

FUNC_INLINE int try_override(void *ctx, struct bpf_map_def *override_tasks)
Contributor

please make the change first before moving the function, it's hard to review what changed

load.MapLoad = append(load.MapLoad, config)

// 0-3: index, 4-11: override_id, others: reserved
cookies := index&0x0f | (gk.loadArgs.overrideID << 4)
Contributor

what's the use case for this? please explain

return nil
}

func cleanupPendingDeletionOverrideMap(id int) error {
Contributor

this should be part of some of the previous commits right?

@mtardy
Member

mtardy commented Dec 22, 2025

Could you reorganize your PR commits, squash the things that need to be squashed (lints, back-and-forth changes in the code), and separate the others correctly? It's a bit hard to review as of now; I see Jiri proposed some ideas. Also, some commits seem mistitled ("kprobe: support 38+ override on lsm funcs"). The patch set should show how we would merge it into the main branch. Thanks a lot!
