uprobes/x86: Fix red zone issue for optimized uprobes #7860

Open
kernel-patches-daemon-bpf-rc[bot] wants to merge 14 commits into bpf-next_base from series/1101263=>bpf-next

@kernel-patches-daemon-bpf-rc

Pull request for series with
subject: uprobes/x86: Fix red zone issue for optimized uprobes
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1101263

@kernel-patches-daemon-bpf-rc (Author) posted 20 further notices, identical except for the upstream branch:

series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1101263
version: 1
Upstream branch, in posting order: 8496d90 (3 times), e42e53a (3 times), be4c6c7, b23705e, a4a5d4e, 7f9ce28, 9b435d2, 5b03831, c49f336, 1444ee8, 63a6f3b, 50dff00, b9452b5, dd0f968, f1a660b, 68f4e48

Kernel Patches Daemon and others added 11 commits June 9, 2026 12:49
In the unregister path we use the __in_uprobe_trampoline check with
current->mm for the VMA lookup, which is wrong, because we are
in the tracer context, not the traced process.

Add an mm_struct pointer argument to __in_uprobe_trampoline and
change the related callers to pass the proper mm_struct pointer.

Fixes: ba2bfc9 ("uprobes/x86: Add support to optimize uprobes")
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Remove the struct uprobe_trampoline object and its tracking code,
because it's not needed. We can do the same thing directly on top of
struct vm_area_struct objects.

This makes the code simpler and allows easy propagation of the
trampoline vma object into the child process in the following change.

Note the original code called destroy_uprobe_trampoline if the
optimization failed, but it only freed the struct uprobe_trampoline
object, not the vma. The leak of the new vma is fixed in a following change.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
When we fork or clone without CLONE_VM, the new process won't
have the uprobe trampoline vma objects, but it will still
have the optimized code calling that trampoline, so it crashes.

Fix this by allowing uprobe trampoline vma objects to be copied
on fork to the new process.

Fixes: ba2bfc9 ("uprobes/x86: Add support to optimize uprobes")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
In case the optimization fails, we leak the newly created trampoline
vma mapping (in case we just created it), so let's unmap it.

Fixes: ba2bfc9 ("uprobes/x86: Add support to optimize uprobes")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Andrii reported an issue with optimized uprobes [1]: the call
instruction stores its return address on the stack and can clobber the
red zone, where user code may keep temporary data without adjusting rsp.

Fix this by moving the optimized uprobes on top of a 10-byte nop
instruction, so we can squeeze in another instruction to escape the
red zone before doing the call, like:

  lea -0x80(%rsp), %rsp
  call tramp

Note the lea instruction is used to adjust the rsp register without
changing the flags.
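
A minimal sketch of how the two instructions above fit into the 10-byte slot, assuming the standard x86-64 `call rel32` encoding (displacement relative to the end of the instruction). The function name `encode_optimized_uprobe` and the example addresses are hypothetical and do not come from the patch.

```python
import struct

# lea -0x80(%rsp), %rsp: REX.W, opcode 8d, ModRM 64, SIB 24, disp8 0x80 (-128)
LEA_RSP_MINUS_128 = bytes([0x48, 0x8D, 0x64, 0x24, 0x80])
CALL_REL32_OPCODE = 0xE8
SLOT_SIZE = 10  # size of the nop10 being replaced


def encode_optimized_uprobe(probe_addr: int, tramp_addr: int) -> bytes:
    """Return the 10 bytes that would replace a nop10 at probe_addr (illustrative)."""
    # The call is the last instruction in the slot, so its rel32 is
    # measured from the end of the slot.
    rel = tramp_addr - (probe_addr + SLOT_SIZE)
    if not -(1 << 31) <= rel < (1 << 31):
        raise ValueError("trampoline is out of rel32 range")
    return LEA_RSP_MINUS_128 + bytes([CALL_REL32_OPCODE]) + struct.pack("<i", rel)


if __name__ == "__main__":
    print(encode_optimized_uprobe(0x1000, 0x2000).hex(" "))
```

The lea takes 5 bytes and the call takes 5 bytes, which is why a 5-byte nop is no longer enough.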

We use nop10 and the following transformations to get to the optimized
instructions above and back, as suggested by Peterz [2].

Optimize path (int3_update_optimize):

  1) Initial state after set_swbp() installed the uprobe:
      cc 2e 0f 1f 84 00 00 00 00 00

     From offset 0 this is INT3 followed by the tail of the original
     10-byte NOP.

     After a previous unoptimization bytes 5..9 may still contain the
     old call instruction, which remains valid for threads already there.

  2) Rewrite the LEA tail and call displacement:
      cc [8d 64 24 80 e8 d0 d1 d2 d3]

     From offset 0 this traps on the uprobe INT3.  Bytes 1..9 are not
     executable entry points while byte 0 is trapped.

  3) Publish the first LEA byte:
      [48] 8d 64 24 80 e8 d0 d1 d2 d3

     From offset 0 this is:
        lea -0x80(%rsp), %rsp
        call <uprobe-trampoline>
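
The optimize steps above can be sketched as a sequence of byte states. This models only the contents of the 10-byte slot; any synchronization between the writes is outside the model, and `optimize_steps` is a hypothetical helper, not a kernel function.

```python
INT3 = 0xCC
NOP10 = bytes.fromhex("662e0f1f840000000000")
LEA_AND_CALL_OPCODE = bytes.fromhex("488d642480e8")  # lea -0x80(%rsp),%rsp + call opcode


def optimize_steps(call_disp: bytes) -> list:
    """Return the three slot states of the optimize path (illustrative)."""
    if len(call_disp) != 4:
        raise ValueError("call displacement must be 4 bytes")
    target = LEA_AND_CALL_OPCODE + call_disp
    # 1) set_swbp() has replaced byte 0 of the nop10 with INT3.
    state1 = bytes([INT3]) + NOP10[1:]
    # 2) Bytes 1..9 are rewritten while byte 0 still traps.
    state2 = bytes([INT3]) + target[1:]
    # 3) Byte 0 is published last, making lea + call reachable.
    state3 = target
    return [state1, state2, state3]


if __name__ == "__main__":
    for state in optimize_steps(bytes.fromhex("d0d1d2d3")):
        print(state.hex(" "))
```

In this model, every state before the last one starts with INT3, so an entry at offset 0 either traps or sees the complete lea and call pair.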

Unoptimize path (int3_update_unoptimize):

  1) Initial optimized state:
      48 8d 64 24 80 e8 d0 d1 d2 d3
     Same as 3) above.

  2) Trap new entries before restoring the NOP bytes:
      [cc] 8d 64 24 80 e8 d0 d1 d2 d3

     From offset 0 this traps. A thread that had already executed the
     LEA can still reach the intact CALL at offset 5.

  3) Restore bytes 1..4 of the original NOP while keeping byte 0 trapped
     and byte 5 as CALL.
      cc [2e 0f 1f 84] e8 d0 d1 d2 d3

     From offset 0 this still traps. Offset 5 is still the CALL for any
     thread that was already past the first LEA byte.

  4) Publish the first byte of the original NOP:
      [66] 2e 0f 1f 84 e8 d0 d1 d2 d3

     From offset 0 this is the restored 10-byte NOP; the CALL opcode and
     displacement are now only NOP operands.  Offset 5 still decodes as
     CALL for a thread that was already there.

     There is only a single target uprobe-trampoline for the given nop10
     instruction address, so the CALL instruction will not be changed across
     unoptimization/optimization cycles.
     Therefore, any task that is preempted at the CALL instruction is guaranteed
     to observe that CALL and not anything else.
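
The unoptimize steps can be modelled the same way. The sketch below, with the hypothetical helper `unoptimize_steps`, tracks only the slot bytes and lets us check that bytes 5..9 (the CALL) never change, which is the property the argument above relies on.

```python
INT3 = 0xCC
NOP10 = bytes.fromhex("662e0f1f840000000000")


def unoptimize_steps(optimized: bytes) -> list:
    """Return the four slot states of the unoptimize path (illustrative)."""
    if len(optimized) != 10:
        raise ValueError("expected a 10-byte optimized slot")
    call = optimized[5:]  # CALL opcode and displacement, kept intact throughout
    # 1) Optimized state: lea + call.
    state1 = optimized
    # 2) Trap new entries at offset 0.
    state2 = bytes([INT3]) + optimized[1:]
    # 3) Restore bytes 1..4 of the original nop10, byte 0 still traps.
    state3 = bytes([INT3]) + NOP10[1:5] + call
    # 4) Publish byte 0 of the nop10; the CALL bytes become NOP operands.
    state4 = NOP10[:5] + call
    return [state1, state2, state3, state4]


if __name__ == "__main__":
    for state in unoptimize_steps(bytes.fromhex("488d642480e8d0d1d2d3")):
        print(state.hex(" "))
```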

Note that, as explained in [2], we need to use the following nop10:
       PF1   PF2   ESC   NOPL  MOD   SIB   DISP32
NOP10: 0x66, 0x2e, 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 -- cs nopw 0x00000000(%rax,%rax,1)

which means we need to allow the 0x2e prefix, which maps to the INAT_PFX_CS
attribute, in the is_prefix_bad function.

Also change the error returned by the uprobe syscall when it is called from
outside the uprobe trampoline to -EPROTO, so that we are able to detect the
fixed kernel.

The optimized uprobe performance stays the same:

        uprobe-nop     :    3.129 ± 0.013M/s
        uprobe-push    :    3.045 ± 0.006M/s
        uprobe-ret     :    1.095 ± 0.004M/s
  -->   uprobe-nop10   :    7.170 ± 0.020M/s
        uretprobe-nop  :    2.143 ± 0.021M/s
        uretprobe-push :    2.090 ± 0.000M/s
        uretprobe-ret  :    0.942 ± 0.000M/s
  -->   uretprobe-nop10:    3.381 ± 0.003M/s
        usdt-nop       :    3.245 ± 0.004M/s
  -->   usdt-nop10     :    7.256 ± 0.023M/s

[1] https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
[2] https://lore.kernel.org/bpf/20260518104306.GU3102624@noisy.programming.kicks-ass.net/#t
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Closes: https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
Fixes: ba2bfc9 ("uprobes/x86: Add support to optimize uprobes")
Assisted-by: Codex:GPT-5.5
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
We now expect a nop combo with a 10-byte nop instead of a 5-byte nop;
fix has_nop_combo to reflect that.
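
As a rough illustration of what such a check looks for, the sketch below matches a single-byte nop (0x90) followed by the nop10 encoding given earlier in this series. The name `looks_like_nop_combo` is hypothetical and this is not libbpf's actual has_nop_combo code.

```python
NOP1 = bytes([0x90])
NOP10 = bytes.fromhex("662e0f1f840000000000")  # cs nopw 0x0(%rax,%rax,1)
NOP_COMBO = NOP1 + NOP10


def looks_like_nop_combo(code: bytes) -> bool:
    """Return True if code starts with nop followed by nop10 (illustrative)."""
    return code[:len(NOP_COMBO)] == NOP_COMBO


if __name__ == "__main__":
    print(looks_like_nop_combo(NOP_COMBO + b"\xc3"))
```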

Fixes: 41a5c7d ("libbpf: Add support to detect nop,nop5 instructions combo for usdt probe")
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
In the previous optimized uprobe fix we changed the syscall
error used for its detection from ENXIO to EPROTO.

Change the related probe_uprobe_syscall detection check accordingly.
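
A minimal sketch of how the errno from such a probe could be interpreted, assuming the uprobe syscall is invoked from outside a trampoline. The function name and the returned labels are hypothetical; only the ENXIO to EPROTO change comes from this series, and ENOSYS is the usual result for a syscall the kernel does not implement.

```python
import errno


def classify_uprobe_syscall_errno(err: int) -> str:
    """Map the probe's errno to a kernel state (illustrative labels)."""
    if err == errno.EPROTO:
        return "fixed"        # kernel with the red zone fix
    if err == errno.ENXIO:
        return "unfixed"      # uprobe syscall present, but without the fix
    if err == errno.ENOSYS:
        return "unsupported"  # no uprobe syscall at all
    return "unknown"


if __name__ == "__main__":
    print(classify_uprobe_syscall_errno(errno.EPROTO))
```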

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Fixes: 05738da ("libbpf: Add uprobe syscall feature detection")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Sync the latest usdt.h change [1].

Now that we have nop10 optimization support in the kernel, let's emit
nop,nop10 for usdt probes. We leave it up to the library to use
the desired nop instruction.

[1] TBD
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Optimized uprobes are now on top of 10-byte nop instructions;
reflect that in the existing tests.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Change the uprobe/usdt trigger bench code to use nop10 instead
of nop5. Also change run_bench_uprobes.sh to use nop10 triggers.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
@kernel-patches-daemon-bpf-rc

Upstream branch: c15261b
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1101263
version: 1

olsajiri and others added 3 commits June 9, 2026 12:52
Add reattach tests to the uprobe syscall tests to make sure
we can re-attach and optimize the same uprobe multiple times.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The uprobe nop5 optimization used to replace a 5-byte NOP with a 5-byte
CALL to a trampoline. The CALL pushes a return address onto the stack at
[rsp-8], clobbering whatever was stored there.

On x86-64, the red zone is the 128 bytes below rsp that user code may use
for temporary storage without adjusting rsp. Compilers can place USDT
argument operands there, generating specs like "8@-8(%rbp)" when rbp ==
rsp. With the CALL-based optimization, the return address overwrites that
argument before the BPF-side USDT argument fetch runs.

Add two tests for this case. The uprobe_syscall subtest stores known values
at -8(%rsp), -16(%rsp), and -24(%rsp), executes an optimized nop10 uprobe,
and verifies the red-zone data is still intact. The USDT subtest triggers a
probe in a function where the compiler places three USDT operands in the
red zone and verifies that all 10 optimized invocations deliver the expected
argument values to BPF.

On an unfixed kernel, the first hit goes through the INT3 path and later
hits use the optimized CALL path, so the red-zone checks fail after
optimization.
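
The clobbering described above can be illustrated with a toy model of the stack, sketched here under simplifying assumptions: 8-byte slots, a fake return address, and no modelling of what the trampoline does afterwards. All names are hypothetical.

```python
RED_ZONE_SIZE = 128            # bytes below rsp that x86-64 user code may use
FAKE_RETURN_ADDR = 0xDEADBEEF  # stand-in for the return address pushed by CALL


def red_zone_after_probe(rsp: int, values: list, escape_red_zone: bool) -> list:
    """Return the red-zone slots as seen after the probe's CALL (illustrative)."""
    # The caller keeps temporaries at -8(%rsp), -16(%rsp), -24(%rsp), ...
    stack = {rsp - 8 * (i + 1): v for i, v in enumerate(values)}
    sp = rsp
    if escape_red_zone:
        sp -= RED_ZONE_SIZE    # lea -0x80(%rsp), %rsp
    sp -= 8                    # CALL pushes the return address
    stack[sp] = FAKE_RETURN_ADDR
    return [stack[rsp - 8 * (i + 1)] for i in range(len(values))]


if __name__ == "__main__":
    print(red_zone_after_probe(0x7000, [0x11, 0x22, 0x33], escape_red_zone=False))
    print(red_zone_after_probe(0x7000, [0x11, 0x22, 0x33], escape_red_zone=True))
```

Without the lea, the slot at -8(%rsp) is overwritten by the return address; with it, the push lands at rsp-0x88, below the red zone.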

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
[ updates to use nop10 ]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Add tests for forked/cloned optimized uprobes to make
sure the child can properly execute the optimized probe for
both fork (dups mm) and clone with CLONE_VM.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>