selftests/bpf: Add cgroup kptr NMI deadlock reproducer #8037

Open
kernel-patches-daemon-bpf-rc[bot] wants to merge 1 commit into bpf-next_base from series/1109417=>bpf-next

@kernel-patches-daemon-bpf-rc


Pull request for series with
subject: selftests/bpf: Add cgroup kptr NMI deadlock reproducer
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109417

@kernel-patches-daemon-bpf-rc

Upstream branch: 2e8ad1f
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109417
version: 1

@kernel-patches-daemon-bpf-rc

Upstream branch: 30dee2c
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109417
version: 1

Exercise the path where a cgroup kptr stashed in a BPF map has its
destructor invoked from NMI context when the map element is freed.
The call chain bpf_cgroup_release_dtor -> cgroup_put can reach paths
that sleep or spin, which are unsafe to enter from NMI context, so the
destructor must be deferred rather than run inline.
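
As a rough illustration of "deferred instead of run inline", here is a
userspace toy model in C. It is not the kernel's mechanism and does not
reflect how the fix is implemented; struct obj, struct defer_queue,
release_obj() and drain_deferred() are all hypothetical names invented for
this sketch. The only point is the shape of the decision: in a restricted
context the release is queued, and the actual put happens later from a
context where it is safe.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 16

/* Hypothetical refcounted object standing in for a cgroup reference. */
struct obj {
	int refs;
};

/* Hypothetical fixed-size queue of releases postponed from "NMI". */
struct defer_queue {
	struct obj *pending[MAX_DEFERRED];
	size_t n;
};

/* Stand-in for the real put; here it only drops a counter. */
static void obj_put(struct obj *o)
{
	o->refs--;
}

/*
 * Release o. Outside the restricted context the put runs inline; inside
 * it, the object is only queued. Returns false if the queue is full.
 */
static bool release_obj(struct defer_queue *q, struct obj *o, bool in_nmi)
{
	if (!in_nmi) {
		obj_put(o);
		return true;
	}
	if (q->n == MAX_DEFERRED)
		return false;
	q->pending[q->n++] = o;
	return true;
}

/* Run all queued puts from a safe context; returns how many ran. */
static size_t drain_deferred(struct defer_queue *q)
{
	size_t done = q->n;

	for (size_t i = 0; i < q->n; i++)
		obj_put(q->pending[i]);
	q->n = 0;
	return done;
}
```

In this model a release requested with in_nmi set leaves the refcount
untouched until drain_deferred() runs, which is the property the selftest
checks for indirectly by looking for the destructor in splat stacks.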

The test stashes a cgroup kptr from a syscall program into a HASH map
with BPF_F_NO_PREALLOC, then drives map element deletion from a
tp_btf/nmi_handler program firing on PMU cycle counter NMIs raised on a
pinned CPU. Each round:

  1. Creates a cgroup and stashes its kptr in the map.
  2. Removes the cgroup and waits for css_free_rwork_fn to fire for
     every subsystem CSS (tracked via an fexit program), so the kptr
     drop hits the window where the bug reproduces.
  3. Arms the NMI program (gated by an "nr_cgrps" counter) and waits
     for it to delete the stashed element.
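
The gating in step 3 can be pictured with a small userspace model using C11
atomics. This is only a sketch under assumptions: the counter name nr_cgrps
comes from the description above, but its exact semantics in the selftest
are not spelled out here, so the model assumes it counts stashed elements
still waiting to be deleted. arm(), fake_nmi_handler() and round_done() are
hypothetical names, and the real handler is a BPF program, not a C function.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Name taken from the commit message; semantics assumed for this model. */
static atomic_int nr_cgrps;
/* Hypothetical tally of deletions performed by the handler. */
static atomic_int deleted;

/* Userspace side: allow the handler to perform n deletions. */
static void arm(int n)
{
	atomic_store(&nr_cgrps, n);
}

/*
 * Handler side: do nothing while disarmed. When armed, claim one unit of
 * work by decrementing the counter, then perform the (modelled) delete.
 * Returns true if a deletion happened on this invocation.
 */
static bool fake_nmi_handler(void)
{
	int cur = atomic_load(&nr_cgrps);

	while (cur > 0) {
		/* On failure cur is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&nr_cgrps, &cur, cur - 1)) {
			atomic_fetch_add(&deleted, 1);
			return true;
		}
	}
	return false;
}

/* Userspace side: the round is over once nothing is left to delete. */
static bool round_done(void)
{
	return atomic_load(&nr_cgrps) == 0;
}
```

With this shape, NMIs that arrive before the cgroup has been removed and
its CSSes freed are harmless no-ops, and the deletion only happens once the
test has explicitly armed the handler.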

After REPRO_ROUNDS iterations the test scans /dev/kmsg captured from
the start of the run: bpf_cgroup_release_dtor appearing in any splat
stack means the destructor ran inline from NMI and the fix has
regressed. The task variant only proved no hard hang; scanning kmsg
catches the bug even when the inline path does not actually wedge the
CPU.
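
The kmsg check can be sketched as a plain string scan over the captured log
text. This is a simplified illustration, not the selftest's code:
count_dtor_lines() and line_has_sym() are hypothetical helpers, the log is
assumed to already sit in a NUL-terminated buffer, and unlike the real test
this version does not try to tell splat stack lines apart from other lines
that happen to mention the symbol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Symbol whose presence in a stack trace signals the inline NMI path. */
#define DTOR_SYM "bpf_cgroup_release_dtor"

/* Return true if sym occurs within the first len bytes of line. */
static bool line_has_sym(const char *line, size_t len, const char *sym)
{
	size_t n = strlen(sym);

	if (n == 0 || len < n)
		return false;
	for (size_t i = 0; i + n <= len; i++)
		if (memcmp(line + i, sym, n) == 0)
			return true;
	return false;
}

/*
 * Count the lines in log that mention the destructor symbol. A NULL or
 * empty log yields zero. Any non-zero result would be treated as a
 * failure by a caller following the rule described above.
 */
static size_t count_dtor_lines(const char *log)
{
	size_t hits = 0;

	while (log && *log) {
		const char *eol = strchr(log, '\n');
		size_t len = eol ? (size_t)(eol - log) : strlen(log);

		if (line_has_sym(log, len, DTOR_SYM))
			hits++;
		/* Step past the newline, or stop at the end of the buffer. */
		log = eol ? eol + 1 : log + len;
	}
	return hits;
}
```

Counting per line rather than returning a single boolean makes it easy to
report how many stack frames matched when the check fails.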

The test fails and triggers a kernel splat on kernels that lack commit
a3a81d2 ("bpf: Cancel special fields on map value recycle") in
bpf-next/master.

Runs on x86 only: relies on PMU cycle counter NMIs and the
x86-specific nmi_handler tracepoint.

Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Justin Suess <utilityemal77@gmail.com>