Skip to content

bpf, sockmap: take the socket lock in udp_bpf_recvmsg()#8038

Open
kernel-patches-daemon-bpf-rc[bot] wants to merge 1 commit into
bpf_basefrom
series/1109482=>bpf
Open

bpf, sockmap: take the socket lock in udp_bpf_recvmsg()#8038
kernel-patches-daemon-bpf-rc[bot] wants to merge 1 commit into
bpf_basefrom
series/1109482=>bpf

Conversation

@kernel-patches-daemon-bpf-rc

Copy link
Copy Markdown

Pull request for series with
subject: bpf, sockmap: take the socket lock in udp_bpf_recvmsg()
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109482

udp_bpf_recvmsg() drains the psock ingress_msg queue through
sk_msg_recvmsg() without holding the socket lock. __sk_msg_recvmsg()
walks the head sk_msg's scatterlist and updates each entry in place
(sge->length -= copy, sge->offset += copy), drops the page once an entry
is consumed, and frees the sk_msg when it is drained. ingress_lock only
serializes the queue list operations, not this per-entry consume;
copy_page_to_iter() can fault, so the loop cannot hold a spinlock.

Two threads calling recvmsg() on the same UDP socket then consume the
same sk_msg at once:

  CPU0                                CPU1

  sk_msg_recvmsg                      sk_msg_recvmsg
    copy = sge->length                 copy = sge->length
    copy_page_to_iter(copy)            copy_page_to_iter(copy)
    sge->length -= copy
      /* sge->length == 0 */
                                       sge->length -= copy
                                         /* underflows to ~0U */

The next pass loads the underflowed length into a signed int, the value
is negative so the copied + copy > len clamp is skipped, and
copy_page_to_iter() is handed ~2^64 as its size_t bytes, which trips
page_copy_sane(). The same race can put_page() one page twice and free
the sk_msg while the other reader still walks it.

  WARNING: CPU: 0 PID: 5484 at lib/iov_iter.c:356 copy_page_to_iter
  WARNING: CPU: 1 PID: 5485 at lib/iov_iter.c:356 copy_page_to_iter
  CPU: 1 UID: 0 PID: 5485 Comm: syz.6.1204 Not tainted 7.1.0-rc7 #1
  RIP: 0010:copy_page_to_iter+0xa9/0x270
  Call Trace:
   <TASK>
   __sk_msg_recvmsg+0x384/0x9a0
   udp_bpf_recvmsg+0x1a0/0x6f0
   sock_recvmsg+0x124/0x150
   __sys_recvfrom+0x1c2/0x2b0
   __x64_sys_recvfrom+0x7b/0x90
   do_syscall_64+0x115/0x6d0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
   </TASK>

tcp_bpf_recvmsg_parser() already serializes this with lock_sock() around
__sk_msg_recvmsg(); udp_bpf_recvmsg() does not. Take the socket lock
around the sk_msg_recvmsg() consume so a given ingress sk_msg is drained
by one reader at a time. The wait path is left unlocked, as
udp_msg_wait_data() sleeps without dropping the socket lock.

Fixes: 1f5be6b ("udp: Implement udp_bpf_recvmsg() for sockmap")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
@kernel-patches-daemon-bpf-rc

Copy link
Copy Markdown
Author

Upstream branch: e7ae89a
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1109482
version: 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant