pkg/aflow/flow: LLM confused by NULL deref under KASAN #6647

@dvyukov

The LLM does not understand how a NULL dereference is reported under KASAN.
We can either teach it about this with instructions, or clean the report and remove the registers, objdump output, etc.
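
A rough sketch of the second option (cleaning the report), for discussion only. The function name `clean_report` and the prefix list are hypothetical and were derived from the x86-64 report in this issue; this is not existing pkg/aflow code, and other architectures would need other patterns.

```python
import re

# Line prefixes that carry raw machine state rather than source-level context.
# Assumption: this list is taken from the x86-64 report in this issue.
# RIP lines are kept on purpose because they name the function and source line.
_NOISE = re.compile(r"^\s*(Code|RSP|RAX|RDX|RBP|R10|R13|FS|CS|CR2|DR\d|PKRU):")

# Header that starts the objdump-style section at the end of the report.
_DISASM_HEADER = "Code disassembly (best guess)"


def clean_report(report: str) -> str:
    """Return the report without register dumps, opcode bytes and disassembly."""
    kept = []
    for line in report.splitlines():
        if line.startswith(_DISASM_HEADER):
            # Also drop the separator line of dashes printed before the header.
            if kept and set(kept[-1]) == {"-"}:
                kept.pop()
            break  # everything after the header is disassembly
        if _NOISE.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)
```

With a filter like this the model would still see the `KASAN: null-ptr-deref in range [...]` line, the RIP line and the call trace, but not the shadow-offset constant in R13/R14 that led it astray below.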

LLM output:

Okay, so I'm looking at this kernel crash, a `general protection fault` with a `KASAN: null-ptr-deref` in `rose_transmit_link` at `net/rose/rose_link.c:266`. The core problem seems to be the instruction `movzbl (%rax,%r14,1),%eax`, right at `rose_transmit_link+0x2a`.  Immediately, the registers draw my attention: `RAX` is `0x6`, and `R14` is that KASAN-quarantined address `0xdffffc0000000000`. That screams "invalid memory access" to me.

The effective address being accessed is clearly `R14 + RAX`, which is `0xdffffc0000000006`. This exactly matches the reported crash address.  However, this `R14` value is strange because it is not NULL, but a KASAN quarantined address.

The KASAN report adds a wrinkle: it flags `null-ptr-deref in range [0x0000000000000030-0x0000000000000037]`. This normally means an access happened at an offset from a null pointer within this range, but that's seemingly contradicted by our direct `R14` value.

So, let's step through the code leading up to the crash. The instruction `movzbl (%rax,%r14,1),%eax` is at `rose_transmit_link+0x32`. Before this we see:
`lea    0x36(%rbp),%r15`
`mov    %r15,%rax`
`shr    $0x3,%rax`

This means that `RAX` is effectively calculated based on `RBP`. With `RBP` is `0x0000000000000000`. Therefore: `R15 = 0x0 + 0x36 = 0x36`, and `RAX = 0x36 >> 3 = 0x6`. And there it is, `RAX` matches the register dump. So the effective address being accessed in the `movzbl` instruction is `R14 + 6 = 0xdffffc0000000006`, and that's the crash.  But now, why is `R14` the KASAN quarantined value when the KASAN message references a null pointer dereference?

The instruction can be read as `*(u8 *)(R14 + RAX)`. With `RAX` as `0x6`, the access is `*(u8 *)(R14 + 6)`.  The fact that `R14` is `0xdffffc0000000000` is the key. KASAN is telling us there's an invalid memory access. The `null-ptr-deref` message suggests that the base pointer was likely `NULL`. But if the base pointer was `NULL`, then `R14` shouldn't be a KASAN quarantined address, right?

I think the issue is the interaction between `R14` and `RAX`. There has been a dereference of what was supposed to be a valid pointer but is now a KASAN quarantine. What I need now is the source code for the `rose_transmit_link` function to see what `R14` *should* be and how it's being used to cause this access. It seems as though there is a memory corruption somewhere, where something intended to be a valid memory address, has become a KASAN quarantined area and triggered the crash.

Crash report:

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
CPU: 1 UID: 0 PID: 6005 Comm: syz-executor274 Not tainted syzkaller #1 PREEMPT(full) 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:rose_transmit_link+0x32/0x740 net/rose/rose_link.c:266
Code: 56 41 55 41 54 53 48 83 ec 18 48 89 f5 49 89 fc 49 be 00 00 00 00 00 fc ff df e8 a9 65 81 f7 4c 8d 7d 36 4c 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 0f 85 6e 05 00 00 41 0f b6 1f 31 ff 89 de e8
RSP: 0018:ffffc90002e27770 EFLAGS: 00010217
RAX: 0000000000000006 RBX: 0000000000000000 RCX: ffff888021efdb80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801f6b2780
RBP: 0000000000000000 R08: ffff888021efdb80 R09: 0000000000000008
R10: 000000000000000f R11: 0000000000000000 R12: ffff88801f6b2780
R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000036
FS:  0000000000000000(0000) GS:ffff8880ec4a9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f00a978f300 CR3: 000000000df3c000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
 <TASK>
 rose_write_internal+0x11df/0x1ac0 net/rose/rose_subr.c:198
 rose_release+0x25b/0x510 net/rose/af_rose.c:671
 __sock_release net/socket.c:662 [inline]
 sock_close+0xc3/0x240 net/socket.c:1455
 __fput+0x44f/0xa70 fs/file_table.c:468
 task_work_run+0x1d4/0x260 kernel/task_work.c:227
 exit_task_work include/linux/task_work.h:40 [inline]
 do_exit+0x6b5/0x2300 kernel/exit.c:966
 do_group_exit+0x21c/0x2d0 kernel/exit.c:1107
 get_signal+0x12aa/0x1350 kernel/signal.c:3034
 arch_do_signal_or_restart+0xa0/0x790 arch/x86/kernel/signal.c:337
 exit_to_user_mode_loop+0x72/0x130 kernel/entry/common.c:40
 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
 do_syscall_64+0x2b3/0xfa0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3e9819efc9
Code: Unable to access opcode bytes at 0x7f3e9819ef9f.
RSP: 002b:00007f3e9816c218 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: 0000000000000000 RBX: 00007f3e98249b88 RCX: 00007f3e9819efc9
RDX: 000000000000001c RSI: 0000200000000040 RDI: 0000000000000007
RBP: 00007f3e98249b80 R08: 32203a6b63617473 R09: 32203a6b63617473
R10: 32203a6b63617473 R11: 0000000000000246 R12: 00007f3e98216a84
R13: 0000200000000040 R14: 0000200000000380 R15: 0000200000000240
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:rose_transmit_link+0x32/0x740 net/rose/rose_link.c:266
Code: 56 41 55 41 54 53 48 83 ec 18 48 89 f5 49 89 fc 49 be 00 00 00 00 00 fc ff df e8 a9 65 81 f7 4c 8d 7d 36 4c 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 0f 85 6e 05 00 00 41 0f b6 1f 31 ff 89 de e8
RSP: 0018:ffffc90002e27770 EFLAGS: 00010217
RAX: 0000000000000006 RBX: 0000000000000000 RCX: ffff888021efdb80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88801f6b2780
RBP: 0000000000000000 R08: ffff888021efdb80 R09: 0000000000000008
R10: 000000000000000f R11: 0000000000000000 R12: ffff88801f6b2780
R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000036
FS:  0000000000000000(0000) GS:ffff8880ec4a9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f00a978f300 CR3: 000000000df3c000 CR4: 0000000000750ef0
PKRU: 55555554
----------------
Code disassembly (best guess):
   0:	56                   	push   %rsi
   1:	41 55                	push   %r13
   3:	41 54                	push   %r12
   5:	53                   	push   %rbx
   6:	48 83 ec 18          	sub    $0x18,%rsp
   a:	48 89 f5             	mov    %rsi,%rbp
   d:	49 89 fc             	mov    %rdi,%r12
  10:	49 be 00 00 00 00 00 	movabs $0xdffffc0000000000,%r14
  17:	fc ff df
  1a:	e8 a9 65 81 f7       	call   0xf78165c8
  1f:	4c 8d 7d 36          	lea    0x36(%rbp),%r15
  23:	4c 89 f8             	mov    %r15,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	42 0f b6 04 30       	movzbl (%rax,%r14,1),%eax <-- trapping instruction
  2f:	84 c0                	test   %al,%al
  31:	0f 85 6e 05 00 00    	jne    0x5a5
  37:	41 0f b6 1f          	movzbl (%r15),%ebx
  3b:	31 ff                	xor    %edi,%edi
  3d:	89 de                	mov    %ebx,%esi
  3f:	e8                   	.byte 0xe8
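
For the first option (teaching the LLM), the arithmetic it missed can be sketched as below. `0xdffffc0000000000` is the x86-64 KASAN shadow offset, not a quarantined pointer. The trapping `movzbl` is the compiler-inserted shadow-byte check for the real access at `0x36(%rbp)`, and RBP (the second argument, copied from RSI) is NULL. The helper names are made up for illustration; the constants assume generic KASAN on x86-64.

```python
# Generic KASAN on x86-64: one shadow byte describes 8 bytes of memory.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # the constant movabs loads into R14
KASAN_SHADOW_SCALE_SHIFT = 3
MASK64 = (1 << 64) - 1  # emulate 64-bit register wrap-around


def shadow_address(addr):
    """Address of the shadow byte KASAN reads before the code touches addr."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


def original_range(shadow):
    """Inverse mapping: the 8-byte range described by one shadow byte."""
    index = (shadow - KASAN_SHADOW_OFFSET) & MASK64
    start = (index << KASAN_SHADOW_SCALE_SHIFT) & MASK64
    return start, start + 7


rbp = 0x0                    # RBP in the register dump (NULL second argument)
r15 = rbp + 0x36             # lea 0x36(%rbp),%r15 -> address of the field read
fault = shadow_address(r15)  # shr $0x3,%rax ; movzbl (%rax,%r14,1),%eax

print(hex(fault))                               # 0xdffffc0000000006
print([hex(x) for x in original_range(fault)])  # ['0x30', '0x37']
```

The first value is the non-canonical address from the Oops line and the second is the `null-ptr-deref in range` from the KASAN line, so the report is self-consistent: a field at offset 0x36 of a NULL pointer was read, and the GPF fired on the shadow lookup before the real load at `movzbl (%r15),%ebx`.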

Labels: AI patching, enhancement