Introduce support for tight bounds on the kernel stack.#2573
Introduce support for tight bounds on the kernel stack.#2573
Conversation
|
Hm, I'm not immediately convinced the kernframe stuff is worth it |
I asked myself a similar question, hence the kernel option to enable/disable it. However, this is partially necessary for my local/global patches, where I'd like |
862de4d to
cb57508
Compare
This is used to enable tight bounds on data structures that share the kernel stack allocation, for example, struct pcb.
This enables tight bounds on the kernel stack sub-allocations. In particular, the pcb, kernframe and actual kernel stack are now fully separated. The td_kstack capability retains full bounds. This modifies the trap handlers to stash the kernframe structure in sscratchc, instead of the kernel stack pointer. The kernel stack and the pcpu capabilities are recovered from the kernframe structure, without assuming that out-of-bounds access is possible.
This simplifies the management of sscratch, leaving it unchanged. In user mode, sscratch still holds the full unbounded kstack (without the pcb) and the trap handler can use it to access kernframe and trapframe. Before entering the C exception handler, set narrower bounds on trapframe and the stack pointer installed in csp.
This is perhaps not the optimal place for these assertions, however these should hold every time we enter the kernel and check that the thread kstack context is in a consistent state.
This enables tight bounds on the kernel stack sub-allocations. In particular, the pcb, and actual kernel stack are now fully separated. The td_kstack capability retains full bounds. The exception handlers are modified to re-derive the trapframe and kernel stack capabilities from the root td_kstack capability.
cb57508 to
5cf5537
Compare
bsdjhb
left a comment
There was a problem hiding this comment.
Did you consider doing the upstream approach for struct pcb used on amd64 where it is now just part of struct mdthread as a md_pcb field? That could be upstreamed to FreeBSD as well which might reduce our diff and reduce the complexity of this change a bit by not having to worry about the pcb anymore?
sys/riscv/riscv/exception.S
Outdated
| * user stack pointer while we keep kernelframe in sscratchc. | ||
| */ | ||
| .if \mode == 0 | ||
| /* Stash user ctp in kframe stash and place kframe ptr in ctp */ |
There was a problem hiding this comment.
Why ctp rather than ct0 similar to the block of code above for hybrid kernels when deriving csp from ddc?
sys/riscv/riscv/exception.S
Outdated
| /* Stash user ctp in kframe stash and place kframe ptr in ctp */ | ||
| csc ctp, (KF_SCRATCH)(csp) | ||
| cmove ctp, csp | ||
| /* Fetch real kstack from kframe */ |
There was a problem hiding this comment.
Normal style(9) is a blank line before comments. We don't use them when there is already an effective break in the flow due to C preprocessor or assembly macro conditionals.
| KASSERT((uintptr_t)get_pcpu() >= VM_MIN_KERNEL_ADDRESS, | ||
| ("Invalid pcpu address from userland: %p (tpidr 0x%lx)", | ||
| get_pcpu(), READ_SPECIALREG(tpidr_el1))); | ||
| #ifdef CHERI_BOUNDED_KSTACK |
There was a problem hiding this comment.
Is this commit separate so that reviewers can evaluate both approaches? If so, if you end up choosing this version, please squash this down into the previous commit and revise the log message to reflect the end result. I'm not sure it's worth having the in-between stage in the history as-is.
| .macro load_registers mode | ||
| #ifdef CHERI_BOUNDED_KSTACK | ||
| .if \mode == 0 | ||
| /* |
There was a problem hiding this comment.
Did you consider just saving the full kstack cap you need in a new field in td_md that you can reload here so you don't have to do all this computation on each syscall exit, only when creating a new thread?
| #include <machine/trap.h> | ||
| #include <machine/riscvreg.h> | ||
|
|
||
| .macro save_registers mode |
There was a problem hiding this comment.
I would add a comment here above the start of the macro to state that in the CHERI_KSTACK_BOUNDS case it intentionally returns a bounded pointer to the created trapframe in cs0 for use by callers instead of documenting that in the callers.
| #if __has_feature(capabilities) | ||
| p2->p_md.md_sigcode = td1->td_proc->p_md.md_sigcode; | ||
| #endif | ||
| #ifdef __CHERI_PURE_CAPABILITY__ |
There was a problem hiding this comment.
Should this be part of the earlier commit that cleared the permissions? (And can we add a similar assert on Morello?)
The implementation differs slightly depending on the architecture.
RISC-V uses kstack for the pcb, kernframe and trapframe structures.
These patches set tight bounds for the pcb, kernframe and remainder of the kernel stack.
The trapframe is left together with the kernel stack, given that it is part of the normal arithmetic on the kernel stack pointer in the trap handlers at the moment.
The
sscratchcregister is now used to hold a pointer tostruct kernframeinstead of the fullkstackcapability.The kernframe is expanded to hold a bounded pointer to the kernel stack region and a scratch pointer.
The scratch pointer is used in the trap handler to swap register contents with constrained use of CPU registers.
Edit:
sscratchcor the existing stashed stack pointer. The following applies to all architectures:struct pcbbounds. Representability is guaranteed.struct pcbhas separate bounds that never overlap with the stack pointer.td_kstackis considered the root capability for all kernel stack sub-allocations.Note that this is all gated by the CHERI_BOUNDED_KSTACK option, because I'd like to be able to measure the difference for dissertation purposes. Also, this is an intermediate patch for the use of local/global for kernel capability flow enforcement, which is currently WIP.