-
Notifications
You must be signed in to change notification settings - Fork 230
Description
As part of #562 I am working through the usage of unsafe code and I think I uncovered some UB in the pool module that affects Arc<T> and Box<T>. I would like it if someone could verify this as I am not entirely sure I understand the language documentation correctly.
There are two places which IMO produce UB, both relating to the usage of the UnionNode<T> type:
Obtaining an invalid reference to a union field in UnionNode<T>
First, this line right here in the implementation of the Node trait for UnionNode<T>:
impl<T> Node for UnionNode<T> {
type Data = T;
fn next(&self) -> &AtomicPtr<Self> {
unsafe { &self.next } // UB if `self.next` is not initialized, but we can't know from within this method
}
//...
}As per the Rustonomicon, a reference pointing to invalid data is instant UB (see here). To get a valid reference to the self.next field, it must be properly initialized. Looking at e.g. the arc module, the nodes come from the ArcBlock<T> type, which is initialized like this:
impl<T> ArcBlock<T> {
/// Creates a new memory block
pub const fn new() -> Self {
Self {
node: UnionNode {
data: ManuallyDrop::new(MaybeUninit::uninit()), // TODO I think we want to initialize `next` instead of `data`!
},
}
}
}Initializing data does not guarantee that next is also properly initialized, so I think the fix (as I put in the comments) would be to initialize self.next instead in ArcBlock::new(). As I see, we never need data to be initialized as its purpose is to be pushed into the stack of the ArcPoolImpl type, and once we pull it out of the stack we make sure data is initialized through a ptr::write.
The same should be done for BoxBlock::new() in the Box module. IMO it would also require to make the next() and next_mut() methods of the Node trait unsafe, as we otherwise cannot guarantee that obtaining a reference to the self.next field is sound.
Wrong assumptions on the memory layout of UnionNode<T>
The second problem is that there are several situations where a pointer to a UnionNode<T> is cast to T but we cannot be sure that it is valid to do so. An example is this line in ArcPool::alloc():
unsafe { node_ptr.as_ptr().cast::<ArcInner<T>>().write(inner) }The Rust reference on union field layout states that "Fields might have a non-zero offset (except when the C representation is used)". Currently UnionNode<T> is not repr(C), so assuming that a *mut UnionNode<T> is also a valid *mut T is wrong I think. The fix here might be as simple as adding #[repr(C)] to UnionNode<T>.
I would be happy to implement these changes, maybe as part of the PR for #562, if it turns out that I am right. If I'm wrong I am also thankful for anyone pointing out why :)