Remove a potential source of UB from potential_utf#7871
Remove a potential source of UB from potential_utf#7871robertbastian merged 1 commit intounicode-org:mainfrom
Conversation
robertbastian
left a comment
There was a problem hiding this comment.
thanks for finding
please add a safety comment for the usage of Box::from_raw. I think you want to say that PotentialUtf8 and [u8] have the same memory layout, and both Box types use the same allocator
Manishearth
left a comment
There was a problem hiding this comment.
Yeah, this is a good idea, even though I agree that this is practically not a problem. If we asked, Rust would probably document this for us.
Still, for unsafe code, it is good to be technically correct as much as possible, for future-proofness and also to improve the quality of the code in the eyes of reviewers.
r+ with Robert's comments addressed.
This commit removes a transmute from `PotentialUtf8::from_boxed_bytes` as this might rely on undefined behaviour. It is only sound to transmute between two types like that if their layout is guranteed to be the same. For this particular case it's guranteed that `[u8]` and `PotentialUtf8` have the same layout via the `#[repr(transparent)]` attribute on `PotentialUtf8`. This gurantee doesn't seem to extend on `Box` as: * `Box` itself is not marked as `#[repr(transparent)]` (or `#[repr(C)]`) * The [memory layout](https://doc.rust-lang.org/std/boxed/index.html#memory-layout) section of the standard library documentation for `Box` only gurantees a certain layout for sized types: > So long as T: Sized, a Box<T> is guaranteed to be represented as a single pointer and is also ABI-compatible with C pointers (i.e. the C type T*). * This specific case involves unsized types, so that gurantee doesn't help us here. Given that it seems safer to just perform a pointer cast there instead which only involves types with a known compatible layout. Practically speaking I wouldn't expect that to ever become a problem, but it's better to use a way that's guranteed to work.
da43689 to
c43905f
Compare
|
Hopefully these changes fix the safety comment. For reference: I've also asked about this on zulip: #t-opsem > Is transmuting between `Box<A>` and `Box<B>` UB? @ 💬. |
This commit removes a transmute from
PotentialUtf8::from_boxed_bytesas this might rely on undefined behaviour.It is only sound to transmute between two types like that if their layout is guranteed to be the same. For this particular case it's guranteed that
[u8]andPotentialUtf8have the same layout via the#[repr(transparent)]attribute onPotentialUtf8. This gurantee doesn't seem to extend onBoxas:Boxitself is not marked as#[repr(transparent)](or#[repr(C)])Boxonly gurantees a certain layout for sized types:Given that it seems safer to just perform a pointer cast there instead which only involves types with a known compatible layout.
Practically speaking I wouldn't expect that to ever become a problem, but it's better to use a way that's guranteed to work.
Changelog
potential_utf: Use
Box::from_raw()instead oftransmutefor converting unsized transparent boxes.