-
Notifications
You must be signed in to change notification settings - Fork 165
Fix assign unsoundness
#180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sdk/pinocchio/src/account_info.rs
Outdated
| /// It is undefined behaviour to use [`Self::assign`] while there is an active reference | ||
| /// to the `owner` returned by this method. | ||
| #[inline(always)] | ||
| pub unsafe fn owner(&self) -> &Pubkey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should revert the unsafe here. The issue is with assign(), so having that one as unsafe might be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest keeping it as unsafe to make it clear that there are unsound ways to use the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems correct to me.
I think if you change the return type to Pubkey, then both functions become safe.
According to the "Pointer to reference conversion", only the lifetimes of the references matter.
And if owner() does not return a reference, then the lifetimes of both the shared and the mutable references to the self.raw.owner bytes are contained within the owner() and assign() function calls respectively.
I think AccountInfo should already be !Send, due to the *mut T being !Send.
And so, it is impossible to construct two simultaneously alive references to the owner memory then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest keeping it as
unsafeto make it clear that there are unsound ways to use the function
One problem is that if you have too many unsafe blocks, you tend to pay less attention when checking each one of them.
Assuming it is much more common to call owner(), by removing unsafe we reduce the mental burden of using this API.
And in turn, if someone uses assign() then now will hopefully pay more attention, when verifying that the assign() call they are performing is correct.
After all, they only need verification of the assign() invocations.
If they check that it works correctly, then everything is fine.
At the same time, an addition of a new owner() call to an existing, undefined behavior free, code base may now cause undefined behavior.
So, maybe both indeed need to be unsefe, as long as owner() returns a reference and there is no runtime checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If assign didn't exist, owner would be safe. I'm in favor of marking owner as safe as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which would essentially give us the first version, just with a different order of register assignments before the call.
I didn't follow your point. If you return a copy, one would expect the compiler to actually copy the bytes, right?
Are you suggesting that we should not actually copy in that case?
I would expect it to avoid copying bytes if it is unnecessary.
There are many optimizations that ensure unnecessary copies are not performed.
I think they are quite essential in supporting the modern style of writing in C++ and in Rust.
I only expect the result of the assembly execution to match what the Rust code is suggesting should happen.
My understanding is that if the assembly matched the higher level language 1 to 1, there would be no reasons for the undefined behavior to exist.
I checked the code in the godbolt site. In the copy example, we indeed copy the return value to the stack, according to the call convention. There is actually a limit the compiler use to decide whether to promote a copy argument/return value to a reference argument/return value. Certainly that can be lowered.
We often deal with addresses that are 32 bytes in size.
If there is a parameter that would allow the compiler to avoid unnecessary copies of objects of this size, I think, it would make sense to look into adjusting it.
The x86 has enough registers to hold the both keys, so the copy in godbolt happens implicitly (memory -> registers -> then compare). This is why both codes are the same. We do not have as many registers, so stack spills are more common.
I do not think we are comparing values in registers.
Here is how I understand the second code block example that Febo shared:
LBB0_13:
# I do not fully understand this assignment. I'm not familiar with the SVM
# calling conventions. But, eventually, `*(r2 + 0) * 2 + 8` is the target
# Pubkey we are comparing the account owner Pubkey with. `r2` is only used
# to compute this value.
ldxdw r1, [r2 + 0]
# `r10` is probably stack?
# We access data with rather big offsets relative to `r10` in both code
# versions. And we subtract from it, which matches how stack stack pointer
# is normally used.
# `r6` is probably `self` we got from the caller.
ldxdw r6, [r10 - 544]
# Copy 32 bytes from `r6 + 40` to `r10 - 32`.
# This is going to be the second argument to our `memcmp` call.
ldxdw r3, [r6 + 40]
stxdw [r10 - 32], r3
ldxdw r3, [r6 + 48]
stxdw [r10 - 24], r3
ldxdw r3, [r6 + 56]
stxdw [r10 - 16], r3
ldxdw r3, [r6 + 64]
stxdw [r10 - 8], r3
# At this point `r10 - 32` and `r6 + 40` point to identical 32 bytes.
# Prepare a `memcmp` call.
# `r2` is `*(r2 + 0) * 2 + 8`
add64 r2, r1
add64 r2, 8
# `r1` is `r10 - 32`
mov64 r1, r10
add64 r1, -32
# `r3` is `32`
mov64 r3, 32
call memcmp
# Jump if `memcmp` returned zero in the lower 32 bits of `r0`.
lsh64 r0, 32
rsh64 r0, 32
jeq r0, 0, LBB0_15There is a redundant copy here.
We copy 32 bytes from r6 + 40 to r10 - 32, and then call memcmp(r10 - 32, r1 + 8, 32).
We can just call memcmp(r6 + 40, r1 + 8, 32) to get the same result.
This copy is redundant.
It does not seem to matter how many registers you have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dereferencing a pointer implies loading it to a register or copying the value if it does not fit there.
I understand what you are pointing in the SBF target. Since everything is inlined, the copy to the stack appears unnecessary.
I want to bring here that in the example in godbolt compiled for x86 or aarch64, you see the same code for both the copied version and the reference version. That happens because both targets have enough registers to hold two pub keys and then compare them with compare instructions. For them, you actually see the pub keys being loaded in registers and then compared. That won't change depending on the case, since the machine instructions are actually the same (if you pass by reference, you'll also need to load them to registers before the compare instructions).
In the SBF target, there is not enough registers to hold the pubkeys. Regardless of whether we emit compares or the syscall, the copy will be explicit. (There is threshold defined for when to invoke the syscall and when to compare with instructions).
If we want to extrapolate that and do a fair comparison, you can set the size of the pub key to 1024 bytes. The copied version (in x86 and aarch64) will have a call to memcpy (dereferencing the pointer) and another call to memcmp, even though you could just forward the pointer to memcmp. See https://godbolt.org/z/z3GaeddEs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the explanation.
I still do not fully understand the mechanics, but it is indeed happening as you are saying it does.
A diff is a bit easier for me to read.
Here is a diff of the two versions with the Pubkey size increased to 1024: https://rust.godbolt.org/z/33raodq7M
It is pretty strange that it first uses all the xmm registers to copy everything in memory, and then does an unnecessary memcpy call for the "by value" case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe what's causing the copy is not the "by value" or "by copy" case. Every argument or return passed by value whose size is larger than 128 bits is promoted to a reference.
The pointer dereference is triggering a copy. For example, if you do let a = *(ptr as *const u64), you are copying the value of the pointer to a variable. Likewise, (*self.raw).owner triggers a copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe what's causing the copy is not the "by value" or "by copy" case. Every argument or return passed by value whose size is larger than 128 bits is promoted to a reference.
The pointer dereference is triggering a copy. For example, if you do
let a = *(ptr as *const u64), you are copying the value of the pointer to a variable. Likewise,(*self.raw).ownertriggers a copy.
self.raw is a pointer to an Account, while (*self.raw).owner is a Pubkey.
We see an unnecessary copy of the Pubkey.
You showed me an example where an unnecessary copy of a Pubkey was elided in BPF.
Though, your example could have been a special case of the return value optimization (at least this is what it is called in C++).
Assuming the compiler can avoid copying an object of the size of Pubkey in a general case. I would appreciate it if you could help me understand what operation in this particular example might be preventing the optimizations to kick in.
joncinque
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
sdk/pinocchio/src/account_info.rs
Outdated
| /// It is undefined behaviour to use [`Self::assign`] while there is an active reference | ||
| /// to the `owner` returned by this method. | ||
| #[inline(always)] | ||
| pub unsafe fn owner(&self) -> &Pubkey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest keeping it as unsafe to make it clear that there are unsound ways to use the function
* Fix assign unsoundness * Remove unsafe
|
lgtm |
Problem
The current implementation of
AccountInfo::assignis unsound since it takes an immutable reference toownerand casts it to a mutable one:Solution
Use a mutable reference to
ownerdirectly. This PR also clarifies the safety assumptions forowner()andassign().cc: @d0nutptr