SIMD-0219: Stricter ABI and Runtime Constraints #219
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,141 @@ | ||
| --- | ||
| simd: '0219' | ||
| title: Stricter ABI and Runtime Constraints | ||
| authors: | ||
| - Sean Young | ||
| - Alexander Meißner | ||
| category: Standard | ||
| type: Core | ||
| status: Review | ||
| created: 2025-01-06 | ||
| feature: C37iaPi6VE4CZDueU1vL8y6pGp5i8amAbEsF31xzz723 | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| Removing pitfalls and foot-guns from the ABI (including syscalls) and runtime. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Several interactions between dApps and the virtual machine are currently | ||
| allowed even though they make no sense and can be dangerous for dApps: | ||
|
|
||
| - CPI verification | ||
|   - Allows accidentally using `AccountInfo` structures which the program | ||
|     runtime never serialized | ||
|   - `AccountInfo` structures can be overwritten by CPI during CPI, causing | ||
|     complex side effects | ||
| - Syscall alignment requirements | ||
|   - In ABI v0 the account input region has no alignment guarantees (it is | ||
|     1 byte aligned) and ABI v1 has 8 byte alignment. However, there are some | ||
|     syscalls such as the reading of sysvars which require 16 byte alignment. | ||
| - VM memory access | ||
|   - Bad read accesses to account payload go unnoticed as long as they stay | ||
|     within the reserved address space, even if they leave the actual account | ||
|     payload | ||
|   - Bad write accesses to account payload go unnoticed as long as the original | ||
|     value is restored | ||
| - Syscall slice parameters | ||
|   - Bad read and write accesses which span nonsensical ranges go unnoticed | ||
|
|
||
| Furthermore, at the moment all validator implementations have to copy | ||
| (and compare) data in and out of the virtual memory of the virtual machine. | ||
| There are four possible account data copy paths: | ||
|
|
||
| - Serialization: Copy from program runtime (host) to virtual machine (guest) | ||
| - CPI call edge: Copy from virtual machine (guest) to program runtime (host) | ||
| - CPI return edge: Copy from program runtime (host) to virtual machine (guest) | ||
| - Deserialization: Copy from virtual machine (guest) to program runtime (host) | ||
|
|
||
| By restricting the allowed behavior of dApps we enable the validator to map | ||
| account payload data directly, avoiding copies and compares. | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| None. | ||
|
|
||
| ## New Terminology | ||
|
|
||
| ### VM memory regions | ||
|
|
||
| The memory regions are (in ABI v0 and v1): | ||
|
|
||
| - Readonly data (`0x100000000..0x200000000`) | ||
| - Stack (`0x200000000..0x300000000`) | ||
| - Heap (`0x300000000..0x400000000`) | ||
| - Instruction metadata | ||
| - Account metadata | ||
| - Account payload address space | ||
| - Instruction payload and program key | ||
|
|
||
| The payload address space of an account is the range in the serialized input | ||
| region (`0x400000000..0x500000000`) which covers the payload and optionally the | ||
| 10 KiB resize padding (if not a loader-v1 program), but not the account's | ||
| metadata. | ||
|
|
||
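The region list above places each top-level region in its own 4 GiB window, so a virtual address can be classified by its upper 32 bits. The following is a minimal sketch of that idea, not the validator's actual code: the `Region` enum and the `classify` function are hypothetical names, and `Input` stands for the serialized input region (`0x400000000..0x500000000`) mentioned in the paragraph above.

```rust
/// Top-level VM memory regions in ABI v0 and v1 (hypothetical enum for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Region {
    ReadonlyData,
    Stack,
    Heap,
    Input,
}

/// Classifies a virtual address by its 4 GiB window index (`vaddr >> 32`).
/// Returns `None` for addresses outside the four listed windows.
pub fn classify(vaddr: u64) -> Option<Region> {
    match vaddr >> 32 {
        1 => Some(Region::ReadonlyData), // 0x100000000..0x200000000
        2 => Some(Region::Stack),        // 0x200000000..0x300000000
        3 => Some(Region::Heap),         // 0x300000000..0x400000000
        4 => Some(Region::Input),        // 0x400000000..0x500000000
        _ => None,
    }
}
```

The payload address space of each account is a sub-range inside the `Input` window; its exact offsets depend on the serialization layout and are not modelled here.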
| ## Detailed Design | ||
|
|
||
| ### Syscall parameters | ||
|
|
||
| - The following pointers must be on the stack or heap, | ||
|   meaning their virtual address is inside `0x200000000..0x400000000`, | ||
|   otherwise `SyscallError::InvalidPointer` must be thrown: | ||
|   - The destination address of all sysvar related syscalls | ||
|   - The pointer in the array of `&[AccountInfo]` / `SolAccountInfo*` | ||
|   - The `AccountInfo::data` field, | ||
|     which is a `RefCell<&[u8]>` in `sol_invoke_signed_rust` | ||
|   - The `AccountInfo::lamports` field, | ||
|     which is a `RefCell<&u64>` in `sol_invoke_signed_rust` | ||
| - The following pointers must point to what was originally serialized in the | ||
|   input regions by the program runtime, | ||
|   otherwise `SyscallError::InvalidPointer` must be thrown: | ||
|   - `AccountInfo::key` / `SolAccountInfo::key` | ||
|   - `AccountInfo::owner` / `SolAccountInfo::owner` | ||
|   - `AccountInfo::lamports` / `SolAccountInfo::lamports` | ||
|   - `AccountInfo::data::ptr` / `SolAccountInfo::data` | ||
|
|
||
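The two pointer rules above can be sketched as a pair of checks. This is an illustrative sketch only: the function names, the local `SyscallError` enum, and the idea that the runtime keeps an `expected_vaddr` for each serialized field are assumptions made for the example, not the actual validator API.

```rust
/// Local stand-in for the error named in the text (hypothetical definition).
#[derive(Debug, PartialEq, Eq)]
pub enum SyscallError {
    InvalidPointer,
}

const STACK_START: u64 = 0x2_0000_0000;
const HEAP_END: u64 = 0x4_0000_0000;

/// Accepts only pointers whose virtual address lies on the stack or heap,
/// i.e. inside `0x200000000..0x400000000`.
pub fn check_stack_or_heap(vaddr: u64) -> Result<(), SyscallError> {
    if (STACK_START..HEAP_END).contains(&vaddr) {
        Ok(())
    } else {
        Err(SyscallError::InvalidPointer)
    }
}

/// Accepts only a pointer equal to the address at which the program runtime
/// originally serialized the field (assumed to be recorded as `expected_vaddr`).
pub fn check_points_to_serialized(vaddr: u64, expected_vaddr: u64) -> Result<(), SyscallError> {
    if vaddr == expected_vaddr {
        Ok(())
    } else {
        Err(SyscallError::InvalidPointer)
    }
}
```

For example, a sysvar destination buffer in the input region (`0x400000000` and above) would be rejected by `check_stack_or_heap`, while one allocated on the heap would pass.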
| ### VM memory access | ||
|
|
||
| Memory accesses (both by the program and by syscalls) which span across memory | ||
| mapping regions are considered access violations. Accesses to multiple regions | ||
| (e.g. by memcpy syscalls) have to be split into multiple separate accesses, | ||
| one for each region. | ||
|
|
||
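As a rough illustration of the rule above, the sketch below models memory mapping regions as a list of half-open address ranges. `find_region` rejects any single access that is not fully contained in one region, and `split_by_region` shows how a memcpy-style syscall could break a longer range into one access per region. All names and the `AccessViolation` type are hypothetical, and the region list is assumed to be non-overlapping.

```rust
use std::ops::Range;

/// Hypothetical error type standing in for an access violation.
#[derive(Debug, PartialEq, Eq)]
pub struct AccessViolation;

/// Returns the index of the single region that contains the whole access.
/// An access that crosses a region boundary (or hits unmapped memory) fails.
pub fn find_region(regions: &[Range<u64>], vaddr: u64, len: u64) -> Result<usize, AccessViolation> {
    let end = vaddr.checked_add(len).ok_or(AccessViolation)?;
    regions
        .iter()
        .position(|r| r.start <= vaddr && end <= r.end)
        .ok_or(AccessViolation)
}

/// Splits `vaddr..vaddr + len` into pieces that each stay inside one region,
/// so every piece can be issued as a separate access.
pub fn split_by_region(
    regions: &[Range<u64>],
    vaddr: u64,
    len: u64,
) -> Result<Vec<Range<u64>>, AccessViolation> {
    let end = vaddr.checked_add(len).ok_or(AccessViolation)?;
    let mut pieces = Vec::new();
    let mut cursor = vaddr;
    while cursor < end {
        // The region holding `cursor`; a gap between regions is a violation.
        let region = regions
            .iter()
            .find(|r| r.contains(&cursor))
            .ok_or(AccessViolation)?;
        let piece_end = end.min(region.end);
        pieces.push(cursor..piece_end);
        cursor = piece_end;
    }
    Ok(pieces)
}
```

With two adjacent 256-byte regions, an 8-byte access starting 4 bytes before the boundary fails `find_region`, but `split_by_region` turns it into two 4-byte accesses.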
| For all memory accesses to the payload address space of an account which is | ||
| flagged as writable and owned by the currently executed program, check that: | ||
|
|
||
| - The access is completely within the maximum account length, | ||
| otherwise `InstructionError::InvalidRealloc` must be thrown. | ||
| - The access is completely within the rest of the account growth budget of the | ||
| transaction, otherwise `InstructionError::InvalidRealloc` must be thrown. | ||
| - The access is completely within the current length of the account, | ||
| otherwise extend the account with zeros to the maximum allowed by the previous | ||
| two checks. | ||
|
|
||
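The three checks above can be read as a small procedure. The sketch below is one possible interpretation under stated assumptions: `MappedAccount`, `max_len`, and `remaining_growth` are hypothetical names, offsets are relative to the start of the account's payload address space, and the sketch does not model how the growth budget is updated after an extension.

```rust
/// Local stand-in for the error named in the text (hypothetical definition).
#[derive(Debug, PartialEq, Eq)]
pub enum InstructionError {
    InvalidRealloc,
}

/// Hypothetical model of a writable account owned by the executing program.
pub struct MappedAccount {
    /// Current account payload.
    pub data: Vec<u8>,
    /// Maximum account length (assumed to be supplied by the runtime).
    pub max_len: usize,
}

/// Applies the three checks to an access of `len` bytes at `offset`.
/// `remaining_growth` is the rest of the transaction's account growth budget.
pub fn check_owned_writable_access(
    account: &mut MappedAccount,
    offset: usize,
    len: usize,
    remaining_growth: usize,
) -> Result<(), InstructionError> {
    let end = offset
        .checked_add(len)
        .ok_or(InstructionError::InvalidRealloc)?;
    // Check 1: the access must stay within the maximum account length.
    if end > account.max_len {
        return Err(InstructionError::InvalidRealloc);
    }
    // Check 2: the access must stay within the remaining growth budget.
    let budget_limit = account.data.len().saturating_add(remaining_growth);
    if end > budget_limit {
        return Err(InstructionError::InvalidRealloc);
    }
    // Check 3: an access past the current length extends the account with
    // zeros up to the maximum allowed by the two checks above.
    if end > account.data.len() {
        let new_len = account.max_len.min(budget_limit);
        account.data.resize(new_len, 0);
    }
    Ok(())
}
```

Note that in this reading the extension goes all the way to the allowed maximum, not just to the end of the access, which matches the wording of the third check.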
| For loads / read accesses to the payload address space of an account check | ||
| that: | ||
|
|
||
| - The access is completely within the current length of the account, | ||
| otherwise `InstructionError::AccountDataTooSmall` must be thrown. | ||
|
|
||
| For stores / write accesses to the payload address space of an account check | ||
| that: | ||
|
|
||
| - The account is flagged as writable, | ||
| otherwise `InstructionError::ReadonlyDataModified` must be thrown. | ||
| - The account is owned by the currently executed program, | ||
| otherwise `InstructionError::ExternalAccountDataModified` must be thrown. | ||
|
|
||
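The load and store rules above could look roughly like the following. This is a sketch under assumptions: `AccountView` and both function names are hypothetical, the error enum is a local stand-in, and each check is shown in isolation (for a writable account owned by the executing program, the extension rule from the previous list would run first and may already have grown the account).

```rust
/// Local stand-in for the errors named in the text (hypothetical definition).
#[derive(Debug, PartialEq, Eq)]
pub enum InstructionError {
    AccountDataTooSmall,
    ReadonlyDataModified,
    ExternalAccountDataModified,
}

/// Hypothetical summary of the account state relevant to access checks.
pub struct AccountView {
    pub current_len: usize,
    pub is_writable: bool,
    pub owned_by_current_program: bool,
}

/// Loads must lie completely within the current length of the account.
pub fn check_load(account: &AccountView, offset: usize, len: usize) -> Result<(), InstructionError> {
    match offset.checked_add(len) {
        Some(end) if end <= account.current_len => Ok(()),
        _ => Err(InstructionError::AccountDataTooSmall),
    }
}

/// Stores require the account to be writable and owned by the executing
/// program; the writable flag is checked first, following the list order.
pub fn check_store(account: &AccountView) -> Result<(), InstructionError> {
    if !account.is_writable {
        return Err(InstructionError::ReadonlyDataModified);
    }
    if !account.owned_by_current_program {
        return Err(InstructionError::ExternalAccountDataModified);
    }
    Ok(())
}
```

The store check deliberately has no length condition, since the store rules above do not list one.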
| ## Impact | ||
|
|
||
| These restrictions have been extensively tested by replay against | ||
| mainnet-beta (MNB). Most of the dApp developers whose dApps would fail have | ||
| been contacted and have already fixed their dApps. | ||
|
|
||
| Programs which used the SDK's account realloc function, which is now | ||
| deprecated, should upgrade in order to avoid the read-before-write access to | ||
| uninitialized memory. | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| None. | ||
|
Contributor
There are surely some security considerations (additional validation logic risks introducing more places where clients can diverge), but most of it is already implied by above.
Contributor
Author
Most of the security risks come from the implementation of direct mapping, not so much from imposing these additional restrictions. We decided to split the behavior changes (constituting this SIMD) from the direct mapping implementation, see: #219 (comment)
Will the implementation of direct mapping itself have a separate feature gate?
Contributor
Author
Likely yes, though it is kind of an implementation detail, as in we can't make a SIMD for it as that would be empty. |
||
shouldn't this also apply here?
No, because the way reallocations / growing of accounts currently works in ABIv1 is that a program first writes beyond the end of the account, and then at the next CPI or the end of the instruction communicates the change of the account length to the program runtime.