FIP-0088: Add support for upgradable actors #873
@@ -0,0 +1,157 @@
---
fip: "<to be assigned>" <!--keep the quotes around the fip number, e.g. `fip: "0001"`-->
title: Add support for upgradable actors
author: Fridrik Asmundsson (@fridrik01), Steven Allen (@stebalien)
discussions-to: https://github.com/filecoin-project/FIPs/discussions/396
status: Draft
type: Technical
category: Core
created: 2023-11-27
---

## Simple Summary

This FIP introduces support for upgradable actors, enabling deployed actors to update their code while retaining their address, state, and balance. This feature is currently limited to use by built-in actors, and as of now, no built-in actor has been updated to become upgradable.

## Abstract

This FIP proposes the integration of upgradable actors into the Filecoin network through the introduction of a new `upgrade_actor` syscall and an optional `upgrade` WebAssembly (Wasm) entrypoint.

Upgradable actors provide a mechanism for replacing deployed actor code in place, which simplifies the process of updating that code.

## Change Motivation

Currently, the code associated with all actors on the Filecoin Network is immutable once deployed. To modify the actor code, for example to fix a security bug, the following steps are required:
1. Deploy a new actor with the corrected code.
2. Migrate all state from the previous actor to the new one.
3. Update all other actors interacting with the old actor to use the new actor.

With support for upgradable actors, deployed actors can upgrade their code in place and no longer need to go through the steps above.

This FIP is also motivated by the `f4` extensible address class, which was introduced in [FIP-0048] and required special "placeholder" actors to support interactions with addresses that do not yet exist on-chain. With upgradable actors we can simplify this address class and remove these placeholder actors completely. We will be able to deploy real actors and upgrade their code on first send.

Furthermore, this FIP paves the way for moving more network upgrade logic on-chain in the future, enabling a more seamless process for implementing critical updates and ensuring the continuous improvement of the Filecoin Network.

## Specification

Introducing support for actor upgrades involves the following changes to the FVM:

1. Adding a new `upgrade` Wasm entrypoint, which actors must implement in order to be a valid upgrade target.
2. Adding a new `upgrade_actor` syscall, enabling actors to upgrade themselves.

These changes are discussed in detail in the following sections.

### New upgrade Wasm Entrypoint

We introduce a new optional `upgrade` Wasm entrypoint. Deployed actors must implement this entrypoint to be a valid upgrade target. It is defined as follows:

> Review comment: Why must deployed actors implement it? It's not invoked on deployed actors, it's invoked on the new code. The algorithm below does not specify any check that the deployed actors implement this entry point.

```rust
pub fn upgrade(params_id: u32, upgrade_info_id: u32) -> u32
```

Parameters:
- `params_id`: An IPLD block handle provided by the caller and sent to the upgrade receiver, or `0` for none.
- `upgrade_info_id`: An IPLD block handle for an `UpgradeInfo` struct provided by the FVM runtime (defined below).

The single `u32` return value is an IPLD block handle, or `0` for none.

The `UpgradeInfo` struct is defined as follows:

```rust
#[derive(Clone, Debug, Copy, PartialEq, Eq, Serialize_tuple, Deserialize_tuple)]
pub struct UpgradeInfo {
    // the old code cid we are upgrading from
    pub old_code_cid: Cid,
}
```

> Review comment: Nit: I don't think these Rust codegen things are relevant to the FIP.

When a target actor's `upgrade` Wasm entrypoint is called, it can make any necessary changes to the actor's state. The `UpgradeInfo` struct provided by the FVM runtime can be used to check which code CID it is upgrading from. A successful return from the `upgrade` entrypoint instructs the FVM to proceed with the upgrade. The target actor can reject the upgrade by calling `sdk::vm::exit()` before returning from the `upgrade` entrypoint.

> Review comment: This first sentence is a bit unclear.

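The accept-or-reject decision described above can be sketched in plain Rust. This is a minimal, self-contained model rather than actor code: `Cid` is a string stand-in for the real CID type, the names `handle_upgrade`, `supported`, and `EXIT_UNSUPPORTED_SOURCE` are hypothetical, and an `Err` value stands in for a call to `sdk::vm::exit()`.

```rust
/// Stand-in for a real CID type (assumption: a string is enough for this sketch).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Cid(pub String);

/// Mirrors the `UpgradeInfo` struct defined above, without the serialization derives.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct UpgradeInfo {
    pub old_code_cid: Cid,
}

/// Hypothetical exit code used by this sketch to reject an unsupported upgrade source.
pub const EXIT_UNSUPPORTED_SOURCE: u32 = 32;

/// `Ok` carries optional return data (a successful return from `upgrade`);
/// `Err` carries an exit code and stands in for calling `sdk::vm::exit()`.
pub type UpgradeResult = Result<Option<Vec<u8>>, u32>;

/// Models the body of an `upgrade` entrypoint: the new code inspects the code CID
/// it is upgrading from and either accepts or rejects the upgrade.
pub fn handle_upgrade(info: &UpgradeInfo, supported: &[Cid]) -> UpgradeResult {
    if !supported.contains(&info.old_code_cid) {
        // Reject: the new code does not know how to take over from this old code.
        return Err(EXIT_UNSUPPORTED_SOURCE);
    }
    // Any state changes required by the new code would happen here.
    Ok(None)
}
```

In this model, calling `handle_upgrade` with an `old_code_cid` that is in `supported` returns `Ok(None)`, and any other code CID yields `Err(EXIT_UNSUPPORTED_SOURCE)`.
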
### New upgrade_actor syscall

> Review comment: Nit: the entry point and syscall are presented in the opposite order to which they happen in during an upgrade, which makes grokking the flow a bit harder. I drew a completely wrong idea about the upgrade method until I finished reading the syscall spec.

We introduce a new `upgrade_actor` syscall which calls the `upgrade` Wasm entrypoint of the calling actor and then atomically replaces the code CID of the calling actor with the provided code CID, and returns the exit code and block of the return. It is defined as follows:

> Review comment: The algorithm below specifies the opposite order of things: the code is changed then [...]

```rust
pub fn upgrade_actor(
    new_code_cid_off: *const u8,
    params: u32,
) -> Result<Send>;
```

Parameters:
- `new_code_cid_off`: A pointer to the code CID that should replace the calling actor's current code.
- `params`: The IPLD block handle for the parameters passed to the `upgrade` entrypoint, or `0` for none.

The `Send` struct is defined as follows:

```rust
pub struct Send {
    // exit code returned by the upgrade endpoint
    pub exit_code: u32,
    // the block id/codec/size returned by the upgrade endpoint, or 0 if no block was returned
    pub return_id: BlockId,
    pub return_codec: u64,
    pub return_size: u32,
}
```

On successful upgrade, this syscall will not return. Instead, the current invocation will "complete" and the return value will be the block returned by the new code's `upgrade` entrypoint. If the new code rejects the upgrade (calls `sdk::vm::exit()`) or performs an illegal operation, this syscall will return the exit code plus the error returned by the `upgrade` entrypoint.

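Because a successful upgrade never returns to the caller, any `Send` value the old code observes describes a failed upgrade. The following is a minimal sketch of how calling code might interpret that value. It assumes `BlockId` is a `u32` handle where `0` means "no block", and the names `UpgradeFailure` and `classify_failure` are hypothetical.

```rust
/// Assumption: block handles are `u32` values and `0` means "no block".
pub type BlockId = u32;

/// Mirrors the `Send` struct defined above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Send {
    pub exit_code: u32,
    pub return_id: BlockId,
    pub return_codec: u64,
    pub return_size: u32,
}

/// Hypothetical summary of a failed upgrade, as seen by the calling (old) code.
#[derive(Debug, PartialEq, Eq)]
pub enum UpgradeFailure {
    /// The upgrade failed and no return block is available.
    NoData { exit_code: u32 },
    /// The upgrade failed and a return block describing the error is available.
    WithData { exit_code: u32, block: BlockId, codec: u64, size: u32 },
}

/// Classifies the value returned by `upgrade_actor`. This is only reachable
/// when the upgrade did not succeed, since a successful upgrade does not return.
pub fn classify_failure(ret: Send) -> UpgradeFailure {
    if ret.return_id == 0 {
        UpgradeFailure::NoData { exit_code: ret.exit_code }
    } else {
        UpgradeFailure::WithData {
            exit_code: ret.exit_code,
            block: ret.return_id,
            codec: ret.return_codec,
            size: ret.return_size,
        }
    }
}
```
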
This syscall will:
1. Validate that the pointers passed to the syscall are in-bounds.
2. Validate that `new_code_cid_off` is a valid code CID.
3. Validate that the calling actor is not currently executing in "read-only" mode. If it is, the syscall fails with a "ReadOnly" (13) syscall error.
4. Check whether the calling actor is already on the call stack, having previously been called on its `invoke` entrypoint (note that calling `upgrade` recursively is allowed). If so, the syscall fails with a "Forbidden" (11) syscall error. For example, if an actor A has a call stack `A (upgrade -> upgrade -> upgrade)` then that is allowed, while a call stack `A -> B -> A (upgrade)` would be rejected.
5. Check that there is space for storing the return block. If not, the syscall fails with a "LimitExceeded" (3) syscall error.
6. Start a new Call Manager transaction:
   1. Validate that the calling actor has not been deleted. If it has, the syscall fails with an "IllegalOperation" (2) syscall error.
   2. Update the actor in the state tree with the new `new_code_cid`, keeping the same `state`, `sequence` and `balance`.
   3. Invoke the target actor's `upgrade` entrypoint.
      > Review comment: Please be very clear which code is invoked here. Since the prior step changed the code CID, I'm assuming it's the new code, but "target actor" also makes it sound like the old code.
   4. If the target actor does not implement the `upgrade` entrypoint, the syscall fails with an `ExitCode::SYS_INVALID_RECEIVER` exit code.
   5. If the target actor aborts the `upgrade` entrypoint by calling `sdk::vm::exit()`, the syscall fails with the provided exit code.
7. Apply the transaction, committing changes.
8. Abort the calling actor and return the IPLD block from the `upgrade` entrypoint.

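The steps above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the FVM implementation: code CIDs and state roots are plain strings, the `Machine`, `Actor`, `UpgradeFn`, and `reentrant` names are hypothetical, the call stack is assumed to include the caller's own frame, and the modelled `upgrade` entrypoint cannot itself touch state. Steps 1, 2, and 5 (pointer, CID, and return-block checks) are omitted.

```rust
use std::collections::HashMap;

/// Which Wasm entrypoint a call-stack frame was entered through.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Entrypoint { Invoke, Upgrade }

/// Failure modes modelled here; syscall error numbers are those listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum UpgradeError {
    ReadOnly,         // syscall error 13
    Forbidden,        // syscall error 11
    IllegalOperation, // syscall error 2
    InvalidReceiver,  // new code has no `upgrade` entrypoint
    Rejected(u32),    // new code exited with this exit code
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Actor {
    pub code: String,  // stand-in for the code CID
    pub state: String, // stand-in for the state root
    pub sequence: u64,
    pub balance: u128,
}

/// Hypothetical model of an `upgrade` entrypoint: it receives the old code
/// and returns either return data or an exit code.
pub type UpgradeFn = fn(old_code: &str) -> Result<Vec<u8>, u32>;

pub struct Machine {
    pub actors: HashMap<u64, Actor>,
    pub upgrade_entrypoints: HashMap<String, UpgradeFn>,
    pub call_stack: Vec<(u64, Entrypoint)>,
    pub read_only: bool,
}

/// Step 4: after the caller's first `invoke` frame, only the caller's own
/// `upgrade` frames may follow. Anything else is a re-entrant upgrade.
pub fn reentrant(stack: &[(u64, Entrypoint)], caller: u64) -> bool {
    match stack.iter().position(|&f| f == (caller, Entrypoint::Invoke)) {
        Some(first) => stack[first + 1..].iter().any(|&f| f != (caller, Entrypoint::Upgrade)),
        None => false,
    }
}

impl Machine {
    pub fn upgrade_actor(&mut self, caller: u64, new_code: &str) -> Result<Vec<u8>, UpgradeError> {
        // Step 3: refuse to upgrade in read-only mode.
        if self.read_only {
            return Err(UpgradeError::ReadOnly);
        }
        // Step 4: reject re-entrant upgrades.
        if reentrant(&self.call_stack, caller) {
            return Err(UpgradeError::Forbidden);
        }
        // Step 6: work on a copy, so a failure leaves the state tree untouched.
        // Step 6.1: the calling actor must still exist.
        let mut updated = self.actors.get(&caller).cloned().ok_or(UpgradeError::IllegalOperation)?;
        // Step 6.2: swap the code, keeping state, sequence, and balance.
        let old_code = std::mem::replace(&mut updated.code, new_code.to_string());
        // Steps 6.3 and 6.4: look up and run the new code's `upgrade` entrypoint.
        let entry = self.upgrade_entrypoints.get(new_code).copied().ok_or(UpgradeError::InvalidReceiver)?;
        self.call_stack.push((caller, Entrypoint::Upgrade));
        let result = entry(&old_code);
        self.call_stack.pop();
        // Step 6.5: a rejection aborts the transaction.
        let ret = result.map_err(UpgradeError::Rejected)?;
        // Step 7: commit the transaction.
        self.actors.insert(caller, updated);
        // Step 8: the returned block becomes the result of the caller's invocation.
        Ok(ret)
    }
}
```

In this model a rejected or invalid upgrade leaves the `Actor` entry unchanged, while an accepted upgrade changes only its `code` field.
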
## Design Rationale

### Additional metadata syscall

We considered adding a new `get_old_code_cid` syscall to get the calling actor's code CID. That has the benefit of keeping the `upgrade` entrypoint signature consistent with the `invoke` signature. However, we rejected that option because we felt the benefit didn't outweigh the overhead of adding a new syscall. Furthermore, it did not provide the flexibility of passing in an IPLD handle for an `UpgradeInfo` struct, to which we can easily add more fields if required.

## Backwards Compatibility

Full backwards compatibility is expected.

## Test Cases

Detailed test cases are provided with the implementation.

## Security Considerations

Upgradable actors pose potential security risks, as users can replace deployed actors' code. However, measures are in place to minimize these risks:

- Upgradable actors are opt-in by default, ensuring no impact on currently deployed actors.
  > Review comment: I don't understand from the text above how this is enforced.

  > Review comment: Update: the text above doesn't say this, but I now do understand. It's opt-in because an actor code must have some path that invokes the [...] I suggest this mechanism and property be spelled out more clearly.
- Actors can only upgrade themselves, preventing one actor from upgrading another actor to a new version.
- We reject re-entrant `upgrade_actor` syscalls, i.e., if some actor `A` is already on the call stack, no "deeper" instance of `A` should be able to call the upgrade syscall.

Detailed tests cover these security considerations and edge cases.

## Incentive Considerations

This FIP does not materially impact incentives in any way.

> Review comment: Is this true? I'm curious how this change may impact the logic around timing and resource use for network upgrades, as well as the prioritization of certain changes. If actors' code can be much more efficiently upgraded, shouldn't we technically be able to push more updates/small changes more quickly? This change may not affect cryptoeconomics, but I might request we add a line about affecting the logic of what gets included for upgrades. Ultimately a small nit; the approval of the draft shouldn't be gated on this topic.

> Review comment: @kaitlin-beegle I don't think I get why those are incentive considerations rather than product considerations.

## Product Considerations

This FIP makes it possible to upgrade deployed actors, for example when a bug or security concern is identified in the deployed code. It provides a simple, safe way to address such issues, which significantly improves on the user experience available today.

## Implementation

https://github.com/filecoin-project/ref-fvm/pull/1866

## TODO

N/A

## Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).