Skip to content

[Feature] Support program upgradability. #2655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 66 commits into
base: feat/constructors-and-operands
Choose a base branch
from

Conversation

d0cd
Copy link
Collaborator

@d0cd d0cd commented Mar 19, 2025

[NO NEW COMMITS ARE ALLOWED AS OF 5/27/25]

DO NOT MERGE UNTIL BASE IS UPDATED TO staging

Motivation

This PR

  • supports program upgradability, by allowing programs with constructors to be updated. Constructors are immutable, allowing developers to implement custom update logic for their applications.
  • modifies the program checksum to use SHA256 and changes its type from field to a [u8; 32u32]
  • strictly defines program equality to also include order
  • adds an optional program_checksum field to Deployment, which implicitly versions the object.
  • updates the deployment ID calculation to use the program checksum. Benchmarks for the checksum calculation are included below.

Test Plan

  • Process-level tests were added to verify the syntactic restrictions on upgrades.
  • VM/Ledger-level tests were added to verify upgrade semantics.

Benchmarks

Run on a 2021 MBP w/ an M1 Max & 32GB RAM

  • SHA3_256 Hash - input size 1000 bytes - [3.4339 µs 3.4435 µs 3.4539 µs]
  • SHA3_256 Hash - input size 10000 bytes - [28.346 µs 28.500 µs 28.747 µs]
  • SHA3_256 Hash - input size 100000 bytes - [280.97 µs 282.03 µs 283.11 µs] (max program size)
  • SHA3_256 Hash - input size 1000000 bytes - [2.7798 ms 2.7846 ms 2.7907 ms]

@d0cd d0cd mentioned this pull request Mar 19, 2025
21 tasks
@d0cd d0cd requested review from vicsn and raychu86 and removed request for vicsn March 19, 2025 15:49
@d0cd d0cd force-pushed the feat/program-updatability branch from cc239a5 to 9d94734 Compare March 19, 2025 22:33
@d0cd d0cd force-pushed the feat/program-updatability branch from f27bcae to 3b02904 Compare March 20, 2025 18:17
@@ -186,7 +197,9 @@ pub trait DeploymentStorage<N: Network>: Clone + Send + Sync {
atomic_batch_scope!(self, {
// Store the program ID.
self.id_map().insert(*transaction_id, program_id)?;
// Store the edition.
// Store the edition for the transaction ID.
self.id_edition_map().insert(*transaction_id, edition)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will insert a 0 edition for new programs, but have a missing item for existing programs

@@ -92,6 +92,8 @@ impl<N: Network> TransactionStorage<N> for TransactionMemory<N> {
pub struct DeploymentMemory<N: Network> {
/// The ID map.
id_map: MemoryMap<N::TransactionID, ProgramID<N>>,
/// The ID edition map.
id_edition_map: MemoryMap<N::TransactionID, u16>,
/// The edition map.
edition_map: MemoryMap<ProgramID<N>, u16>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to rename this to latest_edition_map.

Should be fine on the rocksdb level, but will need to do some testing and ensure existing ledgers boot and operate properly.

@@ -786,6 +790,8 @@ impl<N: Network, C: ConsensusStorage<N>> VM<N, C> {
/// - The transaction is producing a duplicate output
/// - The transaction is producing a duplicate transition public key
/// - The transaction is another deployment in the block from the same public fee payer.
/// - The transaction is an execution for a program following its deployment or redeployment in this block.
Copy link
Collaborator

@vicsn vicsn Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still in favour of aborting executions which reference a state root which is older than the latest program upgrade corresponding to any of the transitions in the execution.

Copy link
Collaborator Author

@d0cd d0cd Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a fair point.
The lookup could be done by:

  1. Look up the latest transaction ID that deployed or upgraded a program.
  2. Look up the corresponding block height for the program.
  3. Look up the corresponding block height for the state root.
  4. Verify that the state root's block height is less than the program's last deployed/updated height.

One thing to consider is that this check would create a dependence on the state root for "non-record" transactions. Concretely, exchanges and other developer services tend to "fix" a state root to reduce the number of network calls needed for their transactions. It could change the UX for consumer software but is not an unreasonable thing to do.

I'll leave it to reviewers to arbitrate.

Copy link
Collaborator

@vicsn vicsn Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great. I can't imagine this to be a major performance bottleneck.

However, we do need to check the # program for each transition. So maybe a cache is warranted.

CC @raychu86

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more reason why this is a good idea:

If a user's local state is behind the network, then they might use an outdated edition or checksum value as witness, and as a result, the proof will fail to verify with an extremely opaque error message.

Now, unfortunately, users are used to transactions failing for opaque reasons. But we can imagine a point where explorers surface this information to users, or developers see this in validator logs for debugging purposes.

It would be a lot better if the user or developer would see e.g. "Execution failed due to the use of a state root older than the latest program upgrade."

Copy link
Collaborator

@vicsn vicsn Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason as my last message above (verifier witness data may change after a program update), we should clear executions from the partially_verified_transactions cache when the underlying program for any of its transitions has been updated.

This should happen already before a block is finalized, so that if a program is updated, the next execution within the same block might now be rejected/aborted during finalization.

Maybe this ends up being superfluous if we add a separate explicit check for the state root, but it doesn't hurt to evict a cache early.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can pull in ecosystem developers into this code review weigh in?

I like it! I'll share a plan with you.

Note that in any case we will need to clear appropriate executions from partially_verified_transactions - though the location where it is cleared might depend on that user feedback.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a branch with the check (diff)
I'll make a note of it in the tracking issue as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refreshed the above branch to do the check in should_abort_transaction

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I genuinely would love to require a recent state root to ensure freshness.

However, I do feel for offline wallets, cold storage schemes, and HSMs (in the future). Namely, that they do not have the privilege or luxury to go and fetch live state like this on a continual basis. And a custodian servicing millions of addresses for their customers does not have the bandwidth to stay on top of live refreshes like this (with today's state of Aleo infrastructure and tooling).

As such, I strongly recommend keeping the design in its current form and re-evaluating down the line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that fetching live state is a pain for offline wallets, HSMs, etc.
However, these solutions will already need to do this to spend records.

The proposed design can use any state root after the the program has been upgraded.
Moreover, it's unlikely for programs to get upgraded frequently, even once a month would be a lot.
And so is unlikely to add meaningful overhead to existing requirement for state root fetching.

This restriction does improve security for users, limiting the ways a malicious developer could "rug" a user.

/// The default value.
default: Operand<N>,
/// The operands.
operands: [Operand<N>; 2],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change here? It felt more clear with the direct variables key and default.

At the very least, the documentation of operands should reflect the structure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto with the other commands that have this unification

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was needed to support the operands() on commands without allocating Vecs
The instructions are actually implemented in a similar way.
TBH the implementations of both instructions and commands needs some refactoring to standardize.

Copy link
Collaborator Author

@d0cd d0cd Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For additional context, we need to iterate through all of the operands in a program and check that they don't match Operand::Checksum or Operand::Edition before a specific consensus version. It would be ideal if we can do this without cloning and allocating.

@d0cd d0cd requested a review from vicsn April 15, 2025 16:46
@d0cd d0cd requested a review from raychu86 April 18, 2025 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants