feat: Da sequencer, add state propagation#1226
Conversation
…1087) Co-authored-by: Andreas Penzkofer <36140574+apenzk@users.noreply.github.com> Co-authored-by: Mikhail Zabaluev <mikhail.zabaluev@movementlabs.xyz> Co-authored-by: Liam Monninger <79056955+l-monninger@users.noreply.github.com> Co-authored-by: Andy Golay <andygolay@gmail.com> Co-authored-by: musitdev <philippe.delrieu@free.fr> Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com> Co-authored-by: primata <primata@movementlabs.xyz> Co-authored-by: Richard Melkonian <r.v.melkonian@gmail.com> Co-authored-by: Radu Popa <radupopa21be@gmail.com> Co-authored-by: Liam Monninger <l.mak.monninger@gmail.com>
"Scenarios per client", removed the "number_" prefix which was a handful to type and didn't add clarity.
|
@musitdev the unit test passes for me, but for the e2e test, I'm wondering, is expected, for
I was able to run the test, and it looks like some scenarios failed. I'll DM you the run logs. |
| ) | ||
| .await | ||
| tokio::spawn({ | ||
| let da_signer = self |
There was a problem hiding this comment.
@musitdev I think it might be better to move these into a config struct for the exec_settle_task at this point. Would you like for me to drop that in a PR?
There was a problem hiding this comment.
Yes, if you want. As I say later, it's still a test to understand better why state of node changes. So it can be removed depending on the cause of the state divergence.
There was a problem hiding this comment.
Okay, I will drop something tomorrow.
| @@ -0,0 +1,117 @@ | |||
| use maptos_opt_executor::executor::ExecutionState; | |||
There was a problem hiding this comment.
Add module comments for this.
There was a problem hiding this comment.
Ok, I do that. I'll put the explanation that I'm going to write to your next question.
| bytes signature = 2; | ||
| } | ||
|
|
||
| message MainNodeState { |
There was a problem hiding this comment.
This makes more sense as LeaderNodeState. Why the change to Main*?
There was a problem hiding this comment.
Leader* was never meant to refer to nodes that gossip state to to Follower*. That was a misconception that you guys decided to advance, additionally shutting down the transaction posting from followers.
The Leader* produces blocks as well as executing. The Follower* receives blocks and executes them.
There was a problem hiding this comment.
Yes—propagating a state isn’t our goal. I added this feature because we’ve seen edge cases that cause nodes to diverge. Without on‐chain consensus, we need a “source of truth” node to decide which state is correct. I’ve called it the main node (to avoid confusing it with the leader node, which has a different responsibility). The main node’s only job is to broadcast its state; it has no other special functions.
Originally, this was just a test harness to uncover divergence scenarios. I could have used ad-hoc debug RPCs, but I decided to build a permanent, robust solution. Even after we find and fix specific bugs, we can’t guarantee they’ll never reappear. As partners rely more on their follower nodes—and as network load increases—early detection of divergence becomes critical. We’ll use this implementation first to detect divergence, then decide whether to keep it long-term.
We disabled transaction submission on follower nodes because our current DA light-node can’t handle Celestia back-pressure when multiple batches arrive simultaneously. On mainnet we discovered partners were unintentionally using their follower nodes: Celestia blobs backed up in the client, and transactions took minutes to include in a block.
The new DA-Sequencer architecture changes this:
- The DA-Sequencer produces and publishes blocks.
- Full nodes handle everything else (RPC, execution, indexing, gRPC, etc.).
| use tracing::info; | ||
|
|
||
| #[derive(Debug, Clone)] | ||
| pub struct ExecutionState { |
There was a problem hiding this comment.
movement/util/movement-types/src/block.rs
Line 175 in a537dc0
BlockCommitment already has a height and an id. Commitments would match on timestamp and version too. What's the need for adding this separate struct versus including it in BlockCommitment?
There was a problem hiding this comment.
They don't share the same use case. Change on one struct shouldn't affect the other. They must be separated.
There was a problem hiding this comment.
Maybe explain briefly why they must be separated as a doc comment /// above the struct.
There was a problem hiding this comment.
Pull Request Overview
This PR implements state propagation from the main node to full nodes, adding a new gRPC endpoint (send_state), configuration for the main node, and verification of state consistency between nodes.
- Introduces a state verifier module to store and validate node states.
- Enhances task execution in the DA sequencer client to propagate execution state and verify it.
- Updates configuration and deployment scripts to support the new state propagation mechanism.
Reviewed Changes
Copilot reviewed 41 out of 41 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| networks/movement/movement-full-node/src/node/tasks/state_verifier.rs | Adds a lightweight state verifier used to track and validate a limited history of node states. |
| networks/movement/movement-full-node/src/node/tasks/execute_settle.rs | Updates the block processing logic to include validation of state propagation, as well as spawning task to send new state to the DA sequencer. |
| networks/movement/movement-full-node/src/node/tasks/mod.rs | Registers the new state verifier module. |
| networks/movement/movement-full-node/src/node/partial.rs | Updates task spawning to supply new configuration options related to state propagation. |
| Cargo.toml, shell scripts, docs | Reflect configuration and dependency updates supporting the new state propagation functionality. |
0xmovses
left a comment
There was a problem hiding this comment.
The unit tests and e2e test ran successfully. Output for e2e`movement-tests-load-tests was :
Finished `dev` profile [unoptimized + debuginfo] target(s) in 2m 54s
Running `target/debug/movement-tests-e2e-load-alice-bob`
2025-05-12T13:53:16.421964Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.423368Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.425542Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.425668Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.426223Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.427455Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.427598Z INFO movement_tests_e2e_load_alice_bob: final balances, Alice: 99999998400, Bob: 200
at networks/movement/movement-client/src/bin/e2e/load_alice_bob.rs:162
2025-05-12T13:53:16.428009Z INFO movement_tests_e2e_load_alice_bob: final
I've left comments around style, implementation questions and doc improvements
| .map_err(|_| anyhow::anyhow!("Invalid main_node_verifying_key key")) | ||
| }) | ||
| }) | ||
| .transpose()?; |
There was a problem hiding this comment.
Seems quite unwieldily just to get the verify_key. If we expect this verifying_key to be accessed elsewhere, abstract the map() call into some method on main_node_verifying_key.
| tracing::warn!("Bad node state data, no state in it."); | ||
| return Ok(tonic::Response::new(BatchWriteResponse { answer: false })); | ||
| } | ||
| let state = state_data.state.unwrap(); //unwrap tested before. |
There was a problem hiding this comment.
//unwrap tested before. where, how? Maybe a more detailed comment about why its ok to unwrap() here, as I don't know what it is and would rather it be out.
There was a problem hiding this comment.
It's just the statement before:
if state_data.state.is_none() {
If you prefer I can to a match; perhaps it's more clear. I change it to a match.
There was a problem hiding this comment.
Honestly, I liked anyhow::Context best here.
let state = state_data.context("state data should not be None")?;| use tracing::info; | ||
|
|
||
| #[derive(Debug, Clone)] | ||
| pub struct ExecutionState { |
There was a problem hiding this comment.
Maybe explain briefly why they must be separated as a doc comment /// above the struct.
|
The updated E2E test has passed for me:
Pending resolution of @l-monninger and @0xmovses change requests, I think this is good to merge. |
Co-authored-by: Richard Melkonian <35300528+0xmovses@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…/movement into musitdev/propagate_state
| Some(url) => url.to_string_lossy().into_owned(), | ||
| None => "http://movement-da-sequencer:30730/".to_string(), | ||
| }; | ||
| tracing::info!("updating da-sequecner connection url with:{new_url}"); |


Summary
protocol-units,networks,scripts,util,cicd, ormisc.Implementation of the issue #1222
Some commits are already in feature branch because I've used squash and merge in the other PR instead of simple merge, so as this branch start from the other PR branch, git can't re-associate them.
The setup and test part is already merged in feature branch.
Add state propagation from a main node to all other full node.
Full node validate the received state with the one they have at the same height.
If it diverges the node is stopped.
The propagation use the da-sequencer grpc service.
Start updating the follower node setup / migration doc update.
Changelog
send_stateTesting
Unit test of the state validation:
e2e test:
Outstanding issues