Compress the state response to reduce the state sync data transfer #5312

@liuchengxu

Description

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Motivation

State sync can download several GiB of data when the chain state is large, which is not uncommon nowadays.
This poses a significant challenge for nodes with slow network connections. Additionally, since state sync currently lacks a persistence feature (#4), any network disruption forces the node to re-download the entire state, which is annoying.

Request

Reduce the state syncing download size.

Solution

Compress the state response on the serving side before sending it to the node, and decompress it on the receiver side before decoding.

diff --git a/substrate/client/network/sync/Cargo.toml b/substrate/client/network/sync/Cargo.toml
index 17e3e2119d..047fffa31f 100644
--- a/substrate/client/network/sync/Cargo.toml
+++ b/substrate/client/network/sync/Cargo.toml
@@ -48,6 +48,7 @@ sp-consensus = { workspace = true, default-features = true }
 sp-core = { workspace = true, default-features = true }
 sp-consensus-grandpa = { workspace = true, default-features = true }
 sp-runtime = { workspace = true, default-features = true }
+zstd = { workspace = true }

 [dev-dependencies]
 mockall = { workspace = true }
diff --git a/substrate/client/network/sync/src/engine.rs b/substrate/client/network/sync/src/engine.rs
index bb6e7a98a8..3915d3845e 100644
--- a/substrate/client/network/sync/src/engine.rs
+++ b/substrate/client/network/sync/src/engine.rs
@@ -1204,7 +1204,10 @@ where
        }

        fn decode_state_response(response: &[u8]) -> Result<OpaqueStateResponse, String> {
-               let response = StateResponse::decode(response)
+               let response = zstd::stream::decode_all(response).map_err(|error| format!("Failed to decompress state response: {error}"))?;
+               let response = StateResponse::decode(response.as_slice())
                        .map_err(|error| format!("Failed to decode state response: {error}"))?;

                Ok(OpaqueStateResponse(Box::new(response)))
diff --git a/substrate/client/network/sync/src/state_request_handler.rs b/substrate/client/network/sync/src/state_request_handler.rs
index 0e713626ec..bb07bdd9bc 100644
--- a/substrate/client/network/sync/src/state_request_handler.rs
+++ b/substrate/client/network/sync/src/state_request_handler.rs
@@ -264,7 +272,15 @@ where

                        let mut data = Vec::with_capacity(response.encoded_len());
                        response.encode(&mut data)?;
-                       Ok(data)
+                       let compressed_data = zstd::stream::encode_all(data.as_slice(), 0).expect("Failed to compress state response");
+                       Ok(compressed_data)
                } else {
                        Err(())
                };
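
One point the diff above leaves open is how much memory the receiver is willing to spend on decompression: the bytes come from a remote peer, and a compressed payload can expand to far more than its wire size. Below is a minimal sketch of one way to cap the decompressed size, using only the standard library. The helper name `read_bounded` and the `limit` parameter are hypothetical and not part of the proposed patch. In real code the `reader` would presumably be a streaming decoder such as `zstd::stream::read::Decoder` wrapping the response bytes, which is an assumption here.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads everything from `reader`, but fails if the
/// stream yields more than `limit` bytes.
fn read_bounded<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Allow one byte past the limit so an oversized stream can be detected
    // without buffering the whole thing.
    let read = reader.take(limit.saturating_add(1)).read_to_end(&mut out)?;
    if read as u64 > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "decompressed state response exceeds the configured limit",
        ));
    }
    Ok(out)
}
```

Any `Read` implementation works as input, so the same helper can be exercised with a plain byte slice in tests and with a decompressing reader in production code. The returned `io::Error` can be mapped into the `String` error that `decode_state_response` already returns.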

This is low-hanging fruit that can reduce the state sync data significantly, as demonstrated by my local experiments. I ran state sync tests for subcoin at various block heights (all below 300000) using both the fast and fast-unsafe modes. The Uncompressed Total State Sync Data is calculated as sum(data.len()), and the Compressed Total State Sync Data is calculated as sum(compressed_data.len()). The results are promising: compression shrinks the transferred data to roughly 34-42% of its original size, so several GiB of state sync data can be saved when the chain state is large. The final state size of subcoin may be 12+ GiB, so this optimization will greatly help its state sync.

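| --sync      | Uncompressed Total State Sync Data (bytes) | Compressed Total State Sync Data (bytes) | Compressed/Uncompressed |
|-------------|--------------------------------------------|------------------------------------------|-------------------------|
| fast-unsafe | 149,517,161                                | 50,284,623                               | 0.34                    |
| fast-unsafe | 205,400,559                                | 70,742,393                               | 0.34                    |
| fast-unsafe | 597,683,313                                | 202,993,329                              | 0.34                    |
| fast-unsafe | 1,239,830,694                              | 480,632,754                              | 0.39                    |
| fast-unsafe | 2,182,810,408                              | 841,870,855                              | 0.39                    |
| fast        | 820,180,264                                | 338,889,711                              | 0.41                    |
| fast        | 1,486,307,891                              | 631,430,018                              | 0.42                    |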
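
For readers who want to recheck the last column or see the absolute savings per run, the arithmetic can be sketched as follows. The byte counts are copied from the measurements above. The names `MEASUREMENTS`, `ratio`, and `bytes_saved` are illustrative only and do not appear in the patch.

```rust
/// Measurement rows: (sync mode, uncompressed bytes, compressed bytes).
const MEASUREMENTS: [(&str, u64, u64); 7] = [
    ("fast-unsafe", 149_517_161, 50_284_623),
    ("fast-unsafe", 205_400_559, 70_742_393),
    ("fast-unsafe", 597_683_313, 202_993_329),
    ("fast-unsafe", 1_239_830_694, 480_632_754),
    ("fast-unsafe", 2_182_810_408, 841_870_855),
    ("fast", 820_180_264, 338_889_711),
    ("fast", 1_486_307_891, 631_430_018),
];

/// Compressed size as a fraction of the uncompressed size.
fn ratio(uncompressed: u64, compressed: u64) -> f64 {
    compressed as f64 / uncompressed as f64
}

/// Bytes that no longer have to be transferred when compression is enabled.
fn bytes_saved(uncompressed: u64, compressed: u64) -> u64 {
    uncompressed.saturating_sub(compressed)
}

/// Formats one line per measurement with the ratio rounded to two decimals.
fn report() -> Vec<String> {
    MEASUREMENTS
        .iter()
        .map(|&(mode, raw, packed)| {
            format!(
                "{mode}: ratio {:.2}, saved {} bytes",
                ratio(raw, packed),
                bytes_saved(raw, packed)
            )
        })
        .collect()
}
```

Calling `report()` yields one summary line per run. Rounding each ratio to two decimals reproduces the Compressed/Uncompressed column.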

We can make this configurable if necessary.

Are you willing to help with this request?

Yes!


Labels: I5-enhancement (An additional feature request.)
