Compress the state response to reduce the state sync data transfer #5312

### Is there an existing issue?

- [X] I have searched the existing issues

### Experiencing problems? Have you tried our Stack Exchange first?

- [X] This is not a support question.

### Motivation

State sync can download several GiB of data when the chain state is large, which is not uncommon nowadays.
This poses a significant challenge for nodes with slow network connections. Additionally, since state sync currently lacks a persistence feature (#4), any network disruption forces the node to re-download the entire state, which is annoying.

### Request

Reduce the state sync download size.

### Solution

Compress the state response before sending it to the node, and decompress it on the receiver side.

```diff
diff --git a/substrate/client/network/sync/Cargo.toml b/substrate/client/network/sync/Cargo.toml
index 17e3e2119d..047fffa31f 100644
--- a/substrate/client/network/sync/Cargo.toml
+++ b/substrate/client/network/sync/Cargo.toml
@@ -48,6 +48,7 @@ sp-consensus = { workspace = true, default-features = true }
 sp-core = { workspace = true, default-features = true }
 sp-consensus-grandpa = { workspace = true, default-features = true }
 sp-runtime = { workspace = true, default-features = true }
+zstd = { workspace = true }

 [dev-dependencies]
 mockall = { workspace = true }
diff --git a/substrate/client/network/sync/src/engine.rs b/substrate/client/network/sync/src/engine.rs
index bb6e7a98a8..3915d3845e 100644
--- a/substrate/client/network/sync/src/engine.rs
+++ b/substrate/client/network/sync/src/engine.rs
@@ -1204,7 +1204,10 @@ where
        }

        fn decode_state_response(response: &[u8]) -> Result<OpaqueStateResponse, String> {
-               let response = StateResponse::decode(response)
+               let response = zstd::stream::decode_all(response).expect("Failed to uncompress state response");
+               let response = StateResponse::decode(response.as_slice())
                        .map_err(|error| format!("Failed to decode state response: {error}"))?;

                Ok(OpaqueStateResponse(Box::new(response)))
diff --git a/substrate/client/network/sync/src/state_request_handler.rs b/substrate/client/network/sync/src/state_request_handler.rs
index 0e713626ec..bb07bdd9bc 100644
--- a/substrate/client/network/sync/src/state_request_handler.rs
+++ b/substrate/client/network/sync/src/state_request_handler.rs
@@ -264,7 +272,15 @@ where

                        let mut data = Vec::with_capacity(response.encoded_len());
                        response.encode(&mut data)?;
-                       Ok(data)
+                       let compressed_data = zstd::stream::encode_all(data.as_slice(), 0).expect("Failed to compress state response");
+                       Ok(compressed_data)
                } else {
                        Err(())
                };
```
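
One point worth considering on the receiving side: the bytes come from a remote peer, so the `.expect(...)` in `decode_state_response` would panic on a malformed payload. Since that function already returns `Result<_, String>`, the failure could be propagated instead. The following is a minimal sketch of that idea, not the actual patch. The names `decompress_state_response` and `toy_decompress` are hypothetical, and the decompressor is passed in as a closure only so the example compiles without external crates.

```rust
/// Decompresses a state response payload, turning any failure into an `Err`
/// instead of panicking.
///
/// `decompress` is a hypothetical injection point so this sketch builds
/// without external crates. With the `zstd` crate it could be
/// `|bytes| zstd::stream::decode_all(bytes).map_err(|e| e.to_string())`.
fn decompress_state_response<D>(response: &[u8], decompress: D) -> Result<Vec<u8>, String>
where
    D: Fn(&[u8]) -> Result<Vec<u8>, String>,
{
    decompress(response)
        .map_err(|error| format!("Failed to decompress state response: {error}"))
}

/// Toy stand-in for a real decompressor (NOT zstd): the first byte must be
/// the marker `0x01`, and the remaining bytes are returned unchanged.
fn toy_decompress(bytes: &[u8]) -> Result<Vec<u8>, String> {
    match bytes.split_first() {
        Some((&0x01, rest)) => Ok(rest.to_vec()),
        Some(_) => Err("unknown marker byte".to_string()),
        None => Err("empty payload".to_string()),
    }
}
```

With this shape, a bad payload from a peer surfaces as an error string that the caller can handle the same way it already handles a `StateResponse::decode` failure.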

This is low-hanging fruit that can reduce the state sync data significantly, as my local experiments demonstrate. I ran state sync tests at various block heights (all below 300000) using both the `fast` and `fast-unsafe` modes for [subcoin](https://github.com/subcoin-project/subcoin). The Uncompressed Total State Sync Data is calculated as `sum(data.len())`, and the Compressed Total State Sync Data as `sum(compressed_data.len())`. The results are promising: the compressed responses were 34% to 42% of the uncompressed size, which indicates that several GiB of state sync data can be saved when dealing with large chain states. The final state size of subcoin may be 12+ GiB, so this optimization would greatly help its state sync.

| **`--sync`** | **Uncompressed Total State Sync Data (bytes)** | **Compressed Total State Sync Data (bytes)** | **Compressed/Uncompressed** |
|----------------------|-------------------------------|-----------------------------|-----------------------|
| **fast-unsafe**      | 149,517,161                   | 50,284,623                  | 0.34                  |
| **fast-unsafe**      | 205,400,559                   | 70,742,393                  | 0.34                  |
| **fast-unsafe**      | 597,683,313                   | 202,993,329                 | 0.34                  |
| **fast-unsafe**      | 1,239,830,694                 | 480,632,754                 | 0.39                  |
| **fast-unsafe**      | 2,182,810,408                 | 841,870,855                 | 0.39                  |
| **fast**             | 820,180,264                   | 338,889,711                 | 0.41                  |
| **fast**             | 1,486,307,891                 | 631,430,018                 | 0.42                  |

We can make this configurable if necessary.
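
For reference, the Compressed/Uncompressed column in the table above can be reproduced from the two byte counts. The sketch below copies the measurements from the table. The helper names (`compression_ratio`, `bytes_saved`, `print_report`) are made up for illustration and are not part of the proposed patch.

```rust
/// (`--sync` mode, uncompressed bytes, compressed bytes), copied from the table.
const MEASUREMENTS: [(&str, u64, u64); 7] = [
    ("fast-unsafe", 149_517_161, 50_284_623),
    ("fast-unsafe", 205_400_559, 70_742_393),
    ("fast-unsafe", 597_683_313, 202_993_329),
    ("fast-unsafe", 1_239_830_694, 480_632_754),
    ("fast-unsafe", 2_182_810_408, 841_870_855),
    ("fast", 820_180_264, 338_889_711),
    ("fast", 1_486_307_891, 631_430_018),
];

/// Compressed size as a fraction of the uncompressed size.
fn compression_ratio(uncompressed: u64, compressed: u64) -> f64 {
    compressed as f64 / uncompressed as f64
}

/// Number of bytes that no longer need to be transferred.
fn bytes_saved(uncompressed: u64, compressed: u64) -> u64 {
    uncompressed - compressed
}

/// Prints the ratio and the saved bytes for each measurement.
fn print_report() {
    for &(mode, uncompressed, compressed) in MEASUREMENTS.iter() {
        println!(
            "{mode}: ratio {:.2}, saved {} bytes",
            compression_ratio(uncompressed, compressed),
            bytes_saved(uncompressed, compressed),
        );
    }
}
```

Calling `print_report()` from `main` lists each run. For the largest `fast-unsafe` run, the difference between the two columns is 1,340,939,553 bytes.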


### Are you willing to help with this request?

Yes!
