Commit 2cd41e1

authored
1 parent 54c8225 commit 2cd41e1

1 file changed

+105
-0
lines changed

text/0139-faster-erasure-coding.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
# RFC-0139: Faster Erasure Coding

|                 |                                              |
| --------------- | -------------------------------------------- |
| **Start Date**  | 7 March 2025                                 |
| **Description** | Faster algorithm for Data Availability Layer |
| **Authors**     | ordian                                       |

## Summary

This RFC proposes changing the erasure coding algorithm and the method for computing the erasure root on Polkadot to improve the performance of both.

## Motivation

The Data Availability (DA) Layer in Polkadot provides a foundation for
shared security, enabling Approval Checkers and Collators to download
Proofs-of-Validity (PoVs) for security and liveness purposes, respectively.
As the number of parachains and the size of PoVs increase, optimizing the
performance of the DA layer becomes increasingly critical.

[RFC-47](https://github.com/polkadot-fellows/RFCs/blob/main/text/0047-assignment-of-availability-chunks.md)
proposed enabling systematic chunk recovery for Polkadot's DA to improve
efficiency and reduce CPU overhead. However, while it helps under the assumption of
good network connectivity to a specific one-third of validators (modulo some
backup tolerance on backers), it still requires re-encoding. Therefore,
we need to ensure the system can handle the load in the worst-case scenario.
The proposed change is orthogonal to RFC-47 and can be used in conjunction with it.

Since RFC-47 already requires a breaking protocol change (including changes to
collator nodes), we propose bundling another performance-enhancing breaking
change that addresses the CPU bottleneck in the erasure coding process, but
activating it through a separate node feature (`NodeFeatures`, part of `HostConfiguration`).

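To make the systematic fast path and the re-encoding step mentioned above concrete, here is a toy sketch in Python. It is not the algorithm proposed in this RFC: it uses naive Lagrange interpolation over the small prime field GF(257), and the function names (`interpolate_at`, `encode`, `recover`) and the parameters k = 3, n = 9 are assumptions chosen purely for illustration.

```python
# Toy systematic Reed-Solomon code over the prime field GF(257).
# Assumption: illustration only; real implementations use binary fields
# and FFT-based SIMD arithmetic instead of naive interpolation.
P = 257


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points (Lagrange form)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Systematic encoding: chunks 0..k-1 are the data, k..n-1 are parity."""
    points = list(enumerate(data))
    parity = [interpolate_at(points, x) for x in range(len(data), n)]
    return list(data) + parity


def recover(available, k):
    """Recover the k data symbols from any k available {index: value} chunks."""
    points = list(available.items())[:k]
    return [interpolate_at(points, x) for x in range(k)]


data = [10, 20, 30]        # k = 3 data symbols
chunks = encode(data, 9)   # n = 9 chunks, one per hypothetical validator

# Fast path: the systematic chunks already are the data.
assert chunks[:3] == data

# Worst case: only non-systematic chunks are reachable.
recovered = recover({4: chunks[4], 7: chunks[7], 8: chunks[8]}, 3)
assert recovered == data

# In both cases the full encoding is recomputed to check the chunks.
assert encode(recovered, 9) == chunks
```

The last assertion is the point of the sketch: even when the systematic chunks make recovery trivial, the full encoding is still recomputed, so the cost of encoding remains on the critical path.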
## Stakeholders

- Infrastructure providers (operators of validator/collator nodes)
  will need to upgrade their client version in a timely manner

## Explanation

We propose two specific changes:

1. Switch to the erasure coding algorithm described in the Graypaper,
   Appendix H. SIMD implementations of this algorithm are available in:

   - [Rust](https://github.com/AndersTrier/reed-solomon-simd)
   - [C++](https://github.com/catid/leopard)
   - [Go](https://github.com/celestiaorg/go-leopard)

2. Replace the Merkle Patricia Trie with a Binary Merkle Tree for computing the erasure root.

The reference root merklization implementation can be found [here](https://github.com/paritytech/erasure-coding/blob/512e77472beb877fe0881a857623d54d97b82bc4/src/merklize.rs#L9-L197).

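As a rough illustration of the second change, the sketch below builds a generic binary Merkle tree over chunks and produces an inclusion proof for one of them. It is a minimal sketch, not the reference merklization linked above: the `leaf`/`node` domain-separation prefixes, the choice of 32-byte BLAKE2b, the duplication of the last node on odd levels, and all function names are assumptions made for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumed hash: BLAKE2b truncated to 32 bytes.
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof), where proof lists the sibling hashes
    on the path from leaf `index` up to the root."""
    level = [h(b"leaf" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:           # odd level: duplicate the last node
            level.append(level[-1])
        proof.append(level[index ^ 1])
        level = [h(b"node" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(root, leaf, index, proof):
    """Recompute the root from a leaf and its sibling hashes."""
    node = h(b"leaf" + leaf)
    for sibling in proof:
        pair = node + sibling if index % 2 == 0 else sibling + node
        node = h(b"node" + pair)
        index //= 2
    return node == root


chunks = [bytes([i % 256]) * 4 + i.to_bytes(2, "big") for i in range(1024)]
root, proof = merkle_root_and_proof(chunks, 5)
assert verify(root, chunks[5], 5, proof)
assert not verify(root, chunks[6], 5, proof)
assert len(proof) == 10   # one sibling hash per level for 1024 leaves
```

A proof is one sibling hash per tree level, so its size grows with the logarithm of the number of chunks; this is what makes binary proofs compact compared with a higher-arity trie.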
### Upgrade path

We propose adding support for the new erasure coding scheme on both the validator and collator sides without activating it until:
1. All validators have upgraded
2. Most collators have upgraded

Once the scheme is activated, block-authoring collators that remain on the old version will be unable to produce valid candidates until they upgrade. Parachain full nodes will continue to function normally without changes.

An alternative approach would be to allow collators to opt in to the new erasure
coding scheme using a reserved field in the candidate receipt. This would allow
faster deployment for most parachains but would add complexity.

Given that there is currently no urgent demand for larger PoVs, we recommend prioritizing simplicity while keeping a way to implement future-proofing changes.

In short, the following steps are proposed:
1. Implement the changes and wait for most collators to upgrade.
2. Activate RFC-47 via a `Configuration::set_node_feature` runtime change.
3. Activate the new erasure coding scheme using another `Configuration::set_node_feature` runtime change.

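The staged activation described above can be modelled as two independent bits in a feature bitfield. The following is a minimal sketch under stated assumptions: the bit positions, the `Feature` enum, and the helper names are hypothetical and only mimic the idea behind `NodeFeatures` and `Configuration::set_node_feature`; they are not the runtime's actual types or indices.

```python
from enum import IntEnum


class Feature(IntEnum):
    # Hypothetical bit positions; the real indices are defined by the runtime.
    SYSTEMATIC_RECOVERY = 0   # stands in for RFC-47
    FAST_ERASURE_CODING = 1   # stands in for this RFC


def is_enabled(node_features: int, feature: Feature) -> bool:
    """Check whether the bit for `feature` is set."""
    return (node_features >> feature) & 1 == 1


def set_node_feature(node_features: int, feature: Feature, enabled: bool) -> int:
    """Return a copy of the bitfield with one feature bit set or cleared."""
    mask = 1 << feature
    return node_features | mask if enabled else node_features & ~mask


def erasure_scheme(node_features: int) -> str:
    """Pick the coding scheme a node would use for new candidates."""
    return "new" if is_enabled(node_features, Feature.FAST_ERASURE_CODING) else "legacy"


features = 0                                  # step 1: code shipped, nothing active
assert erasure_scheme(features) == "legacy"

features = set_node_feature(features, Feature.SYSTEMATIC_RECOVERY, True)   # step 2
assert erasure_scheme(features) == "legacy"   # RFC-47 alone does not switch schemes

features = set_node_feature(features, Feature.FAST_ERASURE_CODING, True)   # step 3
assert erasure_scheme(features) == "new"
```

Keeping the two features on separate bits means either one can be activated, or rolled back, without touching the other.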
## Drawbacks

Bundling this breaking change with RFC-47 might reset progress in updating collators. However, the omni node initiative should help mitigate this issue.

## Testing, Security, and Privacy

Testing is needed to ensure binary compatibility across implementations in multiple languages.

## Performance and Compatibility

### Performance

According to [benchmarks](https://gist.github.com/ordian/0af2822e20bf905d53410a48dc122fd0):
- A proper SIMD implementation of Reed-Solomon is 3-4× faster for encoding and up to 9× faster for full decoding than the current implementation
- Binary Merkle Tree proofs are 4× smaller than Merkle Patricia Trie proofs and slightly faster to generate and verify

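The proof-size difference can be sanity-checked with a back-of-the-envelope model. The sketch below is not a measurement and does not reproduce the linked benchmarks: it assumes 32-byte hashes, 1024 chunks, and an idealised radix-16 trie in which every level contributes up to 15 sibling hashes, ignoring node encoding overhead and sparsity. All names are hypothetical.

```python
HASH_BYTES = 32  # assumed digest size


def levels(n_leaves: int, arity: int) -> int:
    """Number of levels needed for a full tree of the given arity."""
    depth, capacity = 0, 1
    while capacity < n_leaves:
        capacity *= arity
        depth += 1
    return depth


def binary_proof_bytes(n_leaves: int) -> int:
    """One sibling hash per level of a binary tree."""
    return levels(n_leaves, 2) * HASH_BYTES


def radix16_proof_bytes_upper_bound(n_leaves: int) -> int:
    """Idealised upper bound: up to 15 sibling hashes per 16-ary level."""
    return levels(n_leaves, 16) * 15 * HASH_BYTES


n = 1024  # assumed number of chunks
assert binary_proof_bytes(n) == 320
assert radix16_proof_bytes_upper_bound(n) == 1440
print(radix16_proof_bytes_upper_bound(n) / binary_proof_bytes(n))  # 4.5
```

Under these assumptions the model gives a ratio of 4.5, which is in the same range as the benchmarked figure; actual trie proofs depend on key layout and encoding, so the benchmark remains the authoritative number.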
### Compatibility

This requires a breaking change that can be coordinated following the same approach as in RFC-47.

## Prior Art and References

JAM already uses the same optimizations, as described in the Graypaper.

## Unresolved Questions

None.

## Future Directions and Related Material

Future improvements could include:
- Using ZK proofs to eliminate the need for re-encoding data to verify correct encoding
- Removing the requirement for collators to compute the erasure root for the collator protocol
