@Mustang98
Contributor

Optimized compression of the MERKLE_UPDATE part of the block, based on efficient state usage. Leads to up to a 50% decrease in block size.

@github-actions

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.


…n if we already checked them before state extraction.
@github-actions

  • validator/full-node-serializer.cpp:196-222 and 299-317 – tonNode_blockBroadcastCompressedV2 and tonNode_dataFullCompressedV2 now carry the proof out-of-band (f.proof_), but the deserializers no longer enforce any size limit on it. Previously the max_decompressed_size cap applied to the combined proof+data inside the compressed blob; now an attacker can ship an arbitrarily large proof_ and we accept/move it without bounds, which can exhaust memory or blow up logging while still passing the size check on data_compressed_/block_compressed_ (see the guard sketch below this list).

  • validator/net/download-block-new.cpp:229-264 – when a peer sends dataFullCompressedV2 that requires state to decompress, we parse the unverified proof and immediately call wait_state_by_prev_blocks, which can trigger expensive state downloads/merges for attacker-chosen prev block IDs before any proof or signature validation. A malicious peer could spam crafted proofs to force repeated state fetches/merges and tie up resources.
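A minimal guard sketch for the first point, keeping the names used above (f.proof_, max_decompressed_size); whether that particular constant is the right cap to reuse is an assumption:

```cpp
// Sketch only: bound the out-of-band proof before accepting/moving it.
// Reusing max_decompressed_size is an assumption; any explicit cap restores
// the bound that the old combined proof+data blob effectively had.
if (f.proof_.size() > max_decompressed_size) {
  return td::Status::Error("proof part of compressed block data is too large");
}
```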

@github-actions

Findings

  • extract_prev_blocks_from_proof only handles full BlockProof (block::gen::BlockProof::Record) and fails on proof links, but shard block broadcasts/dataFull still carry proof links. All stateful-decompression paths call this helper (e.g. FullNodeShardImpl::obtain_state_for_decompression, FullNodeCustomOverlay::obtain_state_for_decompression, FullNodeFastSyncOverlay::obtain_state_for_decompression, DownloadBlockNew::got_data). For any shard block using the new state-based compressed V2 format, extract_prev_blocks_from_proof will return “invalid block proof” and the broadcast/download is dropped/aborted, making the new compression unusable for shard blocks. Ref: validator/full-node-serializer.cpp:133-159, validator/full-node-shard.cpp:864-882, validator/full-node-custom-overlays.cpp:68-86, validator/full-node-fast-sync-overlays.cpp:62-82, validator/net/download-block-new.cpp:229-264.
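A rough sketch of one possible shape for the fix: try the full BlockProof first and fall back to treating the root as a proof-link Merkle proof. The helper's real signature, the exact proof-link handling, and the use of block::unpack_block_prev_blk_ext here are assumptions from reading the code, not the intended patch:

```cpp
// Sketch only: accept both a full BlockProof and a bare proof link before
// collecting prev block ids for wait_state_by_prev_blocks.
td::Result<std::vector<BlockIdExt>> extract_prev_blocks_from_proof(td::Ref<vm::Cell> root,
                                                                   const BlockIdExt& block_id) {
  td::Ref<vm::Cell> proof_root = root;
  block::gen::BlockProof::Record full_proof;
  if (tlb::unpack_cell(root, full_proof)) {
    proof_root = full_proof.root;  // full BlockProof (masterchain path)
  }                                // else: assume root is already the proof-link Merkle proof (shard path)
  auto virt_root = vm::MerkleProof::virtualize(proof_root, 1);
  if (virt_root.is_null()) {
    return td::Status::Error("invalid block proof");
  }
  std::vector<BlockIdExt> prev;
  BlockIdExt mc_blkid;
  bool after_split;
  TRY_STATUS(block::unpack_block_prev_blk_ext(virt_root, block_id, prev, mc_blkid, after_split));
  return prev;
}
```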

Notes

  • No other blocking issues spotted.

…lizer to distinguish validator_session and collator_node
@github-actions

Two critical issues spotted in the new MERKLE_UPDATE compression handling:

  • validator/full-node-serializer.cpp still serializes broadcasts and full blocks in the old V2 layout (proof+data packed together and LZ4’d into compressed_; see serialize_block_broadcast around lines 20-67 and serialize_block_full around lines 120-156), but the TL schema now declares separate proof and data_compressed/block_compressed fields (tl/generate/scheme/ton_api.tl:433-466), and the new deserializers expect only one BOC root plus an external proof (validator/full-node-serializer.cpp:196-215, 250-277). Outbound messages won’t match the new schema, and the code will reject peers still speaking the old V2 (two-root) format. This is a protocol/compatibility blocker; the serializers (and generated TL files) need to be updated to the new field layout and single-root compression (see the layout sketch below this list).

  • In the block download path, shard state is now fetched before any cryptographic validation: validator/net/download-block-new.cpp:225-260 extracts prev blocks from the untrusted proof in a tonNode_dataFullCompressedV2 and calls wait_state_by_prev_blocks prior to verifying the proof/signatures. A malicious peer can respond with bogus dataFullCompressedV2 messages to force expensive state loads/merges and only fail later during decompression/validation. The state fetch should be gated on proof/signature verification (or at least minimal validation) to avoid this DoS vector.
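For the first point, a minimal sketch of the single-root layout the new deserializers appear to expect; the field names are taken from this review rather than the generated TL headers, obj stands in for the outgoing TL object, and the helpers (vm::std_boc_serialize, td::lz4_compress) are the ones I believe the file already uses, so treat this as an assumption rather than the intended change:

```cpp
// Sketch only: the proof travels out-of-band and uncompressed, while only the
// single data root is serialized to a BOC and LZ4-compressed.
TRY_RESULT(proof_boc, vm::std_boc_serialize(proof_root, 0));
TRY_RESULT(data_boc, vm::std_boc_serialize(data_root, 31));
obj->proof_ = std::move(proof_boc);
obj->block_compressed_ = td::lz4_compress(data_boc);
```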

No tests were added for these paths. If you want, I can draft the serializer changes for the new TL layout and add a guard in the download path to validate before fetching state.
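As a starting point for that guard, a sketch of a cheap check that at least ties the unverified proof to the block we actually requested before wait_state_by_prev_blocks is scheduled; proof_root, block_id_ and abort_query are the names I assume the actor already has, and the real fix likely wants full proof/signature verification rather than just this hash check:

```cpp
// Sketch only: before requesting (potentially expensive) state, make sure the
// Merkle proof is at least a proof of the block we asked for.
auto virt_root = vm::MerkleProof::virtualize(proof_root, 1);
if (virt_root.is_null() || virt_root->get_hash().as_slice() != block_id_.root_hash.as_slice()) {
  abort_query(td::Status::Error("proof does not correspond to the requested block"));
  return;
}
// ...only then extract prev blocks and call wait_state_by_prev_blocks...
```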

…ip others because they aren't consistent with state.
@github-actions

  • validator/full-node-serializer.cpp:133-152 assumes the broadcast proof is a BlockProof, but shard broadcasts/data responses carry a proof link. tlb::unpack_cell will fail for shard proofs, so extract_prev_blocks_from_proof returns an error and the new state-preload path drops shard blockBroadcastCompressedV2/dataFullCompressedV2 messages that require state (fast-sync/custom/public overlays and DownloadBlockNew). Shard blocks compressed with the new MU algorithm would never decompress. Needs to handle proof links too.
  • validator/full-node-serializer.cpp:188-191 builds a BlockBroadcast with static_cast<UnixTime>(f.catchain_seqno_); BlockBroadcast.catchain_seqno is CatchainSeqno (uint32). This braced init performs a narrowing conversion, so it should not compile (and would store the wrong type if it did). It should be constructed with a CatchainSeqno instead.

@github-actions

  • validator/full-node-serializer.cpp:133 extract_prev_blocks_from_proof unpacks only BlockProof. Shard block broadcasts/dataFull carry a BlockProofLink (ValidateBroadcast uses create_proof_link for non-mc), so unpacking fails → wait_state_by_prev_blocks never runs and every V2 shard broadcast/data-full that requires state is dropped. The new Merkle-update compression therefore cannot be used for shard blocks. Handle proof links in this helper (or select the correct parser based on shard) before requesting state.
