@metricaez metricaez commented Oct 10, 2025

This is a DRAFT PR for initial review and architectural alignment

This PR implements the first milestone (M1) of a publish-subscribe mechanism for parachains, addressing issue #606.
This milestone focuses exclusively on the publishing flow, allowing parachains to publish key-value data to the relay chain for consumption by other parachains.

Subsequent milestones:

  • M2: Subscribe instruction
  • M3: Change detection optimizations

will be delivered in follow-up PRs built on top of this foundation.

Review Focus

This PR is a design proposal submitted for architectural review.

Please focus your attention on:

  • Component placement and architectural decisions
  • Data flow correctness and integration patterns
  • XCM instruction design and executor integration
  • Data structure design

Not in scope for this review:

  • Weights and benchmarks (currently hardcoded, will be properly implemented in follow-up PRs)
  • Production deployment specifics
  • Final XCM version placement (see note below)

Context: Issue #606

The original issue identified the challenge of expensive inter-parachain communication. Current methods (XCM messages, off-chain protocols) are complex and inefficient for broadcasting data across multiple parachains.

The proposed solution from the discussion:

  • Parachains can publish (key, value) data to the relay chain via XCM instruction
  • Data is stored in child tries per publisher (isolated storage)
  • Other parachains can access this data via their collators, through the validation data inherent
  • Efficient change detection using child trie roots

This PR implements the core publishing mechanism as discussed, following our best interpretation of the XCM Publish instruction pattern suggested by @bkchr in the issue thread.

Architecture Overview

Publishing Flow

Parachain A (Publisher) via pallet-xcm
    ↓ XCM: Publish { data: [(key, value), ...] }
Relay Chain XCM Executor
    ↓ BroadcastHandler::handle_publish()
Broadcaster Pallet
    ↓ Stores in child trie for Para A
    ↓ Emits DataPublished event
Relay Chain State (Child Trie Storage)
    ↓ Collator fetches via ParachainHost API v15
ParachainInherentData (published_data field)
    ↓ Passed to parachain runtime
Parachain B Runtime
    ✓ Data available for consumption

Components Implemented

1. XCM v5 Publish Instruction

Location: polkadot/xcm/src/v5/mod.rs

Publish { data: PublishData }

Allows parachains to publish bounded key-value data to the relay chain.

  • Type: PublishData = BoundedVec<(BoundedVec<u8, 32>, BoundedVec<u8, 1024>), 16>
  • Current limits: max 16 items, 32-byte keys, and 1024-byte values per operation. These values are arbitrary placeholders chosen for development.
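To make the limits above concrete, here is a minimal sketch of the bounds check that the PublishData type implies. It uses plain Vec values instead of BoundedVec so that it has no dependencies; the constant names, the error enum, and the check_publish_limits helper are assumptions made for illustration and are not taken from the PR.

```rust
// Limits quoted in the PR description (development placeholders).
const MAX_PUBLISH_ITEMS: usize = 16;
const MAX_KEY_LENGTH: usize = 32;
const MAX_VALUE_LENGTH: usize = 1024;

/// Hypothetical error type; the PR does not name its error variants.
#[derive(Debug, PartialEq)]
pub enum PublishDataError {
    TooManyItems,
    KeyTooLong,
    ValueTooLong,
}

/// Hypothetical helper: checks an unbounded payload against the three limits
/// that the nested BoundedVec type would enforce at construction time.
pub fn check_publish_limits(data: &[(Vec<u8>, Vec<u8>)]) -> Result<(), PublishDataError> {
    if data.len() > MAX_PUBLISH_ITEMS {
        return Err(PublishDataError::TooManyItems);
    }
    for (key, value) in data {
        if key.len() > MAX_KEY_LENGTH {
            return Err(PublishDataError::KeyTooLong);
        }
        if value.len() > MAX_VALUE_LENGTH {
            return Err(PublishDataError::ValueTooLong);
        }
    }
    Ok(())
}
```

With the real BoundedVec type, an oversized payload would fail at conversion time rather than in a separate check, so the sender cannot construct an out-of-bounds instruction at all.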

Note: The instruction is temporarily added to XCM v5 for testing and integration purposes. Final placement (potentially XCM v6) will be determined during the review process.

The instruction is intended to be sent via the pallet-xcm send call, together with the appropriate instructions to buy execution.

2. Broadcaster Pallet (Relay Chain)

Location: polkadot/runtime/parachains/src/broadcaster/

Core pallet managing published data on the relay chain.

Key features:

  • Child trie storage per publisher: Each parachain gets a deterministic child trie (ChildInfo::new_default(b"pubsub" + para_id.encode()))
  • Key tracking: PublishedKeys storage tracks all keys published by each parachain for enumeration
  • Validation: Enforces limits on items, key/value lengths, and total stored keys

Storage:
PublisherExists: Tracks which parachains have published data
PublishedKeys: Tracks all keys per publisher (for enumeration)

Traits:
PublishSubscribe: Exposes the publish and subscribe operations for pallets to implement. It is intended for pallet-broadcaster, but is provided as a trait to allow possible future integrations.

Main function:
pub fn handle_publish(origin_para_id: ParaId, data: Vec<(Vec<u8>, Vec<u8>)>) -> DispatchResult
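As a rough in-memory model of the behaviour described in this section (per-publisher isolation, key tracking, and a cap on stored keys), consider the sketch below. It is not the pallet code: BroadcasterModel, PublishError, child_trie_id, and the MAX_STORED_KEYS value are all assumptions for illustration, and the child trie identifier assumes that ParaId SCALE-encodes as a little-endian u32.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in for the MaxStoredKeys config item; the value is arbitrary.
const MAX_STORED_KEYS: usize = 4;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum PublishError {
    TooManyStoredKeys,
}

/// Child trie identifier as described above: b"pubsub" followed by the
/// encoded para id (assumed here to be a little-endian u32).
pub fn child_trie_id(para_id: u32) -> Vec<u8> {
    let mut id = b"pubsub".to_vec();
    id.extend_from_slice(&para_id.to_le_bytes());
    id
}

/// In-memory model of the pallet storage; not the real storage layout.
#[derive(Default)]
pub struct BroadcasterModel {
    /// child trie id -> (key -> value)
    tries: BTreeMap<Vec<u8>, BTreeMap<Vec<u8>, Vec<u8>>>,
    /// para id -> set of keys ever published (models PublishedKeys)
    published_keys: BTreeMap<u32, BTreeSet<Vec<u8>>>,
}

impl BroadcasterModel {
    pub fn handle_publish(
        &mut self,
        para_id: u32,
        data: Vec<(Vec<u8>, Vec<u8>)>,
    ) -> Result<(), PublishError> {
        let keys = self.published_keys.entry(para_id).or_default();
        // Compute the key set as it would look after this call, so that
        // overwriting an existing key does not count against the limit.
        let mut prospective = keys.clone();
        prospective.extend(data.iter().map(|(k, _)| k.clone()));
        if prospective.len() > MAX_STORED_KEYS {
            return Err(PublishError::TooManyStoredKeys);
        }
        *keys = prospective;
        // Write into the publisher's own trie; other publishers are untouched.
        let trie = self.tries.entry(child_trie_id(para_id)).or_default();
        for (key, value) in data {
            trie.insert(key, value);
        }
        Ok(())
    }

    pub fn get_published_value(&self, para_id: u32, key: &[u8]) -> Option<&Vec<u8>> {
        self.tries.get(&child_trie_id(para_id))?.get(key)
    }
}
```

Because the trie identifier is derived only from the para id, two publishers using the same key never collide, which is the isolation property the design relies on.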

3. BroadcastHandler Trait & Adapter

Location:

  • Trait: polkadot/xcm/xcm-executor/src/traits/broadcast_handler.rs
  • Adapter: polkadot/xcm/xcm-builder/src/broadcast_adapter.rs

BroadcastHandler trait:

pub trait BroadcastHandler {
    fn handle_publish(origin: &Location, data: PublishData) -> XcmResult;
}

ParachainBroadcastAdapter:

  • Validates XCM origin
  • Extracts ParaId from XCM Location
  • Provides filtering
  • Bridges XCM executor to broadcaster pallet
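The origin check performed by the adapter can be sketched as follows. The Location and Junction types here are simplified stand-ins rather than the real XCM types, and para_id_from_origin is a hypothetical helper name. The sketch assumes that, from the relay chain's point of view, a parachain origin has zero parents and exactly one Parachain junction.

```rust
/// Simplified stand-in for XCM's Junction; only two variants are modelled.
#[derive(Debug, Clone, PartialEq)]
pub enum Junction {
    Parachain(u32),
    AccountId32([u8; 32]),
}

/// Simplified stand-in for XCM's Location.
#[derive(Debug, Clone, PartialEq)]
pub struct Location {
    pub parents: u8,
    pub interior: Vec<Junction>,
}

/// Hypothetical helper: accepts only origins that are exactly a parachain
/// as seen from the relay chain, and returns that parachain's id.
pub fn para_id_from_origin(origin: &Location) -> Option<u32> {
    match (origin.parents, origin.interior.as_slice()) {
        (0, [Junction::Parachain(id)]) => Some(*id),
        // Accounts inside a parachain, or any other location, are rejected.
        _ => None,
    }
}
```

Rejecting anything deeper than the parachain itself means an account on a parachain cannot publish on that parachain's behalf unless the parachain's own runtime sends the message.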

4. XCM Executor Integration

Location: polkadot/xcm/xcm-executor/src/lib.rs

The executor processes Publish instructions by calling Config::BroadcastHandler::handle_publish().

5. XCM Executor Config Trait Extension

Location: polkadot/xcm/xcm-executor/src/config.rs

Added BroadcastHandler to the executor's Config trait:

pub trait Config {
    // ... existing config items
    type BroadcastHandler: BroadcastHandler;
}

This requires every XCM executor configuration to specify its broadcast handler implementation. A default implementation for () is provided.
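The dispatch through the associated type can be illustrated with a dependency-free sketch. All types below are simplified stand-ins for the real XCM ones, and the runtime names are hypothetical. One assumption deserves emphasis: the unit implementation here rejects the instruction, whereas the PR description does not say whether the provided () implementation rejects or silently accepts.

```rust
/// Simplified stand-ins for the XCM error and result types.
#[derive(Debug, PartialEq)]
pub enum XcmError {
    Unimplemented,
}
pub type XcmResult = Result<(), XcmError>;
pub type PublishData = Vec<(Vec<u8>, Vec<u8>)>;

/// Simplified handler trait; the origin is reduced to a plain para id.
pub trait BroadcastHandler {
    fn handle_publish(origin_para: u32, data: PublishData) -> XcmResult;
}

/// ASSUMPTION: the unit implementation rejects the instruction.
impl BroadcastHandler for () {
    fn handle_publish(_: u32, _: PublishData) -> XcmResult {
        Err(XcmError::Unimplemented)
    }
}

pub trait Config {
    type BroadcastHandler: BroadcastHandler;
}

/// Hypothetical handler that accepts every publish request.
pub struct AcceptAll;
impl BroadcastHandler for AcceptAll {
    fn handle_publish(_: u32, _: PublishData) -> XcmResult {
        Ok(())
    }
}

/// Hypothetical chain that does not support publishing.
pub struct NoBroadcastRuntime;
impl Config for NoBroadcastRuntime {
    type BroadcastHandler = ();
}

/// Hypothetical relay-like chain that wires in a real handler.
pub struct RelayLikeRuntime;
impl Config for RelayLikeRuntime {
    type BroadcastHandler = AcceptAll;
}

/// What the executor does for a Publish instruction, reduced to its core.
pub fn execute_publish<C: Config>(origin_para: u32, data: PublishData) -> XcmResult {
    C::BroadcastHandler::handle_publish(origin_para, data)
}
```

Chains that have no use for the instruction set the associated type to () and need no further changes, which keeps the cost of the new Config item low for existing runtimes.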

6. ParachainHost Runtime API (v14 → v15)

Location:

  • API definition: polkadot/primitives/src/runtime_api.rs

The ParachainHost runtime API was bumped to v15 with the addition of:

#[api_version(15)]
fn get_all_published_data() -> BTreeMap<ParaId, Vec<(Vec<u8>, Vec<u8>)>>

This method returns all published data from all publishers on the relay chain, which collators use to populate the inherent data.

Exposed methods:
fn get_all_published_data(para_id: ParaId) -> Vec<(Vec<u8>, Vec<u8>)>
fn get_published_value(para_id: ParaId, key: Vec<u8>) -> Option<Vec<u8>>

Design note: Currently returns ALL published data. In M2 (Subscribe), this will be filtered to only return data from parachains the subscriber is subscribed to.
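The filtering planned for M2 could look roughly like the sketch below. It is speculative: the PublishedData alias mirrors the return type shown above with ParaId reduced to u32, while filter_for_subscriber and the shape of the subscription set are assumptions, since M2 is not part of this PR.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Mirrors the API return type, with ParaId reduced to a plain u32.
pub type PublishedData = BTreeMap<u32, Vec<(Vec<u8>, Vec<u8>)>>;

/// Hypothetical M2-style filter: keeps only the publishers that the
/// requesting parachain has subscribed to.
pub fn filter_for_subscriber(
    all: &PublishedData,
    subscriptions: &BTreeSet<u32>,
) -> PublishedData {
    all.iter()
        .filter(|(publisher, _)| subscriptions.contains(*publisher))
        .map(|(publisher, items)| (*publisher, items.clone()))
        .collect()
}
```

Filtering on the relay chain side, rather than in the subscriber's runtime, would keep unrelated data out of the inherent and therefore out of the parachain block's proof size.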

7. Relay Chain Interface Extension

Location: cumulus/client/relay-chain-interface/src/lib.rs

Extended the RelayChainInterface trait with:

async fn get_all_published_data(
    &self,
    at: PHash,
) -> RelayChainResult<BTreeMap<ParaId, Vec<(Vec<u8>, Vec<u8>)>>>

Implementations:

  • RelayChainInProcessInterface - Direct runtime API call
  • RelayChainRpcInterface - RPC client call to relay chain node

This interface is used by collators in cumulus/client/parachain-inherent/src/lib.rs to fetch published data when building inherent data.

8. ParachainInherentData Extension

Location: cumulus/primitives/parachain-inherent/src/lib.rs

Added published_data field to ParachainInherentData:

pub struct ParachainInherentData {
    pub validation_data: PersistedValidationData,
    pub relay_chain_state: sp_trie::StorageProof,
    pub downward_messages: Vec<InboundDownwardMessage>,
    pub horizontal_messages: BTreeMap<ParaId, Vec<InboundHrmpMessage>>,
    pub relay_parent_descendants: Vec<RelayHeader>,
    pub collator_peer_id: Option<ApprovedPeerId>,
    pub published_data: BTreeMap<ParaId, Vec<(Vec<u8>, Vec<u8>)>>,  // NEW
}

Design rationale: This follows the same pattern as the existing message types (downward_messages, horizontal_messages): a direct field in the inherent data structure.

9. InboundPublishedData Wrapper

Location: cumulus/pallets/parachain-system/src/parachain_inherent.rs

Wrapper type for published data validation:

pub struct InboundPublishedData {
    pub data: BTreeMap<ParaId, Vec<(Vec<u8>, Vec<u8>)>>,
}

Purpose: Aligns with the SDK's pattern of wrapping inbound data types (InboundDownwardMessage, InboundHrmpMessage) for consistency and future extensibility.

10. Parachain-System Integration

Location: cumulus/pallets/parachain-system/src/lib.rs

Inherent creation (set_validation_data):

  • Receives published_data from collator via inherent
  • Validates data is included in inherent
  • Makes data available to parachain runtime

Note: This milestone only implements the data reception path. Subscription tracking and change detection optimizations will come in M2 and M3, respectively.
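Once the data has been received, a consuming pallet would typically look up a value by publisher and key. The sketch below shows such a lookup over the same map shape used in the inherent, with ParaId reduced to u32; lookup_published is a hypothetical helper, and the PR does not describe the accessor it actually exposes to runtime code.

```rust
use std::collections::BTreeMap;

/// Same shape as the published_data field, with ParaId reduced to u32.
pub type PublishedData = BTreeMap<u32, Vec<(Vec<u8>, Vec<u8>)>>;

/// Hypothetical accessor: returns the value a given publisher stored under
/// `key`, or None if the publisher or the key is absent.
pub fn lookup_published<'a>(
    published: &'a PublishedData,
    publisher: u32,
    key: &[u8],
) -> Option<&'a [u8]> {
    published
        .get(&publisher)?
        .iter()
        .find(|(k, _)| k.as_slice() == key)
        .map(|(_, v)| v.as_slice())
}
```

The linear scan over a publisher's entries is acceptable at the current limits, where each publish operation carries at most 16 items.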

11. Collator-Side Fetching Logic

Location: cumulus/client/parachain-inherent/src/lib.rs

Collators fetch published data when building inherent data:

let published_data = relay_chain_interface
    .get_all_published_data(relay_parent)
    .await
    .unwrap_or_default();

This data is then included in the ParachainInherentData passed to the parachain runtime.
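One consequence of unwrap_or_default() in the snippet above is that a failed fetch is indistinguishable from "nothing published". The synchronous sketch below isolates that behaviour; RelayChainError and published_data_or_empty are stand-ins invented for illustration, not the cumulus types.

```rust
use std::collections::BTreeMap;

/// Same shape as the fetched data, with ParaId reduced to u32.
pub type PublishedData = BTreeMap<u32, Vec<(Vec<u8>, Vec<u8>)>>;

/// Stand-in for the relay chain interface error type.
#[derive(Debug)]
pub struct RelayChainError(pub String);

/// Mirrors the fallback in the collator snippet: any fetch error is
/// replaced by an empty map and block production continues.
pub fn published_data_or_empty(
    fetched: Result<PublishedData, RelayChainError>,
) -> PublishedData {
    fetched.unwrap_or_default()
}
```

This keeps block production alive when the relay chain node is unreachable, at the cost of silently dropping the error; whether the error should at least be logged is a point reviewers may want to raise.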

12. Rococo Integration (Temporary)

Location: polkadot/runtime/rococo/src/lib.rs

The broadcaster pallet is integrated into the Rococo relay chain for testing purposes only.

Configuration:

impl polkadot_runtime_parachains::broadcaster::Config for Runtime {
    type MaxPublishItems = MaxPublishItems;
    type MaxKeyLength = MaxKeyLength;
    type MaxValueLength = MaxValueLength;
    type MaxStoredKeys = MaxStoredKeys;
}

Rationale: Enables reviewers to test the complete flow using Zombienet (config provided in pubsub-dev/zombienet.toml). This integration may be removed or moved to different chains in the final implementation.

Testing

Tests following SDK patterns are provided for the newly added components.

Local Testing with Zombienet

A Zombienet configuration is provided in pubsub-dev/ for local testing:
cd pubsub-dev
./build.sh # Build polkadot and polkadot-parachain
zombienet spawn zombienet.toml

This spins up:

  • Rococo relay chain (4 validators)
  • Penpal parachain (2 collators)

Note: The pubsub-dev/ directory is for review testing only and will be removed after the review process.

Extrinsics:

  • [Relay] Fund Parachain's Sovereign Account: 0x04030070617261e80300000000000000000000000000000000000000000000000000000b00407a10f35a
  • [Parachain] Publish some Data via pallet-xcm send call: 0x02003300050100050c000400000002286bee1300000002286bee003404143078313233143078313233

Known Limitations (To be addressed in follow-up PRs)

  • Hardcoded weights: All XCM and pallet weights are placeholders
  • No benchmarks: Benchmarking will be implemented in a follow-up PR
  • No subscription mechanism: M2 will add the Subscribe instruction and subscription tracking
  • No change detection: M3 will add root-based change detection to avoid unnecessary storage writes
  • No cleanup: Published data persists indefinitely (design decision to be discussed)
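For orientation, the root-based change detection planned for M3 might work along the lines of the sketch below. This is speculative and not part of this PR. The pseudo_root function uses a plain standard-library hash as a stand-in for a real child trie Merkle root, and RootCache is a hypothetical name for the state a subscriber would keep.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a child trie root: a plain hash over the sorted entries.
/// A real root would be the Merkle root of the publisher's child trie.
pub fn pseudo_root(entries: &BTreeMap<Vec<u8>, Vec<u8>>) -> u64 {
    let mut hasher = DefaultHasher::new();
    entries.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical subscriber-side cache of the last root seen per publisher.
#[derive(Default)]
pub struct RootCache {
    last_seen: BTreeMap<u32, u64>,
}

impl RootCache {
    /// Records `root` for `publisher` and returns true when it differs from
    /// the previously recorded root (or when none was recorded yet).
    pub fn has_changed(&mut self, publisher: u32, root: u64) -> bool {
        self.last_seen.insert(publisher, root) != Some(root)
    }
}
```

A subscriber would compare one root per publisher and only process the key-value pairs when has_changed returns true, so unchanged publishers cost a single comparison instead of a storage write per key.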

Closure

Please share any concerns, suggestions, or alternative approaches. This is an early-stage proposal and we welcome all input to align with the SDK's architecture and design principles.

Related: #606
