V3 Candidate Descriptor Support with Explicit Scheduling Parent + node feature #10472
base: rk-prospective-parachains-cleanup
+ Fix a typo.
This change works towards support for V3 candidate descriptors, which allow the scheduling parent (the relay block used for core assignment) to differ from the relay parent (the block the parachain builds on). This is a prerequisite for low-latency collation.
Key changes:
collation-generation:
- Add comprehensive module documentation explaining the two modes of operation (CollatorFn callback vs SubmitCollation message) and V2/V3 descriptor differences
- Pass scheduling_parent through to construct_and_distribute_receipt()
- Create V3 descriptors when scheduling_parent is Some, V2 otherwise
candidate-backing:
- Rename PerRelayParentState to PerSchedulingParentState to reflect that state is now keyed by scheduling parent, not relay parent
- Store session_index in PerSchedulingParentState for V1 fallback (where session is not in the descriptor)
- Fetch executor_params on-demand using session from descriptor (V2/V3) or from scheduling parent state (V1 fallback), rather than storing it per scheduling parent
- Simplify core_index_from_statement() to take PerSchedulingParentState
prospective-parachains:
- Add tests for V3 candidate descriptor handling
primitives:
- Add new_v3() constructor for CandidateDescriptorV2 with explicit scheduling_parent parameter
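The collation-generation rule above (V3 when a scheduling parent is supplied, V2 otherwise) can be sketched as follows. This is a minimal illustration, not the code in this PR: `Hash`, `Descriptor`, and `build_descriptor` are hypothetical stand-ins, and the real descriptor carries many more fields.

```rust
/// Hypothetical stand-in for a 32-byte relay chain block hash.
type Hash = [u8; 32];

/// Simplified descriptor holding only the fields relevant to this sketch.
#[derive(Debug, PartialEq)]
enum Descriptor {
    /// V2: the scheduling context is implicitly the relay parent.
    V2 { relay_parent: Hash },
    /// V3: the scheduling parent is carried explicitly.
    V3 { relay_parent: Hash, scheduling_parent: Hash },
}

/// Build a V3 descriptor when a scheduling parent is supplied, V2 otherwise.
fn build_descriptor(relay_parent: Hash, scheduling_parent: Option<Hash>) -> Descriptor {
    match scheduling_parent {
        Some(scheduling_parent) => Descriptor::V3 { relay_parent, scheduling_parent },
        None => Descriptor::V2 { relay_parent },
    }
}
```

Keying the version on the `Option` means a caller cannot ask for V3 without also providing the hash it needs.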
This commit introduces several related improvements to the backing and
validation subsystems:
1. Add BackableCandidateRef struct
- Replaces bare (CandidateHash, Hash) tuples with type-safe struct
- Explicitly names scheduling_parent field for clarity
- Prevents accidental field swapping or wrong hash usage
2. Convert subsystem messages to named enum fields
- CandidateBackingMessage::GetBackableCandidates
- CandidateBackingMessage::Second
- CandidateBackingMessage::Statement
- ProspectiveParachainsMessage::GetBackableCandidates
- Improves code readability and IDE support
3. Fix scheduling parent terminology
- Rename candidate_relay_parent → candidate_scheduling_parent in
CanSecondRequest
- Fix variable naming: relay_parent → scheduling_parent where
semantically correct
- Update comments and logs to use accurate terminology
- Distinguish between execution context (relay_parent) and scheduling
context (scheduling_parent)
4. Add ValidationContext struct to PVF subsystem
- Encapsulates candidate receipt, PVD, PoV, and execution params
- Provides helper methods for accessing relay_parent and
scheduling_parent
- Reduces parameter explosion in validation code paths
- ExecuteRequest now includes scheduling_parent and
descriptor_version
5. Update fragment chain to track V3 scheduling_parent
- CandidateEntry now stores both relay_parent and scheduling_parent
- Validates relay_parent ancestry while using scheduling_parent for
group assignment
- Adds v3_enabled parameter to candidate entry creation
All changes are internal to the node - no network protocol changes. This
prepares the codebase for proper V3 candidate handling where
relay_parent (execution) and scheduling_parent (scheduling) can differ.
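As a rough illustration of points 1 and 2, the sketch below contrasts a named struct and a struct-style enum variant with the bare tuple they replace. Only `BackableCandidateRef`, `candidate_hash`, `scheduling_parent`, and `GetBackableCandidates` come from the commit message; the `candidates` field name, the simplified types, and the `scheduling_parents` helper are assumptions made for this example.

```rust
/// Hypothetical stand-in for a 32-byte hash.
type Hash = [u8; 32];

/// Simplified newtype for a candidate hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CandidateHash(Hash);

/// Named replacement for a bare `(CandidateHash, Hash)` tuple.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BackableCandidateRef {
    candidate_hash: CandidateHash,
    /// The relay chain block that determines the backing group.
    scheduling_parent: Hash,
}

/// Reduced message enum; the `candidates` field name is an assumption.
enum CandidateBackingMessage {
    GetBackableCandidates { candidates: Vec<BackableCandidateRef> },
}

/// Collect the scheduling parents referenced by a message (illustrative helper).
fn scheduling_parents(msg: &CandidateBackingMessage) -> Vec<Hash> {
    match msg {
        CandidateBackingMessage::GetBackableCandidates { candidates } => {
            candidates.iter().map(|c| c.scheduling_parent).collect()
        }
    }
}
```

With named fields, a reader of the match arm can see which hash is being used without checking tuple positions.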
scheduling_parent
- Add v3_collation protocol imports for V3 AdvertiseCollation messages
- Add version field to PeerData for protocol version negotiation
- Rename PerRelayParent -> PerSchedulingParent throughout
- Add v3_enabled flag to PerSchedulingParent from node_features
- Update PendingCollation to track advertised_descriptor_version for V3
- Unify PendingCollation::new and new_v3 into a single constructor
- Fix borrow checker issues by passing CollationVersion directly
- Update all tests to use V3 protocol where appropriate
ValidationParamsExtension
This commit introduces the concept of scheduling_parent as distinct from relay_parent (execution parent) across node subsystems and extends the PVF interface to pass both hashes for V3+ candidates.
For V1/V2 candidates: scheduling_parent == relay_parent (implicitly)
For V3 candidates: scheduling_parent may differ from relay_parent
The scheduling_parent determines:
- Which validator group is assigned to back the candidate
- Which per-parent state to use for candidate tracking
- The context for claim queue lookups and validator assignments
The relay_parent determines:
- The execution context (relay chain block state)
- Parent head data and storage root
Add ValidationParamsExtension for V3+ candidates:
- New versioned enum appended to ValidationParams encoding
- V3 variant contains both relay_parent and scheduling_parent hashes
- TrailingOption wrapper enables backward compatibility with V1/V2
- PVFs decode extension only if bytes remain (V3), otherwise None (V1/V2)
- Add comprehensive safety warnings to TrailingOption about its constraints
This allows PVFs to distinguish between scheduling and execution contexts starting with V3 candidates.
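The "decode only if bytes remain" idea can be sketched without any codec library. The byte layout here (one variant tag byte, then two 32-byte hashes) and the function name `decode_trailing_extension` are assumptions for illustration; the PR's actual `TrailingOption` encoding may differ.

```rust
/// Hypothetical stand-in for a 32-byte hash.
type Hash = [u8; 32];

/// Simplified extension enum mirroring the V3 variant described above.
#[derive(Debug, PartialEq)]
enum ValidationParamsExtension {
    V3 { relay_parent: Hash, scheduling_parent: Hash },
}

/// Decode an optional extension from the bytes left over after the fixed
/// fields. Assumed layout: tag byte (0 = V3) followed by two 32-byte hashes.
fn decode_trailing_extension(
    rest: &[u8],
) -> Result<Option<ValidationParamsExtension>, &'static str> {
    // No bytes left: an old-style (V1/V2) encoding, so there is no extension.
    if rest.is_empty() {
        return Ok(None);
    }
    if rest[0] != 0 {
        return Err("unknown extension variant");
    }
    if rest.len() != 1 + 32 + 32 {
        return Err("unexpected extension length");
    }
    let mut relay_parent = [0u8; 32];
    let mut scheduling_parent = [0u8; 32];
    relay_parent.copy_from_slice(&rest[1..33]);
    scheduling_parent.copy_from_slice(&rest[33..65]);
    Ok(Some(ValidationParamsExtension::V3 { relay_parent, scheduling_parent }))
}
```

Because presence is signalled only by leftover bytes, a wrapper like this can only work as the final field of an encoding, which is one plausible reason it needs careful documentation.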
CandidateBackingMessage:
- GetBackableCandidates: Introduce BackableCandidateRef type with candidate_hash + scheduling_parent, convert to struct variant
- Second: Convert to struct variant with explicit scheduling_parent field
- Statement: Add scheduling_parent field to track backing context
- CanSecondRequest: Rename candidate_relay_parent → candidate_scheduling_parent
CollationSecondedSignal:
- Rename relay_parent → scheduling_parent with clarifying documentation
SubmitCollationParams:
- Add optional scheduling_parent field for V3 descriptor creation
statement-distribution/v2:
- Rename PerRelayParentState → PerSchedulingParentState
- Rename per_relay_parent map → per_scheduling_parent
- Update all lookups to use scheduling_parent as the key
- Update comments distinguishing scheduling vs execution context
- Refactor descriptor version from two booleans to CandidateDescriptorVersionConfig enum (V1/V2/V3 variants) eliminating invalid state combinations
- Remove obsolete CandidateReceiptV2 feature flag checks (19 instances, 114 lines); V2 is now always accepted regardless of feature flag (graduated in commit 4cdf77e)
- Update paras_inherent filtering documentation
- Add comments in grid tests clarifying relay_parent serves dual role for V1/V2
- Fix indentation in CandidateBackingMessage::Statement pattern matches
This is a preparatory refactoring. V1/V2 behavior is unchanged:
- ValidationParams encoding unchanged (extension appended only for V3)
- V
// must all be 0 by accident to cause any issues. Bitcoin hardest
// difficulty so far has been 24 digits/12 bytes
What's up with Bitcoin ? :)
/// return V3 descriptors. When `false`, the function preserves pre-V3
/// behavior for backwards compatibility - see explanation above.
pub fn version(&self, v3_enabled: bool) -> CandidateDescriptorVersion {
let old_v1_detected = self.reserved2 != [0u8; 32] ||
This should be v1, not really old v1.
/// via node features. When `true`, the function will properly detect and
/// return V3 descriptors. When `false`, the function preserves pre-V3
/// behavior for backwards compatibility - see explanation above.
pub fn version(&self, v3_enabled: bool) -> CandidateDescriptorVersion {
Why do we need this flag ? The primitive is not aware of feature flags, and the versioning information is contained fully in it.
pub(super) core_index: u16,
/// The session index of the candidate relay parent.
session_index: SessionIndex,
/// Session index for determining secondary checkers.
Before fixing session boundaries, session_index should be equal to scheduling_session_offset.
/// The root of a block's erasure encoding Merkle tree.
erasure_root: Hash,
/// The relay chain block determining scheduling.
scheduling_parent: H, // Introduced in v3
Seems we are running out of space, one more hash left to add 😢
candidate: &BackedCandidate<T::Hash>,
allowed_relay_parents: &AllowedRelayParentsTracker<T::Hash, BlockNumberFor<T>>,
allow_v2_receipts: bool,
v3_enabled: bool,
also update fn name as it now also checks v3.
// Check if session index is equal to current session index.
if session_index != shared::CurrentSessionIndex::<T>::get() {
// Check if scheduling session is equal to current session index.
if scheduling_session != shared::CurrentSessionIndex::<T>::get() {
Also check candidate.descriptor().session_index(). For now they should still be the same.
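The suggestion above amounts to requiring that both session values agree with the current session. A minimal sketch, assuming `SessionIndex` is a `u32` and using a hypothetical free function rather than the pallet's real storage access:

```rust
/// Assumed to be a plain `u32` for this sketch.
type SessionIndex = u32;

/// Accept a candidate only if both the scheduling session and the
/// descriptor's own session index equal the current session.
fn sessions_match_current(
    scheduling_session: SessionIndex,
    descriptor_session: SessionIndex,
    current: SessionIndex,
) -> bool {
    scheduling_session == current && descriptor_session == current
}
```

While the two values are still expected to be identical, checking both guards against a descriptor whose fields disagree.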
iulianbarbu left a comment
Just a check in review - local zn-sdk testing is underway.
v3_enabled,
scheduling_parent,
feels redundant to have both. Can't we assume v3 enabled if scheduling_parent.is_some()?
self.candidate_hash()
}

// Uses default implementation: returns relay_parent()
nit: not that useful, can be removed
vec![notification.into()],
metrics,
)
} else {
We should handle a message for V3 here too. Logs during zn-sdk tests show "Major logic bug. Peer somehow has unsupported collation protocol version.", which no longer appears after this fix.
diff --git a/polkadot/node/network/bridge/src/rx/mod.rs b/polkadot/node/network/bridge/src/rx/mod.rs
index 7a8f2e3133..b00956dfd9 100644
--- a/polkadot/node/network/bridge/src/rx/mod.rs
+++ b/polkadot/node/network/bridge/src/rx/mod.rs
@@ -587,6 +587,15 @@ async fn handle_collation_message<AD>(
metrics,
)
} else if expected_versions[PeerSet::Collation] == Some(CollationVersion::V2.into())
+ {
+ handle_peer_messages::<protocol_v2::CollationProtocol, _>(
+ peer,
+ PeerSet::Collation,
+ &mut shared.0.lock().collation_peers,
+ vec![notification.into()],
+ metrics,
+ )
+ } else if expected_versions[PeerSet::Collation] == Some(CollationVersion::V3.into())
{
handle_peer_messages::<protocol_v2::CollationProtocol, _>(
peer,

}
}
},
NetworkBridgeTxMessage::SendRequests(reqs, if_disconnected) => {
Probably something must be handled here for the V3 CollationVersion as well. I am seeing logs like the following in local testing, while parachain finalization is not happening and collations expire. Will try to address it tomorrow.
2026-01-28 17:44:48.032 DEBUG tokio-runtime-worker parachain::collator-protocol: [Relaychain] Collation was advertised but not requested by any validator. candidate_hash=0x94b8d1bf7709355052dca13d8ca33018d46956ba2133b185e20a72d02de13f06 pov_hash=0x2766fcabaee5da227b317ff0f41222e586bd4191d7e93a3c2d658507ee258870 traceID=197685380191164009165270678908973559832
Overview
This PR introduces V3 candidate descriptors with an explicit `scheduling_parent` field, separating the scheduling context (which determines validator group assignment) from the execution context (which determines parachain state). This is a critical foundation for enabling lookahead scheduling and improving parachain block production flexibility in async backing.
Key Innovation: V3 candidates can be scheduled based on a different relay chain block than the one they execute against, enabling validators to assign backing responsibilities ahead of time while maintaining correct execution semantics.
Problem
In V1 and V2 candidate descriptors, the `relay_parent` field serves a dual purpose:
This tight coupling limits flexibility:
Solution
V3 candidates introduce an explicit `scheduling_parent` field that decouples these concerns:
- `relay_parent`: Determines execution context (parachain state root, claim queue state for core assignment)
- `scheduling_parent`: Determines validator group assignment (which validators back this candidate)
For backward compatibility:
- V1/V2: `scheduling_parent == relay_parent` (implicit, behavior unchanged)
- V3: `scheduling_parent` can differ from `relay_parent` (explicit field in descriptor)
This separation enables lookahead scheduling where parachains can be assigned to validator groups on future relay chain blocks while still executing against older state.
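The fallback rule (explicit field for V3, relay parent for V1/V2) can be sketched as an accessor. The types below are simplified stand-ins, and the sketch deliberately leaves out the `v3_enabled` argument that the PR's real accessor takes, since the feature-gated detection logic is not reproduced here.

```rust
/// Hypothetical stand-in for a 32-byte hash.
type Hash = [u8; 32];

/// Simplified version tag; the real descriptor derives this from its fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Version {
    V1,
    V2,
    V3,
}

/// Reduced descriptor with only the fields needed for this sketch.
struct Descriptor {
    version: Version,
    relay_parent: Hash,
    /// Only meaningful when `version` is `V3` (hypothetical field name).
    explicit_scheduling_parent: Hash,
}

impl Descriptor {
    /// Scheduling parent: explicit for V3, implicitly the relay parent otherwise.
    fn scheduling_parent(&self) -> Hash {
        match self.version {
            Version::V3 => self.explicit_scheduling_parent,
            Version::V1 | Version::V2 => self.relay_parent,
        }
    }
}
```

Callers that key state by scheduling parent can use one accessor for every descriptor version, which keeps V1/V2 behavior unchanged.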
Key Changes
Primitives (`polkadot/primitives/src/v9/mod.rs`)
- `CandidateDescriptorVersion::V3`: New enum variant for version detection
- `CandidateDescriptorV2::new_v3()`: Constructor accepting explicit `scheduling_parent` parameter
- `scheduling_parent(v3_enabled: bool) -> Hash`: Accessor returning scheduling_parent for V3, relay_parent for V1/V2
- `scheduling_session(v3_enabled: bool) -> Option<SessionIndex>`: Accessor for scheduling session (offset-based for V3)
- `CandidateReceiptV3` node feature) to distinguish V1 from V3 using reserved fields
PVF Extension (`polkadot/parachain/src/primitives.rs`)
- `ValidationParamsExtension::V3`: New extension type containing both `relay_parent` and `scheduling_parent` hashes
- `TrailingOption<T>`: Backward-compatible wrapper that decodes `T` from trailing bytes if present, or `None` if at EOF
Subsystem Messages (`polkadot/node/subsystem-types/src/messages.rs`)
- `BackableCandidateRef`: New struct containing `candidate_hash` and `scheduling_parent` (replaces bare `CandidateHash`)
- `CandidateBackingMessage::Second`: Now includes explicit `scheduling_parent: Hash` parameter
- `CandidateBackingMessage::Statement`: Now includes explicit `scheduling_parent: Hash` parameter
- `CandidateBackingMessage::GetBackableCandidates`: Uses `Vec<BackableCandidateRef>` instead of `Vec<CandidateHash>`
- `CanSecondRequest`: Includes `candidate_scheduling_parent` field for validator group lookup
Core Subsystems
Candidate Backing (`polkadot/node/core/backing/`)
- `scheduling_parent` (not `relay_parent`) for validator group lookups
Candidate Validation (`polkadot/node/core/candidate-validation/`)
- `ValidationParamsExtension::V3` bytes to PVF validation input
- `TrailingOption` pattern
Prospective Parachains (`polkadot/node/core/prospective-parachains/`)
- `scheduling_parent` hash
- `scheduling_parent` is in active leaves before accepting candidates
Network Protocol
Collator Protocol (`polkadot/node/network/collator-protocol/`)
- `PendingCollation` and `FetchedCollation` now include `scheduling_parent` field
- `scheduling_parent` matches fetched descriptor's actual `scheduling_parent(v3_enabled)`
- `scheduling_parent` is an active leaf before accepting collations
Statement Distribution (`polkadot/node/network/statement-distribution/src/v2/`)
- `PerRelayParentState` → `PerSchedulingParentState`
- `per_relay_parent` map → `per_scheduling_parent`
- `scheduling_parent` as key for state lookups (validator groups, candidate tracking)
- `relay_parent` serves dual role for V1/V2
Test Infrastructure
- `CandidateReceiptV2` node feature checks (V2 now enabled everywhere)
- `v3_descriptors_are_accepted_when_enabled`: V3 with UMP signals accepted
- `v3_descriptors_without_ump_signals_are_rejected`: V3 without UMP signals rejected
- `v3_descriptors_rejected_as_v1_when_disabled`: V3 rejected as V1 when feature disabled
Backward Compatibility
Multiple layers of protection ensure safe gradual rollout:
1. Node Feature Gating: V3 only recognized when `CandidateReceiptV3` node feature enabled (requires 2/3+ validator upgrade)
2. Mandatory UMP Signals: V3 candidates MUST include UMP signals (`SelectCore` at minimum)
3. TrailingOption Pattern: PVF extension gracefully handled
4. Version Detection: Backwards compatible logic distinguishes V1 from V3
- `version == 1` (vs V2's `version == 0`)
5. Runtime Protection: Runtime drops candidates violating version-specific rules
Review Focus
High Priority - Correctness
- Version detection logic (`polkadot/primitives/src/v9/mod.rs:CandidateDescriptorV2::version()`)
- TrailingOption safety (`polkadot/parachain/src/primitives.rs`, `polkadot/node/core/pvf/execute-worker/`)
- Scheduling_parent validation (`polkadot/node/network/collator-protocol/`, `polkadot/node/core/backing/`)
- UMP signal enforcement (`polkadot/runtime/parachains/src/paras_inherent/mod.rs`)
Medium Priority - Architecture
- Message flow (`polkadot/node/subsystem-types/src/messages.rs`)
  - `scheduling_parent` correctly threaded through subsystem messages
  - `BackableCandidateRef` used consistently
- State tracking (`polkadot/node/core/backing/`, `polkadot/node/network/statement-distribution/`)
- PVF extension encoding/decoding (`polkadot/node/core/candidate-validation/`)
Lower Priority - Cleanup
- `paras_inherent/tests.rs`)
- `per_relay_parent` → `per_scheduling_parent`)
- `builder.rs`)
CI Coverage
CI verifies:
Critical Invariants
- `scheduling_parent`, never `relay_parent` (for V3)
- `scheduling_parent` must be in validator's active leaves
- `SelectCore` signal matches claim queue assignment
- `TrailingOption` decodes as None
Related
- `rk-prospective-parachains-cleanup`) - used for separate review
Scope: ~4,500 lines added, ~2,350 lines removed across 58 files