
feat: sealevel composite ism #8528

Open
troykessler wants to merge 54 commits into main from feat/sealevel-composite-ism

Contributor

@troykessler troykessler commented Apr 7, 2026

Scope

This PR adds a new composite ISM, a few relayer changes to interact with the new ISM type, and composite deploy and read support in the Solana Rust CLI.

Description

A single Solana program that stores an arbitrary ISM config tree in one PDA account and verifies messages via pure in-program recursive dispatch: no CPI calls to sub-ISM programs, and therefore no CPI depth limit.

Why not just CPI into existing ISM programs?
Solana enforces a maximum CPI depth of 4. A real-world Aggregation → Routing → MultisigMessageId tree already hits the limit. More complex configs (e.g. Ethereum mainnet's Aggregation(Pausable, Routing → Aggregation(MerkleRoot, MessageId))) require depth 5+. The composite ISM avoids this entirely by keeping all verification in-program.

Tree structure example (mirrors Ethereum mainnet defaultIsm):

VAM PDA
└── Aggregation(threshold=1, [
      Pausable { paused: false },
      Routing { default_ism: None }
    ])

domain PDA (origin=8453)              domain PDA (origin=42161)
└── Aggregation(threshold=1, [        └── Aggregation(threshold=1, [
      MultisigMessageId { ... },            MultisigMessageId { ... },
      Pausable { paused: false }            Pausable { paused: false }
    ])                                    ])

Where to start reading

  1. src/accounts.rs: IsmNode enum, storage types, PDA derivation helpers
  2. src/verify.rs: recursive verify_node dispatcher
  3. src/account_metas.rs: VerifyAccountMetas handler (how the relayer discovers which accounts to pass)
  4. src/processor.rs: instruction entrypoints

Supported ISM node types:

| Variant | Behavior |
|---|---|
| TrustedRelayer | Checks the relayer is a transaction signer |
| MultisigMessageId | Inline ECDSA threshold multisig, with no delegation to the multisig-ism program |
| Aggregation | m-of-n: at least threshold sub-ISMs must have metadata and pass |
| Routing | Dispatches to a per-domain PDA by message.origin |
| AmountRouting | Routes to lower/upper sub-ISM based on token message amount |
| RateLimited | Rolling 24h transfer cap; state persists in the PDA |
| Pausable | Emergency circuit breaker |
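
The variant list above suggests a recursive enum walked by a single dispatcher. The following is a minimal sketch of that idea, not the PR's code: `Node`, `Meta`, and `verify` are hypothetical names, only three node kinds are modelled, accounts are ignored, and it assumes that a sub-ISM with metadata that fails is simply not counted toward the threshold (the real program may reject outright).

```rust
/// Simplified stand-in for the PR's `IsmNode`; the field layout is assumed.
#[derive(Debug, Clone)]
pub enum Node {
    /// Trivial accept/reject leaf, standing in for any leaf ISM.
    Test { accept: bool },
    /// Emergency circuit breaker.
    Pausable { paused: bool },
    /// m-of-n over the sub-ISMs.
    Aggregation { threshold: u8, sub_isms: Vec<Node> },
}

/// Metadata shaped like the tree. `None` inside an aggregation means the
/// relayer supplied nothing for that sub-ISM.
#[derive(Debug, Clone)]
pub enum Meta {
    Leaf,
    Aggregation(Vec<Option<Meta>>),
}

/// Recursive in-program dispatch: one function, no cross-program calls.
pub fn verify(node: &Node, meta: &Meta) -> bool {
    match (node, meta) {
        (Node::Test { accept }, Meta::Leaf) => *accept,
        (Node::Pausable { paused }, Meta::Leaf) => !*paused,
        (Node::Aggregation { threshold, sub_isms }, Meta::Aggregation(subs)) => {
            if subs.len() != sub_isms.len() {
                return false;
            }
            let mut passed = 0usize;
            for (sub, sub_meta) in sub_isms.iter().zip(subs.iter()) {
                // Only sub-ISMs that have metadata AND pass are counted.
                if let Some(sub_meta) = sub_meta {
                    if verify(sub, sub_meta) {
                        passed += 1;
                    }
                }
            }
            passed >= usize::from(*threshold)
        }
        // Metadata shape does not match the node shape.
        _ => false,
    }
}
```

With `Aggregation(threshold=1, [Pausable, Test])`, supplying metadata for only the unpaused `Pausable` child is enough to pass in this sketch.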

Key design decisions

MultisigMessageId is self-contained. ECDSA verification is implemented inline rather than delegating via CPI to the existing multisig-ism-message-id program. The node stores validators: Vec<H160>, threshold: u8 directly. Domain routing is handled externally by a Routing node — the multisig node is domain-agnostic.

Routing uses per-domain PDAs. Storing all domain→ISM mappings inline would OOM at ~100 domains (the SVM heap defaults to 32 KB and can be raised to at most 256 KB). Instead, each domain's ISM is stored in its own PDA at seeds [b"domain_ism", &domain.to_le_bytes()]. Only the single domain PDA for the incoming message's origin is loaded at verify time, so heap use is O(1) regardless of domain count.
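
To make the seed layout concrete, here is a small sketch that only builds the seed bytes quoted above. The helper name `domain_ism_seeds` is hypothetical, and `u32` for the domain is an assumption based on Hyperlane domain IDs being 32-bit.

```rust
/// Seed prefix quoted in the PR description.
const DOMAIN_ISM_SEED: &[u8] = b"domain_ism";

/// Builds the seed list `[b"domain_ism", &domain.to_le_bytes()]`.
/// The `u32` domain type is an assumption, not taken from the PR.
pub fn domain_ism_seeds(domain: u32) -> [Vec<u8>; 2] {
    [DOMAIN_ISM_SEED.to_vec(), domain.to_le_bytes().to_vec()]
}
```

For origin 8453 the second seed is the little-endian bytes `05 21 00 00`. In the program these seeds would presumably be passed to `Pubkey::find_program_address` together with the composite ISM program ID.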

Two-pass VerifyAccountMetas for Routing. The relayer calls VerifyAccountMetas (simulation-only) to learn which accounts to include in the Verify transaction. For Routing nodes, sub-accounts (e.g. a TrustedRelayer's signer pubkey) are only discoverable after reading the domain PDA — which itself must be in the account list first. The relayer loops until the account set converges:

pass 1: [storage_pda]              → [storage_pda, domain_pda]
pass 2: [storage_pda, domain_pda]  → [storage_pda, domain_pda]  ← stable

This fixpoint loop is implemented in chains/hyperlane-sealevel/src/mailbox.rs.
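
The convergence loop described above can be sketched generically as follows. This is not the code in mailbox.rs: `resolve_account_metas` is a hypothetical name, accounts are modelled as plain strings, `simulate` stands in for one VerifyAccountMetas simulation, and the `max_passes` guard is an added assumption to bound a program that never converges.

```rust
/// Calls `simulate` with the current account list until the returned list
/// stops changing, then returns the stable list. Returns `None` if no
/// fixpoint is reached within `max_passes` simulations.
pub fn resolve_account_metas<F>(
    initial: Vec<String>,
    max_passes: usize,
    mut simulate: F,
) -> Option<Vec<String>>
where
    F: FnMut(&[String]) -> Vec<String>,
{
    let mut current = initial;
    for _ in 0..max_passes {
        let next = simulate(&current);
        if next == current {
            // Same input and output: the account set has converged.
            return Some(current);
        }
        current = next;
    }
    None
}
```

Starting from `[storage_pda]` with a simulation that appends `domain_pda` when it is missing, the loop stops on the second pass, matching the two-pass trace above.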

RateLimited inside domain PDAs. State (filled_level, last_updated) lives inline in the IsmNode. After verification, the updated node is written back to the domain PDA if it is marked writable. VerifyAccountMetas detects RateLimited in a domain PDA and marks it writable automatically.
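
A rough sketch of how a rolling 24h cap with (filled_level, last_updated) state could behave. The refill rule is an assumption: this version refills linearly over the window and treats filled_level as the remaining capacity, and `max_capacity`, `available`, and `consume` are hypothetical names. The PR's actual accounting may differ.

```rust
const WINDOW_SECS: u64 = 24 * 60 * 60;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RateLimited {
    /// Cap per 24h window (hypothetical field name).
    pub max_capacity: u64,
    /// Remaining capacity as of `last_updated` (assumed meaning).
    pub filled_level: u64,
    /// Unix timestamp in seconds of the last state change.
    pub last_updated: u64,
}

impl RateLimited {
    /// Capacity available at `now`, assuming linear refill over 24h.
    pub fn available(&self, now: u64) -> u64 {
        let elapsed = now.saturating_sub(self.last_updated).min(WINDOW_SECS);
        let refill =
            (self.max_capacity as u128 * elapsed as u128 / WINDOW_SECS as u128) as u64;
        self.filled_level.saturating_add(refill).min(self.max_capacity)
    }

    /// Deducts `amount` and records the new state, or leaves the state
    /// untouched and returns an error when the cap would be exceeded.
    pub fn consume(&mut self, amount: u64, now: u64) -> Result<(), &'static str> {
        let available = self.available(now);
        if amount > available {
            return Err("rate limit exceeded");
        }
        self.filled_level = available - amount;
        self.last_updated = now;
        Ok(())
    }
}
```

Because `consume` mutates the node, the account holding it has to be writable for the new state to persist, which is why the domain PDA is marked writable when it contains a RateLimited node.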


Limitations

  • At most one Routing node per deployment. Rejected at UpdateConfig time (MultipleRoutingNodes error).
  • No Routing inside a domain PDA. SetDomainIsm rejects trees containing Routing (RoutingInDomainIsm error). Domain PDAs are leaves.
  • No cross-program delegation. There is no IsmNode variant that calls an external ISM program. The full tree must fit in one deployment.
  • No MultisigMerkleRoot variant. The MerkleRoot multisig requires a merkle proof in the verify metadata so the on-chain program can check the message ID against the signed tree root. Hyperlane uses a fixed-depth-32 incremental Merkle tree, so the proof is always 32 × 32 = 1024 bytes. Combined with the 68-byte header, the metadata is already 1092 bytes before any signature, which leaves only 140 bytes of the 1232-byte Solana packet limit for signatures and all transaction overhead. In practice that is not enough for even one signature: threshold=1 produces 1157 bytes of metadata, and threshold=3 produces 1287 bytes (over the limit before tx overhead). MultisigMessageId sidesteps this entirely: validators sign (checkpoint, message_id) and the on-chain program recomputes message_id = hash(message) from the message body already present in the transaction.
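
The MerkleRoot size argument in the last bullet is plain arithmetic and can be checked with a few constants. The function name below is hypothetical; the byte counts are the ones quoted in the bullet.

```rust
/// Solana packet limit quoted in the PR.
pub const PACKET_LIMIT: usize = 1232;
/// Header size quoted in the PR for MerkleRoot metadata.
pub const HEADER_LEN: usize = 68;
/// Fixed-depth-32 proof: 32 siblings of 32 bytes each.
pub const PROOF_LEN: usize = 32 * 32;
/// One ECDSA signature (r, s, v).
pub const SIG_LEN: usize = 65;

/// Metadata length for a MerkleRoot multisig with `threshold` signatures.
pub const fn merkle_root_metadata_len(threshold: usize) -> usize {
    HEADER_LEN + PROOF_LEN + SIG_LEN * threshold
}
```

With one signature the metadata alone is 1157 bytes, leaving 75 bytes of the packet for everything else in the transaction, which is less than the space needed for a transaction signature plus a recent blockhash.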

Solana-specific scale limits

Validated by tests/functional_big_isms.rs, which runs the compiled BPF binary so real constraints apply (32 KB heap, 64-frame call depth, 1.4M CU budget).

| Limit | Constraint | Result |
|---|---|---|
| Compute budget | 1,400,000 CU | 200 domains + Routing + Aggregation(2) + 3 secp256k1 recoveries: ~135k CU (90% headroom) |
| Heap | 32 KB | 50-sub-ISM Aggregation (wide fan-out): ~15k CU, no heap errors |
| Call depth | 64 BPF frames | 16 levels of nested Aggregation: ~9k CU, well within limit |
| UpdateConfig realloc | PDA must stay rent-exempt | Grow from TestAggregation(50): works; payer must pre-fund the extra rent |

Transaction size is the binding constraint for deep multisig trees.

The 1232-byte Solana packet limit is a UDP/network-layer constraint and is not enforced by solana-program-test (neither simulate_transaction nor process_transaction checks packet size). The scale tests assert it explicitly by checking the serialized transaction length.

A 3×3 nested multisig tree (Agg(3)[Agg(3)[MultisigMessageId(3v,3)] ×3] ×3) produces 2463 bytes of verify metadata, which exceeds the 1232-byte limit. The ISM logic still executes correctly under BPF constraints (~1.14M CU), but the Verify transaction cannot be submitted on mainnet as a single packet.

Practical sizing guide for MultisigMessageId metadata (3 sigs, 65 bytes each):

  • Single-node: 263 bytes (32 + 32 + 4 + 65×3)
  • Agg(3)[MultisigMessageId(3v,3) ×3]: 813 bytes — fits in one tx
  • Agg(3)[Agg(3)[MultisigMessageId(3v,3)] ×3] ×3: 2463 bytes — requires chunked delivery
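
The three figures in the sizing guide can be reproduced with a small recursive size function. This is a sketch under one assumption inferred from the quoted totals: each Aggregation level adds 8 bytes per sub-ISM (taken here to be a pair of u32 start/end offsets). `Shape` and `metadata_len` are hypothetical names.

```rust
/// One ECDSA signature (r, s, v).
const SIG_LEN: usize = 65;
/// The 32 + 32 + 4 bytes that precede the signatures in the sizing guide.
const MESSAGE_ID_HEADER_LEN: usize = 32 + 32 + 4;
/// Assumed per-sub-ISM overhead in Aggregation metadata (two u32 offsets).
const AGG_RANGE_LEN: usize = 8;

#[derive(Debug, Clone)]
pub enum Shape {
    MultisigMessageId { signatures: usize },
    Aggregation(Vec<Shape>),
}

/// Total verify-metadata length for a tree in which every sub-ISM is supplied.
pub fn metadata_len(shape: &Shape) -> usize {
    match shape {
        Shape::MultisigMessageId { signatures } => {
            MESSAGE_ID_HEADER_LEN + SIG_LEN * signatures
        }
        Shape::Aggregation(subs) => {
            subs.len() * AGG_RANGE_LEN + subs.iter().map(metadata_len).sum::<usize>()
        }
    }
}
```

Under that assumption the sizes come out as 263, 3 × 263 + 24 = 813, and 3 × 813 + 24 = 2463 bytes, matching the list above.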

Testing

| File | What it covers |
|---|---|
| tests/functional_composite.rs | Program lifecycle: init, UpdateConfig, TransferOwnership, GetOwner, Type |
| tests/functional_multisig.rs | Inline ECDSA multisig: valid/invalid signatures, threshold enforcement, account metas |
| tests/functional_aggregation.rs | m-of-n threshold, partial metadata, sub-ISM rejection, account meta union |
| tests/functional_routing.rs | Per-domain PDA create/update/close, default ISM fallback, O(1) dispatch |
| tests/functional_trusted_relayer.rs | Signer check, invalid relayer, is_signer=false rejection, VAM returns signer account |
| tests/functional_pausable.rs | Unpaused accepts, paused rejects, no extra accounts |
| tests/functional_rate_limited.rs | Capacity decrement, exhaustion, partial/full 24h refill |
| tests/functional_amount_routing.rs | Lower/upper branching, boundary conditions, short body rejection |
| tests/functional_test_ism.rs | Trivial accept/reject baseline |
| tests/functional_nested.rs | Every ISM type nested inside a Routing domain PDA (exercises the full two-pass VAM + verify path); RateLimited state persistence across transactions |
| tests/functional_big_isms.rs | BPF scale tests: 200 domains, deep nesting, wide aggregation, multisig compute budget, metadata tx size, large domain ISMs; run with compiled .so binary for real BPF constraints |
| run-locally/sealevel/composite_ism.rs | Full e2e with real relayer + validator against the same tree structure as Ethereum mainnet defaultIsm |

Backward compatibility

No existing on-chain deployments are affected. The IsmNode Borsh encoding is new — no migration needed.


Collaborator

@paulbalaji paulbalaji left a comment

Follow-up Review (post-fix commits)

The fix commits at 4c801d7 addressed all previously raised issues — nice work. Two new items remain.

Prior Issues — All Resolved

Confirmed fixed: read-only domain-PDA rate-limit bypass, getter/handle identity leak (now passes None with clear comment at line 345-347), verify-account-meta dedup mismatch, duplicate validator validation, CLI RateLimited round-trip, Routing.default_ism fallback metas, and missing system-program validation in set_domain_ism.

Comment on lines +577 to 590
let additional_signers: Vec<&SealevelKeypair> = self
    .get_signer_if_separate()
    .filter(|signer| {
        payload
            .instruction
            .accounts
            .iter()
            .any(|meta| meta.pubkey == signer.pubkey() && meta.is_signer)
    })
    .into_iter()
    .collect();
let tx = self
    .provider
    .build_estimated_tx_for_instruction(
Collaborator

HIGH: Relayer co-signs verify for any recipient-selected ISM

The getter/handle identity leak is fixed (line 345 now passes None), but the verify path still exposes the identity to any ISM the recipient picks:

  1. get_process_payload() (line ~347) calls get_recipient_ism() — the recipient controls which ISM address is returned
  2. get_ism_verify_account_metas() (line 254) passes identity to sanitize_dynamic_accounts for that ISM, preserving signer authority
  3. Here at line 577-590, process() adds the identity as an additional_signer if it appears signer-marked in the final instruction

A malicious recipient can point at a custom ISM program whose VerifyAccountMetas returns the relayer identity as a signer. The relayer then co-signs the process tx, and the malicious ISM (reached via mailbox CPI) could transfer SOL out of the identity account.

Fix options:

  • Only pass identity for ISMs that return a known ModuleType (e.g. Composite), since TrustedRelayer only exists in the composite ISM program
  • Allowlist specific ISM program IDs that may request the identity signer
  • Only co-sign if the ISM program ID matches a configured composite ISM address
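
As an illustration of the allowlist option, the co-sign decision could be gated roughly as below. This is a sketch with stand-in types, not the relayer's code: `ProgramId`, `AccountMeta`, and `should_cosign` are hypothetical and do not use the Solana SDK types.

```rust
use std::collections::HashSet;

/// Stand-in for a 32-byte program or account address.
pub type ProgramId = [u8; 32];

/// Stand-in for the relevant fields of an instruction account meta.
pub struct AccountMeta {
    pub pubkey: [u8; 32],
    pub is_signer: bool,
}

/// Co-sign with the identity key only when the recipient's ISM program is
/// allowlisted AND the final instruction marks the identity as a signer.
pub fn should_cosign(
    ism_program: &ProgramId,
    allowlist: &HashSet<ProgramId>,
    identity: &[u8; 32],
    accounts: &[AccountMeta],
) -> bool {
    allowlist.contains(ism_program)
        && accounts
            .iter()
            .any(|meta| &meta.pubkey == identity && meta.is_signer)
}
```

An ISM program outside the allowlist would then never receive the identity signature, regardless of what its VerifyAccountMetas returns.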

Contributor Author

@troykessler troykessler Apr 10, 2026

actually, as I'm thinking about it, we don't need this. The identity keypair is an additional keypair next to the actual fee payer keypair; it holds no funds at all and simply signs to prove the relayer identity. So even if a malicious ISM obtains a co-signature, it cannot steal funds or do anything else with it. Added more detailed comments in the code regarding this

Contributor Author

would be interested in @tkporter's opinion here

codecov Bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.23%. Comparing base (cd69856) to head (206192b).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8528   +/-   ##
=======================================
  Coverage   79.23%   79.23%           
=======================================
  Files         143      143           
  Lines        4242     4242           
  Branches      428      428           
=======================================
  Hits         3361     3361           
  Misses        853      853           
  Partials       28       28           
| Components | Coverage Δ |
|---|---|
| core | 87.80% <ø> (ø) |
| hooks | 78.11% <ø> (ø) |
| isms | 81.46% <ø> (ø) |
| token | 87.98% <ø> (ø) |
| middlewares | 87.76% <ø> (ø) |

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

Comment on lines +196 to +223
let mut packed_count = 0usize;
let mut final_results: Vec<Option<Vec<u8>>> = vec![None; ism_count];

for (i, (is_null, result)) in sub_results.iter().enumerate() {
    if packed_count >= threshold_usize {
        break;
    }
    if !is_null {
        if let Some(bytes) = result {
            final_results[i] = Some(bytes.clone());
            packed_count = packed_count.saturating_add(1);
        }
    }
}

if packed_count < threshold_usize {
    for (i, (is_null, result)) in sub_results.iter().enumerate() {
        if packed_count >= threshold_usize {
            break;
        }
        if *is_null {
            if let Some(bytes) = result {
                final_results[i] = Some(bytes.clone());
                packed_count = packed_count.saturating_add(1);
            }
        }
    }
}
Contributor

🚩 Aggregation metadata builder may select a failing Null sub-ISM over a passing one

In sealevel_composite.rs:196-223, the aggregation metadata builder packs Null sub-ISMs as fallback padding when non-Null sub-ISMs don't meet threshold. It iterates sub_results in order and picks the first Null sub-ISMs it finds. Since MetadataSpec::Null carries no state (paused/unpaused, accept/reject), the relayer cannot distinguish a Pausable{paused:true} from a Pausable{paused:false}. If the first Null sub-ISM in iteration order happens to be paused, the relayer packs it instead of the unpaused one, causing on-chain verification to fail even though a valid combination exists.

This only matters when: (1) threshold < sub_isms.len(), (2) non-Null successes < threshold, and (3) multiple Null sub-ISMs exist with different verify outcomes. In practice this is rare (Pausable is an emergency circuit breaker, rarely paused), and the relayer has no way to resolve it without a richer MetadataSpec. Not a code bug but a design limitation worth documenting.
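
The ordering effect can be reproduced with a standalone version of the two-pass selection. This is a sketch modelled on the excerpt above: `pack_sub_results` is a hypothetical name, and representing a Null sub-ISM's result as `Some` of an empty byte vector is an assumption.

```rust
/// Each entry is `(is_null, result)`. Non-Null results are packed first;
/// Null results are used only as padding until `threshold` is reached.
pub fn pack_sub_results(
    sub_results: &[(bool, Option<Vec<u8>>)],
    threshold: usize,
) -> Vec<Option<Vec<u8>>> {
    let mut packed = 0usize;
    let mut out: Vec<Option<Vec<u8>>> = vec![None; sub_results.len()];
    // Pass 1 takes non-Null entries, pass 2 takes Null entries.
    for want_null in [false, true] {
        for (i, (is_null, result)) in sub_results.iter().enumerate() {
            if packed >= threshold {
                break;
            }
            if *is_null == want_null {
                if let Some(bytes) = result {
                    out[i] = Some(bytes.clone());
                    packed += 1;
                }
            }
        }
    }
    out
}
```

With two Null entries and threshold 1, index 0 is always chosen. If that sub-ISM happens to be a paused Pausable, on-chain verification fails even though index 1 would have passed, because the builder has no outcome information to prefer one over the other.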

  run: cargo test --all-targets --features aleo,integration_test
  working-directory: ./rust/main
- name: Build sealevel SBF programs
  run: cargo build-sbf --manifest-path programs/ism/composite-ism/Cargo.toml
Member

is this intentional? do we not build this anywhere else in CI?

I would think that in the sealevel e2e tests we would at least

fwiw I'd be comfortable without this in CI and we can try to merge #8064 instead as a path for testing this

Contributor Author

will review your pr regarding this

/// `accounts_iter` is advanced for nodes that require on-chain accounts:
/// - `TrustedRelayer`: pops the relayer signer account.
/// - `Routing`: pops the domain PDA account, then may pop sub-accounts.
pub(crate) fn verify_node<'a, 'b, I>(
Member

I think it'd be really nice if we were able to have an ISM that does a CPI into the Mailbox's default ISM. We do this in a bunch of places and it'd be nice to not need to express a copy of the default ISM in here. What would be the lift to do this? I assume to get the account metas things can get possibly hairy?

I think CPI-depth wise it should be okay:

Mailbox = depth 1
Custom Composite ISM = depth 2
Default ISM = depth 3

We'd want to read the default ISM from the Mailbox's Inbox PDA (instead of making an unnecessary CPI into the Mailbox to get the default ISM)

Contributor Author

so in evm we only fall back to the mailbox default ism in the fallback routing ism, would we do the same here? then I would probably remove the default ism from the current routing ism and introduce a new ism type FallbackRouting so we have the exact same setup as evm. Will look into this, but yeah VAM will be tricky

Member

Yeah I think following the behavior of DefaultFallbackRoutingIsm, but also having a normal DomainRoutingIsm behavior would be required

so maybe have the routing ISM have a configurable boolean of whether to fall back to the mailbox default ISM?

Contributor Author

came up with this solution now, we would have a normal RoutingIsm and a FallbackRoutingIsm, only restriction is that the fallback ism here has to also be a composite ism, else it is super difficult with the account metas

#8644

@tkporter
Member

still reviewing but sending this off for now!

Contributor

hyper-gonk Bot commented Apr 21, 2026

🦀 Rust Agent Docker Image Built Successfully

Service Tag
agent 137ac33-20260421-120235
Full image paths
ghcr.io/hyperlane-xyz/hyperlane-agent:137ac33-20260421-120235
