Conversation

> **Note — Reviews paused.** This branch is under active development, so CodeRabbit has automatically paused this review to avoid overwhelming you with comments from an influx of new commits. This behavior is configurable in the settings.
📝 **Walkthrough**

Adds a Relay API (Axum handlers, extractor, metrics) with rate-limiting and tx deduplication, integrates it into the relayer/server wiring, extends the `Indexer` trait with `parse_tx_hash` plus chain-specific parsers and tests, updates cargo/workspace deps, and adds a relay benchmark script and local config.
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant Client as Client
    participant Handler as RelayHandler
    participant RL as RateLimiter
    participant Cache as TxHashCache
    participant Extractor as Extractor
    participant Indexer as ChainIndexer
    participant Classifier as ISMClassifier
    participant DB as OriginDB
    participant Queue as SendQueue
    Client->>Handler: POST /relay (origin_chain, tx_hash)
    Handler->>RL: check()
    RL-->>Handler: allowed/denied
    Handler->>Cache: check_and_insert(chain, tx_hash)
    Cache-->>Handler: ok/duplicate
    Handler->>Extractor: extract_messages(indexers, chain, tx_hash)
    Extractor->>Indexer: parse_tx_hash(tx_hash)
    Indexer-->>Extractor: H512
    Extractor->>Indexer: fetch_logs(parsed_hash)
    Indexer-->>Extractor: [HyperlaneMessage]
    Extractor-->>Handler: ExtractedMessage[]
    Handler->>Classifier: classify(recipient)
    Classifier-->>Handler: MessageContext
    Handler->>DB: persist message + mappings
    DB-->>Handler: ok
    Handler->>Queue: enqueue PendingMessage (dest_domain)
    Queue-->>Handler: queued
    Handler-->>Client: 200 OK (messages info)
```
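The dedup step in the diagram (`Cache: check_and_insert`) can be sketched as a TTL map. This is an illustrative stdlib-only sketch, not the PR's actual `TxHashCache` — the names, key shape, and timestamp injection are assumptions:

```rust
use std::collections::HashMap;

/// Illustrative dedup cache: keys are (chain, tx_hash) pairs, values are
/// insert timestamps in seconds. `now` is injected for testability; the
/// real implementation presumably uses a monotonic clock.
struct TxHashCache {
    entries: HashMap<(String, String), u64>,
    ttl_secs: u64,
}

impl TxHashCache {
    fn new(ttl_secs: u64) -> Self {
        Self { entries: HashMap::new(), ttl_secs }
    }

    /// Returns true if the tx was newly inserted, false if it is a
    /// duplicate still inside the TTL window.
    fn check_and_insert(&mut self, chain: &str, tx_hash: &str, now: u64) -> bool {
        let ttl = self.ttl_secs;
        // Drop expired entries first so a re-relay after TTL is allowed.
        self.entries.retain(|_, &mut t| now.saturating_sub(t) < ttl);
        let key = (chain.to_string(), tx_hash.to_string());
        if self.entries.contains_key(&key) {
            return false; // duplicate within TTL
        }
        self.entries.insert(key, now);
        true
    }
}
```

A duplicate submission inside the TTL is rejected; the same (chain, tx_hash) pair becomes acceptable again once its entry expires.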
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
**Codecov Report** — ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main    #8364   +/-   ##
=======================================
  Coverage   76.49%   76.49%
=======================================
  Files         128      128
  Lines        3416     3416
  Branches      290      290
=======================================
  Hits         2613     2613
  Misses        786      786
  Partials       17       17
```
🧹 Nitpick comments (1)

rust/main/agents/relayer/src/relay_api/handlers.rs (1)

**208-240: Rate limiter works, but there's room in me swamp for a wee improvement.**

The sliding-window approach is functionally correct, but `Vec` with `retain()` is O(n) on each check. For higher-throughput scenarios, `VecDeque` with front-popping would be more efficient, though this is likely fine for the expected API load.

♻️ Optional: Use `VecDeque` for O(1) cleanup
```diff
+use std::collections::VecDeque;
+
 pub struct RateLimiter {
-    requests: Vec<u64>,
+    requests: VecDeque<u64>,
     max_requests: usize,
     window_secs: u64,
 }

 impl RateLimiter {
     pub fn new(max_requests: usize, window_secs: u64) -> Self {
         Self {
-            requests: Vec::new(),
+            requests: VecDeque::new(),
             max_requests,
             window_secs,
         }
     }

     pub fn check(&mut self) -> bool {
         let now = SystemTime::now()
             .duration_since(UNIX_EPOCH)
             .unwrap()
             .as_secs();

-        // Remove requests outside the window
-        self.requests.retain(|&t| now - t < self.window_secs);
+        // Remove requests outside the window (from front)
+        while let Some(&oldest) = self.requests.front() {
+            if now - oldest >= self.window_secs {
+                self.requests.pop_front();
+            } else {
+                break;
+            }
+        }

         if self.requests.len() >= self.max_requests {
             return false;
         }

-        self.requests.push(now);
+        self.requests.push_back(now);
         true
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/main/agents/relayer/src/relay_api/handlers.rs` around lines 208 - 240, Replace the Vec-based timestamp storage in struct RateLimiter with a VecDeque<u64> to avoid O(n) retain() costs; update RateLimiter::new to initialize requests as a VecDeque, and change RateLimiter::check to remove expired timestamps by popping from the front in a loop (or while-let) until the front timestamp is within window_secs, then check length against max_requests and push_back(now) when accepting a request; ensure you update imports (use std::collections::VecDeque) and adjust references to requests.push and requests.retain to use push_back and front/pop_front semantics.
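For reference, the `VecDeque` variant suggested above can be exercised standalone. This sketch injects the clock as a parameter for testability (the real code reads `SystemTime`), so the method name `check_at` is an assumption — the shape of the eviction loop is the point:

```rust
use std::collections::VecDeque;

/// Sliding-window limiter: allow at most `max_requests` in any
/// `window_secs`-second window. Timestamps are injected for testability.
pub struct RateLimiter {
    requests: VecDeque<u64>,
    max_requests: usize,
    window_secs: u64,
}

impl RateLimiter {
    pub fn new(max_requests: usize, window_secs: u64) -> Self {
        Self { requests: VecDeque::new(), max_requests, window_secs }
    }

    pub fn check_at(&mut self, now: u64) -> bool {
        // Timestamps are pushed in order, so expired entries are all at the
        // front: popping from the front is O(1) per evicted entry.
        while let Some(&oldest) = self.requests.front() {
            if now.saturating_sub(oldest) >= self.window_secs {
                self.requests.pop_front();
            } else {
                break;
            }
        }
        if self.requests.len() >= self.max_requests {
            return false;
        }
        self.requests.push_back(now);
        true
    }
}
```

`saturating_sub` also sidesteps the `now − t` underflow on clock skew mentioned later in this thread.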
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c3087b0f-fc3b-4ecf-b066-2258e88baaea
📒 Files selected for processing (1)
rust/main/agents/relayer/src/relay_api/handlers.rs
♻️ Duplicate comments (1)
rust/main/agents/relayer/src/relay_api/handlers.rs (1)
**329-334: ⚠️ Potential issue | 🟠 Major — All extraction failures become 400 Bad Request; needs differentiation.**

Look, not every failed extraction is the client's fault. Right now, if the indexer RPC is down or there's a transient network blip, this returns a 400 like the client did something wrong. That's not fair to them, is it?

Infrastructure failures should return 5xx (e.g., `ServiceUnavailable`), invalid formats should be 400, and "tx not found" could arguably be 404. Collapsing everything into one error type muddles retry semantics for callers.

🐛 Suggested approach: distinguish error types
```diff
 .map_err(|e| {
     if let Some(ref metrics) = state.metrics {
         metrics.inc_failure("extraction_failed");
     }
-    ServerError::InvalidRequest(format!("Failed to extract messages: {e}"))
+    // Distinguish between client errors and server errors
+    let err_str = e.to_string();
+    if err_str.contains("not found") || err_str.contains("no messages") {
+        ServerError::NotFound
+    } else if err_str.contains("invalid") || err_str.contains("parse") {
+        ServerError::InvalidRequest(format!("Invalid request: {e}"))
+    } else {
+        ServerError::ServiceUnavailable(format!("Extraction service error: {e}"))
+    }
 })?;
```

Alternatively, have `extract_messages` return a typed error enum that maps cleanly to HTTP status codes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/main/agents/relayer/src/relay_api/handlers.rs` around lines 329 - 334, The current map_err in the call that uses extract_messages collapses every failure into ServerError::InvalidRequest (400); change this to distinguish error kinds by updating extract_messages (or the call site) to return/propagate a typed error enum (e.g., ExtractError { InvalidInput, NotFound, RemoteUnavailable, Transient }) or match on existing error variants from the indexer client; then map InvalidInput -> ServerError::InvalidRequest, NotFound -> ServerError::NotFound (404), and RemoteUnavailable/Transient -> ServerError::ServiceUnavailable (5xx) while still incrementing metrics via state.metrics.inc_failure with appropriate labels (e.g., "extraction_invalid", "extraction_not_found", "extraction_unavailable") so callers see correct HTTP semantics and retry signals.
🧹 Nitpick comments (2)
rust/main/agents/relayer/src/relay_api/handlers.rs (2)
**449-457: Error message could be clearer for unsupported destinations.**

The error "No send channel for destination domain" sounds like an internal plumbing problem rather than "this relayer doesn't support that route." A user-friendly message would help callers understand what went wrong without them having to dig through swamp logs.

✨ Suggested clearer error message

```diff
 let send_channel = send_channels
     .get(&extracted.destination_domain)
     .ok_or_else(|| {
         ServerError::InvalidRequest(format!(
-            "No send channel for destination domain {}",
+            "Destination domain {} is not supported by this relayer",
             extracted.destination_domain
         ))
     })?;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/main/agents/relayer/src/relay_api/handlers.rs` around lines 449 - 457, The current ServerError produced when send_channels.get(&extracted.destination_domain) is missing uses a message that sounds like an internal plumbing error; change the message in the ServerError::InvalidRequest to a user-friendly one indicating the destination is unsupported (e.g., "Unsupported destination domain" or "Relayer does not support destination domain") and include the extracted.destination_domain value so callers clearly know the route is not supported; update the error construction in the lookup around send_channels, preserving ServerError::InvalidRequest.
**281-289: Consider validating `tx_hash` for empty string too.**

You're checking `origin_chain` isn't empty, but an empty `tx_hash` would slip through and fail later in extraction with a less helpful error message. Might as well catch it early.

✨ Suggested validation addition

```diff
 // Validate request
 if req.origin_chain.is_empty() {
     if let Some(ref metrics) = state.metrics {
         metrics.inc_failure("invalid_request");
     }
     return Err(ServerError::InvalidRequest(
         "origin_chain cannot be empty".to_string(),
     ));
 }
+
+if req.tx_hash.is_empty() {
+    if let Some(ref metrics) = state.metrics {
+        metrics.inc_failure("invalid_request");
+    }
+    return Err(ServerError::InvalidRequest(
+        "tx_hash cannot be empty".to_string(),
+    ));
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rust/main/agents/relayer/src/relay_api/handlers.rs` around lines 281 - 289, Add the same early validation for req.tx_hash as exists for req.origin_chain: check if req.tx_hash.is_empty(), increment state.metrics.inc_failure("invalid_request") when metrics is Some, and return Err(ServerError::InvalidRequest("tx_hash cannot be empty".to_string())); place this alongside the existing origin_chain check in the handler so empty tx_hash is rejected with a clear error before extraction.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6a035b9f-cbc1-475b-82be-fe4e0ea514c5
📒 Files selected for processing (2)
rust/main/agents/relayer/src/relay_api/handlers.rs
rust/main/agents/relayer/src/relayer/tests.rs
85f0458 to 1a94428 — Compare
paulbalaji left a comment
Re-review on d68b2b6
Prior Devin/CodeRabbit threads are all resolved — nice progress. Three findings remain before approval (one blocker), plus two broader notes.
Good catches to credit:
- Two-pass validate-then-commit refactor in the handler
- Explicit EVM-only scope narrowing with the clear comment at `rust/main/hyperlane-core/src/traits/indexer.rs:78-89`
- HTTP 404 transient detection replacing the earlier substring matching in `ccip_read`
- `fail_fast` mode opt-in, so normal relayer behavior is unchanged
Non-inline notes
Medium — POST /relay has no authentication, and the rate limiter is a single process-global bucket (100/60s). The CCTP-V2 gate narrows the blast radius but each accepted request still drives RPC work on the destination mailbox and Circle's attestation service. Consider an API-key header, per-IP limits, or binding to a private interface in production. Worth documenting the intended deployment model.
Medium — the relay_api module has no unit tests. 655 LoC of handler logic: validate-then-commit, TxHashCache behavior under full/duplicate/ok, rate-limit window semantics, CCTP-V2 gate rejection, fail-fast retry budget, partial-send handling, and the CORS parser. All untested. Given the security-facing nature of this endpoint, coverage would be worth adding before the first deploy.
Net: not blocking on style. Finding #1 (partial-commit + dedup lockout) is the one to resolve before this lands.
paulbalaji left a comment
Consolidated re-review (commit dddf9aea07)
Status: requesting changes. Two high-severity items + three mediums below. This is a follow-up re-review that accounts for prior devin/coderabbit/author exchanges — items already resolved (Box::leak, RwLock poisoning, DB-before-send, ISM-outside-timeout, rate-limit zero validation, empty tx hash, dedup-before-extraction, now−t clock-skew underflow, partial-send duplication, multi-message extraction) are not re-raised, and the devin parse_tx_hash H512 padding concern is independently verified to be a false positive (hyperlane-core's impl_fixed_hash_conversions!(H512, H256) in hyperlane-core/src/types/primitive_types.rs reads bottom-aligned bytes [32..64], which matches the left-pad; the conversion round-trips correctly for EVM hashes). Kudos to the author for catching that.
High severity — blockers
- `fetch_raw_logs_and_meta` signature flip (`Err` → `Ok(None)` on missing receipt) regressed retry semantics for all EVM tx-id indexing, not just `/relay`. See inline on `contracts/utils.rs`.
- `TxHashCache` is a read-then-write TOCTOU: concurrent identical requests all pass the early `contains()` read-lock, amplify phase-1 RPCs, and enqueue duplicate `PendingMessage`s before the write-lock commit. See inline on `relay_api/handlers.rs:610`.
Medium
- CCTP V2 gate is per-tx, not per-message — a tx mixing `DepositForBurn` (V2) with an unrelated `Dispatch` fast-paths the unrelated one. Inline on `relay_api/extractor.rs`.
- TS SDK parity gap: three new `relayApi*` settings keys are absent from `typescript/sdk/src/metadata/agentConfig.ts` and `typescript/infra/src/config/agent/relayer.ts`, despite the parity contract at the top of `settings/mod.rs`. Inline on `settings/mod.rs`.
- No handler-level tests for `/relay`. The coverage in `relayer/tests.rs` only exercises config plumbing — dedupe race, rate limit, CCTP gating, cache-full-after-send, concurrent same-tx, and partial multi-destination send are uncovered. Inline on `relay_api/handlers.rs:327`.
PR body is stale
The description still advertises Solana/Radix/Aleo push-relay support, but the diff ships no chain-specific parse_*_tx_hash functions and /relay is gated to EVM CCTP V2 via is_cctp_v2 in extractor.rs. Please update the description so reviewers and deployers aren't misled.
Observations outside this PR
- `rust/main/hyperlane-base/src/contract_sync/mod.rs:216-220` is the caller that silently drops tx_ids on the new `Ok(None)` path above — this is the blast radius beyond `/relay` for High #1.
- Rate limiter is a single global `RwLock<RateLimiter>` with no per-IP / per-key dimension; combined with the TOCTOU above, an attacker can amplify per-request RPC cost ~22× (2 + 10× `recipient_ism` + 10× `get_app_context`) within the global quota. Not a blocker on its own; consider tightening defaults and parallelizing the per-message RPCs with `join_all`.
paulbalaji left a comment
Delta re-review — HEAD 7e9d7dad88
Follow-up to #8364 (review). I re-read every threaded author reply since that review; nothing currently ends on an unaddressed author response except the two Mediums carried forward below. Non-blocking comment review — a few items worth a second look before merge.
Resolved since the last pass ✅
- High 1 (`fetch_raw_logs_and_meta` retry regression): fixed. `contracts/utils.rs` still returns `Ok(None)`, but `mailbox.rs`, `interchain_gas.rs`, and `merkle_tree_hook.rs` now convert `None` → `Err` inside `call_and_retry_indefinitely`, restoring pre-PR retry semantics for the shared `contract_sync` path.
- High 2 (`TxHashCache` TOCTOU): fixed. `handlers.rs:395-412` reserves the dedup slot under a single write lock before any RPC / send / DB work, with `remove()` rollback on the error path. `test_concurrent_same_tx_only_one_succeeds` exercises the race.
- Medium 5 (no `/relay` tests): largely addressed — 11 tests in `relay_api/tests.rs` cover happy path, dedup within/after TTL, concurrency, rate limit, timeout, CCTP gate (happy), whitelist/blacklist, cache-full-before-send, partial-send release.
- Acknowledged author replies: the H512 padding dispute was a false positive, `max_retries = 3` is documented, and the "does not apply here" reply on `utils.rs:30` is now accurate because the retry wrapper is in the callers. `7e9d7dad88` cleanly reuses the already-built message indexer instead of standing up a second one (`relayer/origin.rs`, `relayer.rs:644-648`).
Still open — please address
- Test vs. code disagree on extraction-timeout status — the new test expects 408, the current code returns 400. See inline on `extractor.rs:45` and `tests.rs:357`.
- CCTP V2 gate is per-tx, not per-message — acknowledged the author's Iris-attestation reasoning, but it addresses spoofed burns, not the mixed-message case. See inline on `extractor.rs:96`.
- TS SDK parity regressed — four Rust-side keys now lack a TS mirror despite the parity contract. See inline on `settings/mod.rs:77`.
- Reservation leak on handler cancellation — minor / defense-in-depth: rollback only runs if `relay_work` returns. See inline on `handlers.rs:415`.
PR description is still stale
The summary still claims push-relay support for Solana, Radix, and Aleo, but the current diff ships no chain-specific parse_*_tx_hash and /relay is CCTP-V2-gated to EVM via extract_messages. Please update so downstream consumers aren't misled.
Observations outside this PR (not blocking)
- `hyperlane-base/src/server/base_server.rs:27-49` uses plain `axum::serve` with no `tower::timeout` / `GracefulShutdown` layer today, which is partly why the cancellation leak above is not an immediate exploit. If that wrapper is ever added, the reservation leak becomes real.
- `rate_limiter.check()` at `handlers.rs` consumes a token before the dedup reservation, so duplicate requests within TTL still burn rate-limit quota. Minor; not a regression.
**CI failure — same issue flagged inline in my last review**

The failing job on run 24719806500 hits the exact mismatch flagged inline: the test expects 408 while the extraction path returns 400.

**Minimal fix**

Surface the inner-timeout case distinctly so the handler can map it to a 408:

1. Give `extract_messages` a typed error:

```rust
use thiserror::Error;

#[derive(Debug, Error)]
pub enum ExtractError {
    #[error("timed out fetching transaction receipt")]
    Timeout,
    #[error(transparent)]
    Other(#[from] eyre::Error),
}

pub async fn extract_messages(
    indexers: &HashMap<String, Arc<dyn Indexer<HyperlaneMessage>>>,
    chain_name: &str,
    tx_hash: &str,
) -> Result<Vec<ExtractedMessage>, ExtractError> {
    // ...
    let messages_with_meta = tokio::time::timeout(
        Duration::from_secs(5),
        indexer.fetch_logs_by_tx_hash(tx_hash_512),
    )
    .await
    .map_err(|_| {
        error!(chain = %chain_name, tx_hash = %tx_hash, "Timed out waiting for transaction receipt");
        ExtractError::Timeout
    })?
    .map_err(|e| ExtractError::Other(eyre!("Failed to fetch transaction logs: {}", e)))?;
    // ...
}
```

Also re-export the new error from the module so the handler can match on it.

2. Map it in the handler:

```rust
.map_err(|e| match e {
    ExtractError::Timeout => {
        state.record_failure("timeout");
        ServerError::RequestTimeout
    }
    ExtractError::Other(err) => {
        state.record_failure("extraction_failed");
        ServerError::InvalidRequest(format!("Failed to extract messages: {err}"))
    }
})?;
```

**Alternative (one-line fix)**

If you'd rather not add an error type, drop the 5 s inner timeout entirely and rely on the 10 s outer timeout; the tradeoff is exactly what the inline comment on that timeout describes.

Not re-posting the four items from the consolidated review above — they still stand as-is after this fix.
```rust
if status == reqwest::StatusCode::NOT_FOUND {
    return Err(MetadataBuildError::AttestationPending);
}
```
🚩 HTTP 404 universally treated as AttestationPending for all CCIP read URLs
At rust/main/agents/relayer/src/msg/metadata/ccip_read/mod.rs:369-371, any HTTP 404 response from any CCIP read URL triggers AttestationPending, causing up to 30 retries at 1-second intervals. The CCIP URLs come from on-chain ISM contracts and could point to any server. While the comment explains this is designed for Circle's attestation service, a misconfigured or non-Circle URL that legitimately returns 404 (e.g., wrong path) will waste ~30 seconds per URL before moving to the next. This is bounded and non-critical, but could be improved by making the 404→pending mapping configurable or checking a response header/body marker specific to Circle's API.
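One way to scope the 404→pending mapping, as the comment suggests, is to require a response marker before treating the 404 as retryable. The marker string and response shape below are hypothetical — Circle's actual API response format would need to be confirmed before relying on this:

```rust
#[derive(Debug, PartialEq)]
enum CcipOutcome {
    /// Retry later — the attestation is not ready yet.
    AttestationPending,
    /// Treat as a hard failure for this URL and move on.
    Fatal(u16),
}

/// Hypothetical classifier: only a 404 whose body carries a pending-style
/// marker is treated as "attestation not ready"; any other 404 (wrong path,
/// non-Circle server) fails fast instead of burning ~30 s of retries.
fn classify_ccip_response(status: u16, body: &str) -> CcipOutcome {
    if status == 404 && body.contains("pending") {
        CcipOutcome::AttestationPending
    } else {
        CcipOutcome::Fatal(status)
    }
}
```

The point of the sketch is the shape of the decision, not the marker itself: bound the retry budget to responses that positively identify themselves as "attestation in progress".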
```rust
let is_cctp_v2 = indexer.is_cctp_v2(tx_hash_512).await.unwrap_or_else(|e| {
    warn!(
        chain = %chain_name,
        tx_hash = %tx_hash,
        error = ?e,
        "Failed to check for CCTP V2 burn event, treating as non-CCTP"
    );
    false
});
```
🔴 Redundant receipt RPC call in is_cctp_v2 silently rejects valid CCTP V2 messages on transient failures
The extract_messages function in extractor.rs:47-68 first calls indexer.fetch_logs_by_tx_hash(tx_hash_512) which internally calls get_transaction_receipt (via call_and_retry_indefinitely). After successfully extracting messages, it then calls indexer.is_cctp_v2(tx_hash_512) at line 96, which makes a second, independent get_transaction_receipt RPC call (rust/main/chains/hyperlane-ethereum/src/contracts/mailbox.rs:213-217). If this second call fails due to a transient RPC error, rate-limiting, or load-balancer routing to a different node, the unwrap_or_else on line 96 silently converts the error to false. The relay API then rejects the valid CCTP V2 message with a misleading "Only EVM CCTP V2 messages are supported" error. This is both a correctness issue (valid messages rejected) and a performance issue (double RPC call for the same receipt).
Prompt for agents
The extract_messages function in extractor.rs makes two separate get_transaction_receipt RPC calls for the same tx hash: once inside fetch_logs_by_tx_hash (line 49) and again inside is_cctp_v2 (line 96). When the second call fails transiently, the unwrap_or_else converts the error to false, causing valid CCTP V2 messages to be rejected with a misleading error.
The fix should combine the receipt fetch into a single RPC call. Options:
1. Add a new method to the Indexer trait (e.g., fetch_logs_and_check_cctp_v2) that returns both the logs and the CCTP V2 status from a single receipt.
2. Have fetch_logs_by_tx_hash return the raw receipt alongside the parsed logs, so is_cctp_v2 can inspect the already-fetched receipt.
3. Add a variant of is_cctp_v2 that accepts a receipt directly instead of a tx hash.
Alternatively, if keeping two calls, the is_cctp_v2 error should be propagated as ExtractError::Failed rather than silently treated as false. A transient RPC failure should not be treated as a definitive non-CCTP-V2 determination.
Relevant files: rust/main/agents/relayer/src/relay_api/extractor.rs (lines 47-104), rust/main/chains/hyperlane-ethereum/src/contracts/mailbox.rs (is_cctp_v2 method at line 202), rust/main/hyperlane-core/src/traits/indexer.rs (is_cctp_v2 trait default at line 88).
```rust
    .route("/relay", post(create_relay))
    .layer(cors)
    .with_state(self)
}
```
🚩 No authentication on relay API endpoint — mitigated by rate limiter, dedup cache, and CCTP V2 gate
The /relay endpoint at handlers.rs:277 has no authentication mechanism. Any client that can reach the server can submit relay requests. The risk is mitigated by: (1) the rate limiter (default 100 req/60s), (2) the dedup TxHashCache (prevents replay within TTL), (3) the CCTP V2 check (only genuine Circle DepositForBurn transactions are accepted), and (4) whitelist/blacklist filtering. Since Circle's attestation service is the ultimate gate (without a genuine attestation, the message can't be delivered), an attacker can cause wasted RPC calls but not actual fund theft. Still, the lack of authentication should be documented and potentially addressed if the relay API is exposed to the public internet.
paulbalaji left a comment
Delta re-review — HEAD 5386d585df
Follow-up to #8364 (review). Re-read all 32 troykessler threaded replies (32 total in-tree). Non-blocking comment review, but two items still need attention before merge — plus a CI fix.
Resolved since last pass ✅
- `test_extraction_timeout` 408 vs 400: fixed via typed `ExtractError::{Timeout, Failed}` enum in `extractor.rs`; the handler in `handlers.rs:492-501` maps `Timeout → ServerError::RequestTimeout` (408). Author confirmed "pass now" at 2026-04-21T11:56Z.
- Reservation leak on handler cancellation: fixed via a `TxHashReservation` RAII guard at `handlers.rs:112-155`, with a switch from `tokio::sync::RwLock<TxHashCache>` → `parking_lot::Mutex<TxHashCache>`. The `Drop` path releases on cancel; `commit()` consumes `self` on success. Clean.
- TS SDK parity: author replied 2026-04-21T11:57Z on `settings/mod.rs:77` with "we dont plan on implementing the relayer api in the ts relayer". Accepted — downgrading to nit below (just update the stale parity-contract comment).
- Prior Highs (`fetch_raw_logs_and_meta` retry regression, `TxHashCache` TOCTOU): still resolved on this head.
Still open — please address
- 🔴 CI lint — 3 clippy errors in `extractor.rs` block `test-rs-run`. Inline below; trivial fix.
- 🔴 Double-RPC in `is_cctp_v2` silently rejects valid CCTP V2 messages on a transient RPC blip (flagged by @devin-ai-integration at 2026-04-22T10:01Z, independently confirmed). The author's older "we dont care" reply at 2026-04-20T14:11Z on `extractor.rs:96` predates this double-RPC finding and the current misleading-400 path. Inline below.
- 🟠 The second eviction pass added in `5386d585df` is a no-op. Inline below.
- Nit: `settings/mod.rs:5` still says "ANY CHANGES HERE NEED TO BE REFLECTED IN THE TYPESCRIPT SDK" — given the author's "not implementing in TS relayer" stance, the comment is now stale. Inline below.
CI (lint failure)
Run 24771827307 fails with 3 clippy uninlined_format_args errors, all in rust/main/agents/relayer/src/relay_api/extractor.rs. Fix recipe in the inline. No other clippy/rustfmt regressions observed.
Overall
Not blocking structurally — most work is done well — but the double-RPC silent-false and the CI lint failure should close before merge. Not re-raising the withdrawn per-tx CCTP gate thread; author has explicitly skipped it.
```rust
) -> Result<Vec<ExtractedMessage>, ExtractError> {
    // Get indexer for chain
    let indexer = indexers.get(chain_name).ok_or_else(|| {
        ExtractError::Failed(format!("Chain not found in registry: {}", chain_name))
```
[CI blocker] Three clippy `uninlined_format_args` errors in this file are blocking `test-rs-run` on CI run 24771827307:

```
error: variables can be used directly in the `format!` string
  --> agents/relayer/src/relay_api/extractor.rs:28:30
  --> agents/relayer/src/relay_api/extractor.rs:40:43
  --> agents/relayer/src/relay_api/extractor.rs:67:30
```
Fix (three 1-line edits in this file):

```diff
-ExtractError::Failed(format!("Chain not found in registry: {}", chain_name))
+ExtractError::Failed(format!("Chain not found in registry: {chain_name}"))

-.map_err(|e| ExtractError::Failed(format!("Invalid tx hash format: {}", e)))?;
+.map_err(|e| ExtractError::Failed(format!("Invalid tx hash format: {e}")))?;

-ExtractError::Failed(format!("Failed to fetch transaction logs: {}", e))
+ExtractError::Failed(format!("Failed to fetch transaction logs: {e}"))
```

`cargo clippy -p relayer --all-targets --features aleo,integration_test -- -D warnings` should then pass locally. (`cargo fix --lib -p relayer --tests` applies all three automatically.)
```rust
// Check once per tx whether this is a CCTP fast transfer.
// Errors are treated as false — the relay API will reject the request below.
let is_cctp_v2 = indexer.is_cctp_v2(tx_hash_512).await.unwrap_or_else(|e| {
```
[High] The `is_cctp_v2` call here performs a second, unretried `get_transaction_receipt` RPC against the origin chain. Trace:

- L49 `indexer.fetch_logs_by_tx_hash(tx_hash_512)` → `mailbox.rs:176-189` → `call_and_retry_indefinitely` around `fetch_raw_logs_and_meta` → one `get_transaction_receipt`. Retried on failure.
- L96 `indexer.is_cctp_v2(tx_hash_512)` → `mailbox.rs:213-217` → direct `self.provider.get_transaction_receipt(...)`. No retry wrapper, no timeout.
- On transient failure, `unwrap_or_else(|_| false)` here (L96-104) silently converts the error to `false` → `handlers.rs:555` returns `ServerError::InvalidRequest("Only EVM CCTP V2 messages are supported via the relay API")` → HTTP 400.
The 400 is actively misleading: it tells a client the tx is ineligible when it actually is CCTP V2 — the second receipt fetch just flaked (common with per-node LB cache inconsistency on busy chains). Clients see "not CCTP V2" and don't retry.
Your older 2026-04-20 14:11Z reply on this thread (we dont care since this will be picked up a few seconds later by the relayer contract indexer) predates this double-RPC finding — the indexer recovery path doesn't fix the misleading response body or the dedup-cache reservation that the client doesn't know to retry past.
Fix options, best to minimum:

- Best: extend the indexer trait so `fetch_logs_by_tx_hash` returns logs + CCTP flag from a single receipt fetch. Saves the second RPC on every request, not just error paths.
- Minimum: propagate the `is_cctp_v2` error as `ExtractError::Failed` (or a new `ExtractError::Transient` → 503) instead of `.unwrap_or_else(|_| false)`. Eliminates the misleading 400.
- Wrap the inner `get_transaction_receipt` in `mailbox.rs::is_cctp_v2` with `call_and_retry_indefinitely`, matching the other call site.
```rust
    }
}

// Enforce max size — run eviction again if still at capacity
```
[Low] The second eviction pass added in `5386d585df` is a no-op.

- First pass (L86-99) runs when `self.cache.len() > self.max_entries * 3 / 4`
- Second pass (L100-105) runs when `self.cache.len() >= self.max_entries`

For any `max_entries >= 1`, whenever the second condition is true the first already ran, and both use the same `now`, same `ttl`, same `retain` predicate — `retain` is idempotent, so the second pass evicts exactly nothing the first didn't.
If the intent was to catch entries that expired between the two checks, `now` must be refreshed before the second pass:

```diff
 // Enforce max size — run eviction again if still at capacity
 if self.cache.len() >= self.max_entries {
+    let now = Instant::now();
     let ttl = self.ttl;
     self.cache
         .retain(|_, &mut timestamp| now.duration_since(timestamp) < ttl);
 }
```

In practice the time delta between the two passes is microseconds, so even with the refresh it evicts ~nothing. Cleanest is to delete the added block; the existing 75%-threshold pass already handles it.
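The idempotence argument is easy to check in isolation: two `retain` passes with the same predicate and the same `now` cannot evict more than one. A stdlib sketch, modeling timestamps as plain seconds (an assumption — the real code uses `Instant`):

```rust
use std::collections::HashMap;

/// Evict entries older than `ttl` relative to `now`, returning how many
/// were removed. Running this twice with the same `now` removes nothing the
/// second time: `retain` already dropped every entry matching the predicate
/// on the first pass.
fn evict(cache: &mut HashMap<String, u64>, now: u64, ttl: u64) -> usize {
    let before = cache.len();
    cache.retain(|_, &mut t| now.saturating_sub(t) < ttl);
    before - cache.len()
}
```

This is exactly why the second pass only does work if `now` is refreshed between the calls — and even then only for entries that expired in the intervening microseconds.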
Summary
Added protocol-specific transaction hash parsing for the relayer API, enabling push-based message relaying for faster cross-chain message delivery.
Motivation & Benefits
Faster Message Delivery (Push vs Poll)
Before (Poll-based):

After (Push-based with this API): users POST to `/relay` with their tx hash.

**Multi-Protocol Support Problem**

The relayer API accepts tx hashes as strings but previously used generic hex decoding. This failed for non-hex formats:

- Solana signatures (base58, e.g. `5Hm...3Bn`)
- Radix transaction IDs (`txid_rdx1...`)
- Aleo transaction IDs (`at1...`)

Without protocol-specific parsing, the push-based API was effectively broken for these chains, forcing users to wait for the slow polling mechanism.
Impact
With this change:
Changes
**Core Trait Extension**

- Extends the `Indexer` trait with a `parse_tx_hash(&str) -> ChainResult<H512>` method
- Default implementation strips the `0x` prefix, decodes hex, and pads to 64 bytes

**Protocol-Specific Implementations**
**Solana** (`mailbox_indexer.rs`):
- Adds a `parse_sealevel_tx_hash()` utility function
- Decodes base58 signatures via `solana_sdk::bs58`

**Radix** (`utils.rs`):
- Adds a `parse_radix_tx_hash()` utility function
- Parses `txid_rdx1...`-format transaction IDs

**Aleo** (`utils.rs`):
- Adds a `parse_aleo_tx_hash()` utility function
- Parses `at1...`-format transaction IDs
- Adds the `bech32` dependency to `Cargo.toml`

**Relayer API Integration**
- Updates `extract_message()` to use `indexer.parse_tx_hash()` instead of generic hex decoding
✅ All packages compile successfully with clippy
✅ Verified implementations across:
hyperlane-corerelayerhyperlane-sealevelhyperlane-radixhyperlane-aleoProtocol Support Matrix
| Format | Example hashes | Parser |
| --- | --- | --- |
| Hex (`0x...`) | `0x123...abc`, `0x456...def`, `0x789...ghi`, `0xabc...123` | trait default (hex decode) |
| Base58 (Solana) | `5Hm...3Bn` | `parse_sealevel_tx_hash()` |
| `txid_rdx1...` (Radix) | `txid_rdx1...` | `parse_radix_tx_hash()` |
| `at1...` (Aleo) | `at1...` | `parse_aleo_tx_hash()` |

**Example Usage**
Design Decisions
- **Trait default method**: provides hex decoding for 90% of chains without explicit overrides
- **Utility functions**: avoid duplication between dispatch/delivery indexers
- **Error handling**: clear error messages for invalid formats (e.g., "Invalid base58 tx hash")
- **Backward compatible**: existing hex-based chains continue working without changes
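The default-method behavior described above (strip `0x`, decode hex, bottom-align into a 64-byte H512) can be sketched without the hyperlane types. The function name, error strings, and `[u8; 64]` stand-in for `H512` are illustrative, not the crate's actual signature:

```rust
/// Illustrative default parser: accepts an optionally 0x-prefixed hex
/// string and returns a 64-byte value with the hash bottom-aligned — a
/// 32-byte EVM hash lands in bytes [32..64], matching the bottom-aligned
/// H256 -> H512 conversion discussed in the review thread.
fn parse_hex_tx_hash(s: &str) -> Result<[u8; 64], String> {
    let hex = s.strip_prefix("0x").unwrap_or(s);
    if !hex.is_ascii() || hex.is_empty() || hex.len() % 2 != 0 || hex.len() > 128 {
        return Err(format!("invalid tx hash length: {}", hex.len()));
    }
    let mut bytes = Vec::with_capacity(hex.len() / 2);
    for i in (0..hex.len()).step_by(2) {
        let byte = u8::from_str_radix(&hex[i..i + 2], 16)
            .map_err(|_| format!("invalid hex in tx hash: {s}"))?;
        bytes.push(byte);
    }
    // Left-pad: place the decoded bytes in the low end of the 64-byte value.
    let mut out = [0u8; 64];
    out[64 - bytes.len()..].copy_from_slice(&bytes);
    Ok(out)
}
```

Chain-specific overrides (`parse_sealevel_tx_hash()` and friends) replace only the decoding step; the bottom-aligned padding convention is what keeps the H256↔H512 round-trip correct for EVM hashes.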