fix(coprocessor): host-listener, timeout slow reconnection for catchup #1989

Merged
mergify[bot] merged 1 commit into main from rudy/fix/main/slow-reconnect-catchup-mainnet
Feb 16, 2026

@rudy-6-4
Contributor

No description provided.

@rudy-6-4 rudy-6-4 requested a review from a team as a code owner February 13, 2026 16:10
@cla-bot cla-bot bot added the cla-signed label Feb 13, 2026
@rudy-6-4
Contributor Author

#1988

@rudy-6-4 rudy-6-4 force-pushed the rudy/fix/main/slow-reconnect-catchup-mainnet branch from f7b1468 to 8ce6f26 Compare February 13, 2026 16:18
@rudy-6-4
Contributor Author

@claude /pr-review

@claude

claude bot commented Feb 13, 2026

Claude finished @rudy-6-4's task in 2m 21s


[AUTOMATED]

PR Review: fix(coprocessor): host-listener, timeout slow reconnection for catchup

  • Pre-review validation
  • Gather context (PR diff, guidelines)
  • Run parallel review agents
  • Post final review

Summary

This PR adds a tokio::time::timeout wrapper around the ProviderBuilder::new().connect_ws(ws) call in get_blocks_logs_range_no_retry(), preventing the catchup path from hanging indefinitely when WebSocket connection establishment is slow. This is a targeted, well-scoped fix.

Findings

No bugs found. All four review agents (2× bug hunter, error handling auditor, guideline compliance) confirmed the code is correct:

  • Type handling — The nested Result<Result<Provider, Error>, Elapsed> from the timeout wrapper is correctly handled with three exhaustive match arms
  • Error propagation — All errors are properly propagated via anyhow::bail! to consume_catchup_blocks(), which has appropriate retry logic
  • Resource safety — When the timeout fires, the dropped connect_ws future cleans up the in-progress TCP/WebSocket connection
  • Timeout value — Uses self.timeout_request_websocket (default 15s), consistent with all other RPC timeouts in the file
  • Improved error messages — The old code discarded the connection error (Err(_) => bail!("Cannot get a provider")); the new code preserves it ("Cannot get provider for logs range due to {err}")
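The three-arm handling of the nested result described above can be sketched without an async runtime. This is a minimal, std-only illustration and not the PR's code: `Elapsed` is a local stand-in for tokio's `tokio::time::error::Elapsed`, `flatten_timeout` is a hypothetical helper name, errors are returned as `String` rather than raised with `anyhow::bail!`, and the wording of the timeout message is assumed. Only the "Cannot get provider for logs range due to {err}" text comes from the review.

```rust
use std::fmt::Display;

/// Local stand-in for `tokio::time::error::Elapsed` (assumption: the real
/// type comes from tokio; this sketch avoids the dependency).
#[derive(Debug)]
pub struct Elapsed;

/// Hypothetical helper: collapse the nested result produced by a timeout
/// wrapper into a single `Result`, keeping the underlying error text.
pub fn flatten_timeout<T, E: Display>(
    outcome: Result<Result<T, E>, Elapsed>,
    timeout_secs: u64,
) -> Result<T, String> {
    match outcome {
        // The connection attempt finished in time and succeeded.
        Ok(Ok(value)) => Ok(value),
        // The attempt finished in time but failed: preserve the cause.
        Ok(Err(err)) => Err(format!(
            "Cannot get provider for logs range due to {err}"
        )),
        // The timer fired first (message wording is illustrative).
        Err(Elapsed) => Err(format!(
            "Cannot get provider for logs range due to timeout after {timeout_secs}s"
        )),
    }
}
```

With tokio, the `outcome` argument would be whatever `tokio::time::timeout(...).await` returns. Because the `match` names all three cases explicitly, a connection failure and a timeout stay distinguishable in the error text that reaches the caller's retry logic.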

Nit: Minor style inconsistency

The new code splits the timeout creation and .await into two separate statements:

let provider = tokio::time::timeout(
    Duration::from_secs(self.timeout_request_websocket),
    ProviderBuilder::new().connect_ws(ws),
);
let provider = match provider.await { ... };

The get_logs timeout in the same function (line 348) chains .await directly:

match tokio::time::timeout(
    Duration::from_secs(self.timeout_request_websocket),
    provider.get_logs(&filter),
)
.await
{
    ...
}

Nit: Consider matching the existing pattern for consistency. Not blocking — both forms are functionally identical.

Pre-existing observation (not introduced by this PR)

new_log_stream_no_retry() (line 768) and get_chain_id() (line 260) also call connect_ws without a timeout. These are pre-existing and outside the scope of this fix, but could benefit from the same treatment in a follow-up.

Verdict

LGTM ✅ — Clean, well-targeted fix that prevents slow WebSocket reconnections from blocking the catchup loop. The error handling is correct and the error messages are improved.


@rudy-6-4
Contributor Author

We have observed on mainnet that establishing the connection for catchup can take up to 1 minute at bad times.

@mergify

mergify bot commented Feb 13, 2026

🧪 CI Insights

Here's what we observed from your CI run for 8ce6f26.

🟢 All jobs passed!

But CI Insights is watching 👀

@rudy-6-4
Contributor Author

@Mergifyio queue

@mergify

mergify bot commented Feb 16, 2026

Merge Queue Status

Rule: main


This pull request spent 1 hour 58 minutes 42 seconds in the queue, including 1 hour 48 minutes 44 seconds running CI.


mergify bot added a commit that referenced this pull request Feb 16, 2026
@mergify mergify bot merged commit 5b381fb into main Feb 16, 2026
62 checks passed
@mergify mergify bot deleted the rudy/fix/main/slow-reconnect-catchup-mainnet branch February 16, 2026 15:51