operator: improve error handling in critical path (#190)
**Problem:** Internal and external operators have seen transient errors
crash the tip router operator. These errors are generally caused by
timed-out RPC requests. While examining loop_stages, I found errors that
may not be handled the way we would like.
**Solution:**
- Wait for the epoch info and schedule RPC requests to return, and log
failures. This state is required to do anything useful with the
operator. A failed request should be handled gracefully, since this
action is periodic.
- Do not handle submit_to_ncn in CastVote with `?`. Log an error
instead, which gives the operator a chance to vote again and recover
from any RPC issues that were responsible for the failure.
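
A minimal sketch of the second point, not the actual operator code. It
assumes a hypothetical `RpcError` type, a hypothetical `VoteOutcome`
enum, and a closure standing in for submit_to_ncn; none of these names
or signatures are taken from the repository. The idea is only that the
error is matched and logged instead of being propagated with `?`, so the
loop survives a transient failure and can vote again:

```rust
use std::fmt;

/// Hypothetical stand-in for an RPC failure (for example, a timeout).
#[derive(Debug)]
pub struct RpcError(pub String);

impl fmt::Display for RpcError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "rpc error: {}", self.0)
    }
}

/// Hypothetical result of one CastVote attempt.
#[derive(Debug, PartialEq)]
pub enum VoteOutcome {
    Submitted,
    RetryNextIteration,
}

/// Attempt to cast a vote. The error is logged rather than returned,
/// so a transient failure does not unwind the caller's loop.
pub fn cast_vote<F>(mut submit_to_ncn: F) -> VoteOutcome
where
    F: FnMut() -> Result<(), RpcError>,
{
    match submit_to_ncn() {
        Ok(()) => VoteOutcome::Submitted,
        Err(e) => {
            // With `?` here, this error would propagate and end the loop.
            eprintln!("submit_to_ncn failed, will retry: {}", e);
            VoteOutcome::RetryNextIteration
        }
    }
}

/// Simplified periodic loop: keep trying until the vote lands or the
/// iteration budget runs out. Returns the iteration that succeeded.
pub fn run_loop<F>(mut submit_to_ncn: F, max_iterations: usize) -> Option<usize>
where
    F: FnMut() -> Result<(), RpcError>,
{
    for iteration in 1..=max_iterations {
        if cast_vote(&mut submit_to_ncn) == VoteOutcome::Submitted {
            return Some(iteration);
        }
    }
    None
}
```

In this sketch, a closure that fails twice and then succeeds makes
`run_loop` return on the third iteration instead of aborting on the
first error. The same match-and-log shape would apply to the epoch info
and schedule requests in the first point.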