Skip to content

Commit 71764bf

Browse files
authored
fix: stabilize flaky CI (tests-mac-linux (ubuntu-24.04)) (#19187)
# fix: handle empty JSON-RPC responses during server shutdown ## Problem The CI test `TestInvalidReceiptHashHighMgas` in `execution/tests` fails intermittently with: ``` unexpected end of JSON input ``` This is a race condition between the RPC server shutting down (during test cleanup) and in-flight engine API requests. When the server's `run` flag is set to `false` during shutdown, `serveSingleRequest` returns `nil`, and `ServeHTTP` sends a 200 OK response with an empty body. The client then tries to `json.Unmarshal` the empty body, producing the "unexpected end of JSON input" error. Because this error doesn't match any of the configured retryable error patterns (only "connection refused" was retried), the backoff library treats it as a permanent failure and the test fails immediately. ## Root Cause The failure chain: 1. Server shutdown sets `s.run` to `false` 2. An in-flight request reaches `ServeHTTP` → `serveSingleRequest` 3. `serveSingleRequest` checks `!s.run.Load()` and returns `nil` (no error message) 4. `ServeHTTP` sees `nil` error, writes no response body, returns 200 OK with empty body 5. Client's `doRequest` returns empty `[]byte{}` with no error (status 200 passes the check) 6. `sendHTTP` calls `json.Unmarshal([]byte{}, &respMsg)` → "unexpected end of JSON input" 7. `maybeMakePermanent` doesn't recognize this as retryable → `backoff.Permanent(err)` 8. Test fails ## Fix Three-part defensive fix addressing both server and client sides: ### 1. Server-side: Early shutdown check in `ServeHTTP` (`rpc/http.go`) Added a check for `s.run.Load()` immediately after request validation. If the server is stopped, return 503 Service Unavailable instead of proceeding to `serveSingleRequest` which would silently produce an empty response. ### 2. Client-side: Reject empty response bodies (`rpc/http.go`) Added a check in `doRequest` that returns a clear error ("empty response from JSON-RPC server") when the response body is empty, instead of letting `json.Unmarshal` fail with the cryptic "unexpected end of JSON input". ### 3. Client-side: Default retryable error checkers (`execution/engineapi/engine_api_jsonrpc_client.go`) Added default retryable error checkers in `DialJsonRpcClient` that always retry on: - `"empty response from JSON-RPC server"` — transient empty responses during server restart/shutdown - `"503 Service Unavailable"` — server explicitly signaling it's not ready These defaults are prepended to any caller-supplied checkers, ensuring transient server unavailability is always retried regardless of the caller's configuration. ## Testing - Ran `TestInvalidReceiptHashHighMgas` 12+ consecutive times — all passed - `make lint` passes with 0 issues - `make erigon` builds successfully - `make integration` builds successfully ## Files Changed - `rpc/http.go` — Server-side 503 on shutdown + client-side empty body rejection - `execution/engineapi/engine_api_jsonrpc_client.go` — Default retryable error checkers
1 parent b768625 commit 71764bf

File tree

2 files changed

+19
-0
lines changed

2 files changed

+19
-0
lines changed

execution/engineapi/engine_api_jsonrpc_client.go

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,6 +102,15 @@ func DialJsonRpcClient(url string, jwtSecret []byte, logger log.Logger, opts ...
102102
if err != nil {
103103
return nil, err
104104
}
105+
106+
// Always retry on transient server errors (e.g., server shutting down returning
107+
// empty response or 503 Service Unavailable).
108+
defaultCheckers := []RetryableErrChecker{
109+
ErrContainsRetryableErrChecker("empty response from JSON-RPC server"),
110+
ErrContainsRetryableErrChecker("503 Service Unavailable"),
111+
}
112+
options.retryableErrCheckers = append(defaultCheckers, options.retryableErrCheckers...)
113+
105114
res := &JsonRpcClient{
106115
rpcClient: client,
107116
maxRetries: options.maxRetries,

rpc/http.go

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -181,6 +181,10 @@ func (hc *httpConn) doRequest(ctx context.Context, msg any) ([]byte, error) {
181181
return nil, fmt.Errorf("%s: %s", resp.Status, string(respBody))
182182
}
183183

184+
if len(respBody) == 0 {
185+
return nil, errors.New("empty response from JSON-RPC server")
186+
}
187+
184188
return respBody, nil
185189
}
186190

@@ -246,6 +250,12 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
246250
return
247251
}
248252

253+
// Don't serve if server is stopped.
254+
if !s.run.Load() {
255+
w.WriteHeader(http.StatusServiceUnavailable)
256+
return
257+
}
258+
249259
// Create request-scoped context.
250260
connInfo := PeerInfo{Transport: "http", RemoteAddr: r.RemoteAddr}
251261
connInfo.HTTP.Version = r.Proto

0 commit comments

Comments
 (0)