ING-1399: Fix flakiness in graceful shutdown tests#355
Conversation
634e895 to
1b0734b
Compare
3a81055 to
85426a2
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (2)
gateway/gateway.go:444
ctx, cancel := context.WithCancel(ctx)is created beforesystem.NewSystem(...). IfNewSystemreturns an error,cancelis never called, which can leak any goroutines tied to that derived context. Consider callingcancel()on the error path (or deferring it until afterNewSystemsucceeds).
ctx, cancel := context.WithCancel(ctx)
config.Logger.Info("initializing protostellar system")
gatewaySys, err := system.NewSystem(&system.SystemOptions{
Logger: config.Logger.Named("gateway-system"),
DataImpl: dataImpl,
DapiImpl: dapiImpl,
Metrics: metrics.GetSnMetrics(),
RateLimiter: rateLimiter,
GrpcTlsConfig: &tls.Config{
ClientCAs: config.ClientCaCert,
ClientAuth: tls.VerifyClientCertIfGiven,
GetCertificate: func(chi *tls.ClientHelloInfo) (*tls.Certificate, error) {
return g.atomicGrpcCert.Load(), nil
},
},
DapiTlsConfig: &tls.Config{
GetCertificate: func(chi *tls.ClientHelloInfo) (*tls.Certificate, error) {
return g.atomicDapiCert.Load(), nil
},
ClientCAs: config.ClientCaCert,
ClientAuth: tls.VerifyClientCertIfGiven,
},
ShutdownTimeout: config.ShutdownTimeout,
AlphaEndpoints: config.AlphaEndpoints,
Debug: config.Debug,
Cancel: cancel,
})
if err != nil {
config.Logger.Error("error creating legacy proxy")
return err
}
gateway/test/dapi_graceful_shutdown_test.go:129
- The response body is never closed in the request loop. Not closing
resp.Bodycan leak connections/resources and can also affect keep-alive behavior, which is relevant for a shutdown test. Consider reading/discarding the body (if needed) and callingresp.Body.Close()each iteration.
return
}
respCloseChan <- resp.Close
time.Sleep(time.Millisecond * 10)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
b1bf2f4 to
59073c5
Compare
f0e6b22 to
4a49b6c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
f526391 to
7d5604f
Compare
ING-1399: Use hooks to delay grpc request server side
7d5604f to
80e2fb1
Compare
1b7c9e6 to
892f9fb
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
892f9fb to
00219a9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Logger *zap.Logger | ||
| ExecResult interface{} | ||
| ExecError error | ||
| executed bool |
There was a problem hiding this comment.
Why did this need to be added? I think the intention was that hooks were required to have 'execute' as part of them?
There was a problem hiding this comment.
It was originally added because we call ServeHTTP in runACtion_Execute and don't want to call it again in the Run() function. Since ServeHTTP does.t return anything we couldn't do things the same way as grpc where we set a fields for the resp and err and uses that to decide if we ned to execute the request later.
In the latest commit I have added an interceptor writer that allows us to access the http response inside the actions like we do for grpc and we can check if the request has been executed based on that.
| HooksContext *HooksContext | ||
| Handler grpc.UnaryHandler | ||
| HTTPHandler http.Handler | ||
| HTTPWriter http.ResponseWriter |
There was a problem hiding this comment.
Why does the HTTPWriter need to be included here? Shouldn't it flow through the result the same way we do in GRPC, with the exception that we write it out differently at the end?
There was a problem hiding this comment.
It was included because we actually server the request and wrote the response inside of run state. I;ve changed it so that we use an interceptor to get the http response/error which we now propagate back up to HandleHTTPRequest and then we write the response there.
417ba04 to
9f2821d
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
9f2821d to
2fcf4a2
Compare
This PR changes how we test graceful shutdown, we now ensure that:
To support configurable server side delays we use the barrier hooks for gRPC. This meant that we had to expose the hooks manager through the startup call back, as we can't use an RPC to unblock the hook as the server is shutting down so this must be done in process. Such middleware was not implemented for Data API so this has been added and put behind the Debug flag.
The original plan was to use slow running queries. However the time taken for a query to execute is non-deterministic so this approach would just introduce a different type of flakiness.