I would like to trace my clue health checks.
I would also like to bound the maximum time that any Ping can take, so that I can use Checker to enforce a service-level objective on maximum request duration without creating a thundering herd of overlapping pings that consume resources. (Basically: cancel the client context to release any server resources that are being held by "deep pings" to critical dependencies.)
Instrumenting health seems straightforward; see #586 for a proof of concept. My primary question is whether we should build OTel awareness into the health package, but given that our go.mod already depends on it, it seems harmless to do so.
Adding a timeout to pingers also seems easy; we could add an Option and trigger a context.WithTimeout when the option is provided.