otelgrpc: suppress NotFound errors for client spans#316

Draft
lucasmeijer wants to merge 1 commit into buildbarn:main from lucasmeijer:lucas/otel-client-notfound-span-status

Conversation

@lucasmeijer
Contributor

I do not expect this PR to be accepted. I'm posting it as an illustration of a problem I have: in Bonanza, UploadObject traces are marked as errors when gRPC returns status code 5 (NotFound). That status is expected and valid in the protocol, but it causes my Bonanza OTel traces to contain a lot of noise that looks like real errors.

I’m not happy with how much boilerplate this approach requires, and I’m very open to a different direction.
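The behaviour being asked for can be sketched without any tracing library: treat OK and an allowlist of expected codes as non-errors, and everything else as an error. This is only an illustration, not the code in this PR; `classify`, `spanStatus`, and the `expected` set are hypothetical names, and only the numeric values (0 for OK, 5 for NotFound, 13 for Internal) come from gRPC itself.

```go
package main

import "fmt"

// Numeric gRPC status codes used in this sketch.
const (
	codeOK       uint32 = 0
	codeNotFound uint32 = 5
	codeInternal uint32 = 13
)

// spanStatus is a hypothetical stand-in for a tracing span status.
type spanStatus string

const (
	statusUnset spanStatus = "Unset"
	statusError spanStatus = "Error"
)

// classify returns the span status for a client call. Codes present in
// expected are treated as non-errors; every other non-OK code is an error.
// A nil expected set reproduces the default "everything but OK is an error".
func classify(code uint32, expected map[uint32]bool) spanStatus {
	if code == codeOK || expected[code] {
		return statusUnset
	}
	return statusError
}

func main() {
	// Assumption: the caller knows NotFound is a valid protocol answer.
	expected := map[uint32]bool{codeNotFound: true}

	fmt.Println(classify(codeNotFound, expected)) // NotFound suppressed
	fmt.Println(classify(codeInternal, expected)) // still an error
	fmt.Println(classify(codeNotFound, nil))      // default behaviour
}
```

The allowlist would have to be chosen per method or per client, since a NotFound that is routine for one call may be a real failure for another.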

@aspect-workflows
aspect-workflows bot commented Feb 3, 2026

Test

25 test targets passed

Targets
//pkg/auth:auth_test [k8-fastbuild]                                                           892ms
//pkg/blobstore/buffer:buffer_test [k8-fastbuild]                                             123ms
//pkg/blobstore/completenesschecking:completenesschecking_test [k8-fastbuild]                 142ms
//pkg/blobstore/grpcclients:grpcclients_test [k8-fastbuild]                                   129ms
//pkg/blobstore/grpcservers:grpcservers_test [k8-fastbuild]                                   189ms
//pkg/blobstore/local:local_test [k8-fastbuild]                                               130ms
//pkg/blobstore/mirrored:mirrored_test [k8-fastbuild]                                         77ms
//pkg/blobstore/readcaching:readcaching_test [k8-fastbuild]                                   100ms
//pkg/blobstore/readfallback:readfallback_test [k8-fastbuild]                                 64ms
//pkg/blobstore/replication:replication_test [k8-fastbuild]                                   122ms
//pkg/blobstore/sharding/integration:integration_test [k8-fastbuild]                          109ms
//pkg/blobstore/sharding:sharding_test [k8-fastbuild]                                         1s
//pkg/blobstore:blobstore_test [k8-fastbuild]                                                 135ms
//pkg/builder:builder_test [k8-fastbuild]                                                     121ms
//pkg/capabilities:capabilities_test [k8-fastbuild]                                           155ms
//pkg/digest:digest_test [k8-fastbuild]                                                       69ms
//pkg/filesystem/path:path_test [k8-fastbuild]                                                134ms
//pkg/grpc:grpc_test [k8-fastbuild]                                                           150ms
//pkg/http/server:server_test [k8-fastbuild]                                                  134ms
//pkg/jmespath:jmespath_test [k8-fastbuild]                                                   452ms
//pkg/jwt:jwt_test [k8-fastbuild]                                                             210ms
//pkg/otel:otel_test [k8-fastbuild]                                                           107ms
//pkg/prometheus:prometheus_test [k8-fastbuild]                                               56ms
//pkg/util:util_test [k8-fastbuild]                                                           80ms
//pkg/x509:x509_test [k8-fastbuild]                                                           130ms

Total test execution time was 5s. 4 tests (13.8%) were fully cached saving 342ms.

@EdSchouten
Member

I think this is something that should be raised on the OpenTelemetry side. We just use the stock middleware here.

@lucasmeijer
Contributor Author

I suspect I'm not going to get far over there.

The gRPC client semantic conventions explicitly say:

[2] rpc.response.status_code: All status codes except OK SHOULD be considered errors.

(from https://opentelemetry.io/docs/specs/semconv/rpc/grpc/)

Maybe the solution is just that I give up on the idea that traces that went fine contain no errors.
Or I could work on making Bonanza not use gRPC status code 5 to report that something doesn't exist, and instead have a proto response for that scenario (keeping status code 5 for cases where we consider not finding the object an actual error).
