Performance issues updating from Bitnami Thanos image to Quay Thanos #8471

nb-ccd · 2025-09-04T16:31:03Z


Hi,

We have been running Thanos 0.32.1 through the Bitnami Thanos chart (https://charts.bitnami.com/bitnami) using their default image. In response to their new image policy, which introduces fees, we want to move to the image published on Quay by the Thanos team themselves.

However, we've run into some major performance issues after moving to the Quay image of the same version: Grafana is getting a large number of HTTP 504s from the Thanos Query endpoint, and I see similar failures when port-forwarding to query the Thanos endpoint directly:

kubectl port-forward service/kube-prometheus-stack-thanos-query 9090:9090

time curl 'http://127.0.0.1:9090/api/v1/query_range?<random-query>'
curl: (52) Empty reply from server
curl   0.01s user 0.02s system 0% cpu 6:17.73 total

Or alternatively:

% curl -v 'http://127.0.0.1:9090/api/v1/query_range?<random-query>'
*   Trying 127.0.0.1:9090...
* Connected to 127.0.0.1 (127.0.0.1) port 9090
> GET /api/v1/query_range?<random-query> HTTP/1.1
> Host: 127.0.0.1:9090
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 503 Service Unavailable
< Access-Control-Allow-Headers: Accept, Accept-Encoding, Authorization, Content-Type, Origin
< Access-Control-Allow-Methods: GET, OPTIONS
< Access-Control-Allow-Origin: *
< Access-Control-Expose-Headers: Date
< Cache-Control: no-store
< Content-Type: application/json
< Vary: Accept-Encoding
< Date: Thu, 04 Sep 2025 16:12:09 GMT
< Content-Length: 92
<
{"status":"error","errorType":"timeout","error":"query timed out in expression evaluation"}
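
To tell these failure modes apart when repeating the test, a small script can time each request and summarise the error body. This is only a sketch: the function names, the 400-second client timeout and the `up` query in the usage note are assumptions for illustration, not part of our setup. The `query`, `start`, `end` and `step` parameters are those of the Prometheus-compatible `query_range` API.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request


def build_query_range_url(base, query, start, end, step):
    """Build a query_range URL from a base address and range parameters."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base}/api/v1/query_range?{params}"


def classify_body(body):
    """Summarise an API response body as 'success' or '<errorType>: <error>'."""
    try:
        payload = json.loads(body)
    except ValueError:
        # For example an HTML error page returned by a proxy in front of the API.
        return "non-JSON body"
    if payload.get("status") == "success":
        return "success"
    return f"{payload.get('errorType')}: {payload.get('error')}"


def timed_query(url, timeout_s=400):
    """Return (elapsed_seconds, outcome) for one GET request.

    timeout_s is a hypothetical client-side limit, chosen only so that the
    client waits longer than the roughly six-minute failure shown above.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            outcome = classify_body(resp.read())
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (such as the 503 above) can still carry a JSON body.
        outcome = f"HTTP {exc.code}: {classify_body(exc.read())}"
    except OSError as exc:
        # Covers refused or dropped connections and socket timeouts.
        outcome = f"connection error: {exc}"
    return time.monotonic() - started, outcome
```

For example, calling `timed_query(build_query_range_url("http://127.0.0.1:9090", "up", start, end, 60))` against the port-forward above, with `start` and `end` as Unix timestamps, would return the elapsed seconds together with either `success`, the API's error type and message, or a connection-level error.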

Looking in Thanos Query's logs I see repeated DeadlineExceeded messages:

ts=2025-09-04T15:29:43.107461171Z caller=endpointset.go:460 level=warn component=endpointset msg="update of endpoint failed" err="getting metadata: fallback fetching info from 172.32.86.46:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=172.32.86.46:10901
ts=2025-09-04T15:29:42.625503501Z caller=endpointset.go:460 level=warn component=endpointset msg="update of endpoint failed" err="getting metadata: fallback fetching info from 172.32.61.200:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=172.32.61.200:10901
ts=2025-09-04T15:30:22.246625944Z caller=endpointset.go:460 level=warn component=endpointset msg="update of endpoint failed" err="getting metadata: fallback fetching info from 172.32.61.200:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=172.32.61.200:10901
ts=2025-09-04T15:30:22.24653799Z caller=endpointset.go:460 level=warn component=endpointset msg="update of endpoint failed" err="getting metadata: fallback fetching info from 172.32.86.46:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=172.32.86.46:10901
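
To see whether the failures are spread evenly across the endpoints or concentrated on one of them, the warnings can be tallied per address. The following is a minimal sketch that assumes the logfmt layout shown in the lines above; the helper names are made up for illustration.

```python
import re
from collections import Counter

# Matches logfmt pairs such as key=value or key="quoted value".
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Parse one logfmt line into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def deadline_failures(lines):
    """Count 'update of endpoint failed' DeadlineExceeded warnings per address."""
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if (
            fields.get("msg") == "update of endpoint failed"
            and "DeadlineExceeded" in fields.get("err", "")
        ):
            counts[fields.get("address", "unknown")] += 1
    return counts
```

Passing the four log lines above to `deadline_failures` gives a count of 2 for each of `172.32.86.46:10901` and `172.32.61.200:10901`. The same function could be fed a longer capture of the Thanos Query log to check whether any endpoint ever updates successfully.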

I don't see anything logged relevant to this in the Prometheus pod or its Thanos sidecar.

I don't see any differences in the Dockerfiles between the Quay and Bitnami images in terms of runtime arguments. The docker images at the version I'm using are here:

There are no configuration changes or different environment variables in the image. If there are differences, it looks like they are introduced at compile time and so are not represented in the Dockerfile.

Has anyone else who has migrated run into similar issues, or do any expert users have advice on how to debug these timeouts?

(Side note: I realise we are running an older Thanos version, happy to upgrade if there is reason to believe it would help, but our preference was to stick with the current application version while making the initial change between images to reduce the number of factors changing.)

Replies: 0 comments
