Thanos Query - PromQL Engine Thanos + Distributed with new cluster #8515

@Poil

Thanos, Prometheus and Golang version used:
Thanos 0.39.2

Object Storage Provider:
S3

What happened:
I created a new EKS cluster. We run in distributed mode with a central Query (plus Query Frontend).

We use a sharded store configuration:

    sharded:
      enabled: true
      timePartitioning:
        # One store for data older than 6 weeks
        - min: ""
          max: -6w

        # One store for data newer than 6 weeks and older than 2 weeks
        - min: -6w
          max: -2w

        # One store for data newer than 2 weeks
        - min: -2w
          max: ""

For this new cluster, I am unable to retrieve data older than the "Thanos Sidecar configured value".
It looks like the query is not fetching from the stores.

This is probably because stores 0 and 1 currently hold no data, so they announce no label set and no min/max time; only store 2 (data newer than 2 weeks) does.

EDIT
I removed the sharded store configuration; it made no difference.

  • Central Query Config
    replicaLabel: ['prometheus_replica', 'replica']
    extraFlags:
      - "--enable-auto-gomemlimit"
      - "--auto-gomemlimit.ratio=0.9"
      - "--query.auto-downsampling"
      - "--query.promql-engine=thanos"
      - "--query.mode=distributed"
      - "--query.partial-response"
    sdConfig:
      - targets:
          - 'thanos-query.production-aaa-eu-west-3-eks.xxx.ai:30901' 
          - 'thanos-query.production-aaa-eu-south-1-eks.xxx.ai:30901' # new cluster where we have the problem
          - 'thanos-query.production-aaa-us-east-1-eks.xxxx.ai:30901'
          - 'thanos-query.production-bb-eu-central-1-eks.xxxx.ai:30901'
          - 'thanos-query.production-bb-eu-west-1-eks.xxxx.ai:30901'
  • Clusters Query Config
      - "--enable-auto-gomemlimit"
      - "--auto-gomemlimit.ratio=0.9"
      - "--query.promql-engine=thanos"
      - "--query.partition-label=cluster"
      - '--selector-label=cluster="{{ index (splitList "." .config.path.filenameNormalized) 0 }}"' ## the cluster name

Full logs to relevant components:
Absolutely no error/warn logs

2025-10-13T09:13:15.037121556Z ts=2025-10-13T09:13:15.036900064Z caller=options.go:29 level=info protocol=gRPC msg="disabled TLS, key and cert must be set to enable"
2025-10-13T09:13:15.037751413Z ts=2025-10-13T09:13:15.037602681Z caller=query.go:634 level=info msg="starting query node"
2025-10-13T09:13:15.038392599Z ts=2025-10-13T09:13:15.038293538Z caller=intrumentation.go:75 level=info msg="changing probe status" status=healthy
2025-10-13T09:13:15.038549791Z ts=2025-10-13T09:13:15.038380989Z caller=http.go:72 level=info service=http/server component=query msg="listening for requests and metrics" address=0.0.0.0:10902
2025-10-13T09:13:15.039264778Z ts=2025-10-13T09:13:15.038951335Z caller=handler.go:87 level=info service=http/server component=query caller=tls_config.go:347 time=2025-10-13T09:13:15.038930885Z msg="Listening on" address=[::]:10902
2025-10-13T09:13:15.039273788Z ts=2025-10-13T09:13:15.038989785Z caller=handler.go:87 level=info service=http/server component=query caller=tls_config.go:350 time=2025-10-13T09:13:15.038982565Z msg="TLS is disabled." http2=false address=[::]:10902
2025-10-13T09:13:15.039278108Z ts=2025-10-13T09:13:15.039031206Z caller=intrumentation.go:56 level=info msg="changing probe status" status=ready
2025-10-13T09:13:15.039282068Z ts=2025-10-13T09:13:15.039082546Z caller=grpc.go:167 level=info service=gRPC/server component=query msg="listening for serving gRPC" address=0.0.0.0:10901
2025-10-13T09:13:20.049369193Z ts=2025-10-13T09:13:20.049170431Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.40.99:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049412793Z ts=2025-10-13T09:13:20.049234012Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.44.39:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049417483Z ts=2025-10-13T09:13:20.049252242Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.44.33:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049421444Z ts=2025-10-13T09:13:20.049266482Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.40.97:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049425234Z ts=2025-10-13T09:13:20.049279572Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.44.32:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049572305Z ts=2025-10-13T09:13:20.049292402Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.40.106:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheu
2025-10-13T09:13:20.049642396Z ts=2025-10-13T09:13:20.049392963Z caller=endpointset.go:346 level=info component=endpointset msg="adding new store with [storeEndpoints]" address=100.66.50.22:10901 extLset="{\"@thanos_compatibility_store_type\"=\"store\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"t
2025-10-13T09:13:20.049648696Z ts=2025-10-13T09:13:20.049421744Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.44.42:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus
2025-10-13T09:13:20.049662456Z ts=2025-10-13T09:13:20.049467864Z caller=endpointset.go:346 level=info component=endpointset msg="adding new sidecar with [storeEndpoints rulesAPI exemplarsAPI targetsAPI MetricMetadataAPI]" address=100.66.44.35:10901 extLset="{cluster=\"production-cmpt-eu-south-1-eks\", prometheus

Query Debug

ts=2025-10-13T09:25:32.109403983Z caller=proxy.go:291 level=debug msg="Tenant info in Series()" tenant=default-tenant
ts=2025-10-13T09:25:32.109902537Z caller=proxy.go:337 level=debug component=proxy request="min_time:1760347140001 max_time:1760347520000 matchers:<name:\"__name__\" value:\"thanos_objstore_bucket_last_successful_upload_time\" > matchers:<type:RE name:\"cluster\" value:\"production-cmpt-eu-south-1-eks\" > matchers:<name:\"job\" value:\"thanos-compactor\" > max_resolution_window:4 aggregates:MAX without_replica_labels:\"prometheus_replica\" without_replica_labels:\"replica\" " msg="Series: started fanout streams" status="
Store Addr: 100.66.40.106:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-1-0\"} MinTime: 1759836569000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.44.33:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-1-1\"} MinTime: 1759836567000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.44.32:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-3-1\"} MinTime: 1759836567000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.44.39:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-2-1\"} MinTime: 1759836582000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.40.99:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-0\"} MinTime: 1759836585000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.40.97:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-3-0\"} MinTime: 1759836582000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.28.7:10901 LabelSets: {\"@thanos_compatibility_store_type\"=\"store\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-0\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-1\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-1-0\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-1-1\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-2-0\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-2-1\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-3-0\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-3-1\"},{cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\"} MinTime: 1759836567181 MaxTime: 1760342400000 filtered out due to: does not have data within this time period: [1760347140001,1760347520000]. Store time ranges: [1759836567181,1760342400000];
Store Addr: 100.66.44.35:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-shard-2-0\"} MinTime: 1759836583000 MaxTime: 9223372036854775807 queried;
Store Addr: 100.66.44.42:10901 LabelSets: {cluster=\"production-cmpt-eu-south-1-eks\", prometheus=\"tooling/kube-prometheus-prometheus\", prometheus_replica=\"prometheus-kube-prometheus-prometheus-1\"} MinTime: 1759836575000 MaxTime: 9223372036854775807 queried"

Notes

Everything works when using the Prometheus PromQL engine instead of the Thanos engine.
