[1.19.0] While upgrading, --distributor.shard-by-all-labels is now required on non-related components. #6741

@EpiJunkie

Describe the bug
While upgrading from 1.18.1 to 1.19.0, several components refused to start with a "failed to load runtime config" error until I added the --distributor.shard-by-all-labels=true argument to each of them.

Components:

  • alertmanager
  • compactor
  • overrides-exporter
  • query-frontend
  • store-gateway

We do have ingester.max-global-series-per-user set.

To Reproduce
Steps to reproduce the behavior:

  1. Set distributor.shard-by-all-labels=true in the configuration on the distributor and querier components (default is false).
  2. Set the ingester.max-global-series-per-user limit within the tenant overrides.
  3. Change Cortex image from 1.18.1 to 1.19.0
  4. Wait for restart of component and observe failure. See log excerpts below.

Expected behavior
I would expect only the distributor and querier to require this setting, as the docs state and as was the case in previous versions.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: jsonnet

Additional Context

While looking at the diff between the versions, this change to pkg/cortex/runtime_config.go seemed relevant (or link to commit directly).

Log excerpts for each component:

alertmanager:

alertmanager ts=2025-05-07T22:52:04.250387661Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
alertmanager ts=2025-05-07T22:52:04.250437925Z caller=cortex.go:451 level=error msg="module failed" module=memberlist-kv err="failed to start memberlist-kv, because it depends on module server, which has failed: invalid service state: Stopping, expected: Running"
alertmanager ts=2025-05-07T22:52:04.250453482Z caller=cortex.go:451 level=error msg="module failed" module=alertmanager err="failed to start alertmanager, because it depends on module server, which has failed: invalid service state: Stopping, expected: Running"

compactor:

compactor ts=2025-05-07T21:07:05.974944401Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
compactor ts=2025-05-07T21:07:05.974970033Z caller=cortex.go:451 level=error msg="module failed" module=compactor err="failed to start compactor, because it depends on module runtime-config, which has failed: invalid service state: Failed, expected: Running, failure: invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"

overrides-exporter:

overrides-exporter ts=2025-05-07T23:46:00.915248501Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"

query-frontend:

query-frontend ts=2025-05-07T23:26:24.092666571Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
query-frontend ts=2025-05-07T23:26:24.09277482Z caller=cortex.go:451 level=error msg="module failed" module=query-frontend-tripperware err="failed to start query-frontend-tripperware, because it depends on module runtime-config, which has failed: invalid service state: Failed, expected: Running, failure: invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
query-frontend ts=2025-05-07T23:26:24.092791764Z caller=cortex.go:451 level=error msg="module failed" module=query-frontend err="failed to start query-frontend, because it depends on module query-frontend-tripperware, which has failed: context canceled"

store-gateway:

store-gateway ts=2025-05-07T22:40:36.588298103Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
store-gateway ts=2025-05-07T22:40:36.588335863Z caller=cortex.go:451 level=error msg="module failed" module=store-gateway err="failed to start store-gateway, because it depends on module memberlist-kv, which has failed: context canceled"

Activity

added the issue type on May 13, 2025

friedrichg (Member) commented on May 14, 2025

The flag should never be disabled.
We need to try #6021 again.

SungJin1212 (Member) commented on May 16, 2025

This seems to be due to https://github.com/cortexproject/cortex/pull/6340/files#diff-f70ef13978fead446903645dc3a53f599c9986caef0ee88bd079e46f09231f53R81-R86. But your distributor.shard-by-all-labels value is true, right?

If you set it to true only for the distributor and querier, the other components will fail on 1.19.0.
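The behavior described here can be illustrated with a minimal Go sketch. It assumes the check simply rejects a non-zero series limit whenever the flag is off. The `Limits` type and `validateLimits` function are hypothetical stand-ins rather than the actual Cortex code; only the error text is taken from the logs above.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits is a hypothetical, trimmed-down stand-in for per-tenant overrides.
type Limits struct {
	MaxGlobalSeriesPerUser int
}

// Error text copied verbatim from the log excerpts in this issue.
var errUnsupported = errors.New("The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled")

// validateLimits (hypothetical) mirrors the assumed check: a global series
// limit is rejected when the component's shard-by-all-labels flag is false.
func validateLimits(l Limits, shardByAllLabels bool) error {
	if l.MaxGlobalSeriesPerUser > 0 && !shardByAllLabels {
		return errUnsupported
	}
	return nil
}

func main() {
	override := Limits{MaxGlobalSeriesPerUser: 150000}

	// Each component sees its own flag value, so a component that never
	// received the flag uses the default (false) and rejects the overrides.
	components := []struct {
		name             string
		shardByAllLabels bool
	}{
		{"distributor", true},
		{"querier", true},
		{"compactor", false},
	}
	for _, c := range components {
		fmt.Printf("%s: %v\n", c.name, validateLimits(override, c.shardByAllLabels))
	}
}
```

In this sketch the same overrides file is accepted by the distributor and querier but rejected by the compactor, which matches the failures reported above.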

yeya24 (Contributor) commented on Jul 13, 2025

@SungJin1212 Nice find. This does seem to be a behavior change from that code path.

Help wanted. I think we can fix this by checking the enabled Cortex target to decide whether to perform the check.
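One possible shape for such a fix, sketched in Go under stated assumptions: `checkedTargets`, `targetNeedsCheck`, and `validateRuntimeLimits` are hypothetical names, and the set of targets that should run the check is a guess based on this issue, not the actual Cortex implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// checkedTargets is an assumed set of targets for which the sharding check
// is meaningful; the real list would need to be decided by the maintainers.
var checkedTargets = map[string]bool{
	"all":         true,
	"distributor": true,
	"querier":     true,
}

// targetNeedsCheck reports whether any enabled target is in checkedTargets.
func targetNeedsCheck(targets []string) bool {
	for _, t := range targets {
		if checkedTargets[t] {
			return true
		}
	}
	return false
}

// validateRuntimeLimits (hypothetical) skips the check entirely for
// components that do not use the flag, and otherwise applies it as before.
func validateRuntimeLimits(targets []string, maxGlobalSeries int, shardByAllLabels bool) error {
	if !targetNeedsCheck(targets) {
		return nil
	}
	if maxGlobalSeries > 0 && !shardByAllLabels {
		return errors.New("The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled")
	}
	return nil
}

func main() {
	// A compactor without the flag loads the overrides without error.
	fmt.Println(validateRuntimeLimits([]string{"compactor"}, 150000, false)) // prints <nil>
	// A distributor without the flag is still rejected.
	fmt.Println(validateRuntimeLimits([]string{"distributor"}, 150000, false))
}
```

Gating on the target keeps the existing protection for the components that rely on the flag while letting unrelated components load the same runtime config.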

self-assigned this on Jul 14, 2025
SungJin1212 (Member) commented on Jul 14, 2025

@yeya24
I will fix it.
