Describe the bug
While upgrading from 1.18.1 to 1.19.0, some components required me to add the --distributor.shard-by-all-labels=true arg and would not start otherwise, failing with a "failed to load runtime config" error.
Components:
- alertmanager
- compactor
- overrides-exporter
- query-frontend
- store-gateway
We do have ingester.max-global-series-per-user set.
To Reproduce
Steps to reproduce the behavior:
- Set distributor.shard-by-all-labels=true in the configuration on the distributor and querier components (default is false).
- Use global-series-per-user within the tenant overrides.
- Change Cortex image from 1.18.1 to 1.19.0
- Wait for restart of component and observe failure. See log excerpts below.
Expected behavior
I would expect only the distributor and querier to require this configuration, per the docs and as was the case on previous versions.
Environment:
- Infrastructure: Kubernetes
- Deployment tool: jsonnet
Additional Context
While looking at the diff between the versions, this change to pkg/cortex/runtime_config.go seemed relevant (or link to commit directly).
Log excerpts for each component:
alertmanager:
alertmanager ts=2025-05-07T22:52:04.250387661Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
alertmanager ts=2025-05-07T22:52:04.250437925Z caller=cortex.go:451 level=error msg="module failed" module=memberlist-kv err="failed to start memberlist-kv, because it depends on module server, which has failed: invalid service state: Stopping, expected: Running"
alertmanager ts=2025-05-07T22:52:04.250453482Z caller=cortex.go:451 level=error msg="module failed" module=alertmanager err="failed to start alertmanager, because it depends on module server, which has failed: invalid service state: Stopping, expected: Running"
compactor:
compactor ts=2025-05-07T21:07:05.974944401Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
compactor ts=2025-05-07T21:07:05.974970033Z caller=cortex.go:451 level=error msg="module failed" module=compactor err="failed to start compactor, because it depends on module runtime-config, which has failed: invalid service state: Failed, expected: Running, failure: invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
overrides-exporter:
overrides-exporter ts=2025-05-07T23:46:00.915248501Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
query-frontend:
query-frontend ts=2025-05-07T23:26:24.092666571Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
query-frontend ts=2025-05-07T23:26:24.09277482Z caller=cortex.go:451 level=error msg="module failed" module=query-frontend-tripperware err="failed to start query-frontend-tripperware, because it depends on module runtime-config, which has failed: invalid service state: Failed, expected: Running, failure: invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
query-frontend ts=2025-05-07T23:26:24.092791764Z caller=cortex.go:451 level=error msg="module failed" module=query-frontend err="failed to start query-frontend, because it depends on module query-frontend-tripperware, which has failed: context canceled"
store-gateway:
store-gateway ts=2025-05-07T22:40:36.588298103Z caller=cortex.go:451 level=error msg="module failed" module=runtime-config err="invalid service state: Failed, expected: Running, failure: failed to load runtime config: load file: The ingester.max-global-series-per-user limit is unsupported if distributor.shard-by-all-labels is disabled"
store-gateway ts=2025-05-07T22:40:36.588335863Z caller=cortex.go:451 level=error msg="module failed" module=store-gateway err="failed to start store-gateway, because it depends on module memberlist-kv, which has failed: context canceled"
friedrichg commented on May 14, 2025
the flag should never be disabled
We need to try #6021 again
SungJin1212 commented on May 16, 2025
It seems to be due to https://github.com/cortexproject/cortex/pull/6340/files#diff-f70ef13978fead446903645dc3a53f599c9986caef0ee88bd079e46f09231f53R81-R86, but your distributor.shard-by-all-labels value is true, right?
If you set it to true only for the distributor and querier, other components would fail at 1.19.0.
yeya24 commented on Jul 13, 2025
@SungJin1212 Nice find. This does indeed seem to be a behavior change from that code path.
Help wanted. I think we can fix this by checking the enabled Cortex target to decide whether to perform the check.
SungJin1212 commented on Jul 14, 2025
@yeya24 I will fix it.