
High cardinality Prometheus metric #8740

@zktaiga


Describe the bug

We are seeing high series cardinality from validator_monitor_prev_epoch_on_chain_balance: the metric carries an index label, and there is no flag to disable per-validator metrics. Labeling a metric by validator index is generally an anti-pattern in Prometheus, because every distinct label value becomes its own time series.
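To make the cardinality concern concrete, here is a minimal sketch that uses neither Lodestar nor prom-client code. `LabeledGauge` and `simulateEpochs` are hypothetical names introduced only for illustration; the class mimics how a Prometheus client keeps one series per distinct label set.

```typescript
// Hypothetical stand-in for a labeled gauge; NOT the prom-client or Lodestar API.
class LabeledGauge {
  // One map entry per distinct label set, i.e. one entry per time series.
  private readonly series = new Map<string, number>();

  set(labels: Record<string, number | string>, value: number): void {
    // Build a stable key from the label names and values.
    const key = Object.keys(labels)
      .sort()
      .map((name) => `${name}="${labels[name]}"`)
      .join(",");
    this.series.set(key, value);
  }

  seriesCount(): number {
    return this.series.size;
  }
}

// Sets the gauge for every validator index on every epoch and
// returns how many distinct series exist afterwards.
function simulateEpochs(validatorCount: number, epochs: number): number {
  const gauge = new LabeledGauge();
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (let index = 0; index < validatorCount; index++) {
      // The balance value is arbitrary; only the label set matters for cardinality.
      gauge.set({index}, 32e9 + epoch);
    }
  }
  return gauge.seriesCount();
}
```

In this sketch the series count equals the number of distinct index values no matter how many epochs pass, so a node monitoring thousands of validators exposes thousands of series for this one metric.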

Code:

  • metric registration:
    // Validator Monitor Metrics (per-epoch summaries)
    // Only track prevEpochOnChainBalance per index
    prevEpochOnChainBalance: register.gauge<{index: number}>({
      name: "validator_monitor_prev_epoch_on_chain_balance",
      help: "Balance of validator after an epoch",
      labelNames: ["index"],
    }),
  • metric set each epoch:
    const balance = balances?.[index];
    if (balance !== undefined) {
      validatorMonitorMetrics?.prevEpochOnChainBalance.set({index}, balance);
    }
  • validator monitor created when metrics or validator monitor logs are enabled:
    const validatorMonitor =
      opts.metrics.enabled || opts.validatorMonitor.validatorMonitorLogs
        ? createValidatorMonitor(
            metrics?.register ?? null,
            config,
            anchorState.genesisTime,
            logger.child({module: LoggerModule.vmon}),
            opts.validatorMonitor
          )

Request: provide a way to limit this cardinality, or to gate the registration of this metric.

Other clients have solved this with flags: once the number of monitored validators exceeds a configurable threshold, they expose aggregates instead of high-cardinality per-validator metrics.
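One possible shape for such a gate is sketched below. This is an assumption-laden illustration, not existing Lodestar code: the `Gauge` and `IndexedGauge` interfaces, the `individualTrackingThreshold` option, and the `reportBalances` function are all hypothetical names. Aggregates are always reported, while per-validator series are emitted only when the monitored set is at or below the threshold.

```typescript
// Hypothetical minimal gauge interfaces; stand-ins, NOT the Lodestar or prom-client API.
interface Gauge {
  set(value: number): void;
}
interface IndexedGauge {
  set(labels: {index: number}, value: number): void;
}

// Hypothetical option; the name is an assumption made for this sketch.
interface BalanceMetricsOpts {
  individualTrackingThreshold: number;
}

interface BalanceMetrics {
  perValidator: IndexedGauge;
  totalBalance: Gauge;
  validatorCount: Gauge;
}

// balances maps validator index -> balance for the previous epoch.
function reportBalances(
  balances: Map<number, number>,
  metrics: BalanceMetrics,
  opts: BalanceMetricsOpts
): void {
  // Aggregates have constant cardinality, so they are always reported.
  let total = 0;
  balances.forEach((balance) => {
    total += balance;
  });
  metrics.totalBalance.set(total);
  metrics.validatorCount.set(balances.size);

  // Per-validator series are emitted only for small validator sets.
  if (balances.size <= opts.individualTrackingThreshold) {
    balances.forEach((balance, index) => {
      metrics.perValidator.set({index}, balance);
    });
  }
}
```

A real implementation would also need to decide what happens to per-validator series that were already exported before the validator count crossed the threshold; this sketch does not address that.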

Expected behavior

Metrics should have manageable cardinality.

Steps to reproduce

Run thousands of validators on a single client.

Additional context

No response

Operating system

Linux

Lodestar version or commit hash

v1.38.0


Labels

meta-bug: Issues that identify a bug and require a fix.
scope-metrics: All issues with regards to the exposed metrics.
