Description
What is the problem you are trying to solve?
Right now, the mimir-distributed Helm chart doesn’t set a default GOMEMLIMIT value for all Mimir components. Additionally, manually setting GOMEMLIMIT via Helm can be cumbersome in complex deployments where multiple stacks have different resource allocations.
Ideally, GOMEMLIMIT should be automatically set based on a predefined ratio of a container's memory limit, with an option to override that ratio if needed. This would help ensure better memory management while reducing the risk of OOMKills.
Which solution do you envision (roughly)?
Similar to how Alloy handles memory limits, I’d suggest that Mimir use a package like automemlimit. This would automatically set GOMEMLIMIT to 90% of the container’s memory limit unless explicitly overridden via an environment variable.
Example code:
// With no options, automemlimit applies its defaults.
memLimit, err := memlimit.SetGoMemLimitWithOpts()
if err != nil {
    level.Error(util_log.Logger).Log("msg", "Failed to set GOMEMLIMIT", "err", err)
} else {
    level.Info(util_log.Logger).Log("msg", "Set GOMEMLIMIT using automemlimit", "memory_limit_bytes", memLimit)
}
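To make the intended behavior concrete, here is a minimal standard-library sketch of the same idea: read the container's memory limit, scale it by a ratio, and let an explicit GOMEMLIMIT take precedence. It is not how automemlimit or Mimir implements this. The environment variable name `MIMIR_MEMLIMIT_RATIO`, the helper names, and the choice to read only the cgroup v2 file `/sys/fs/cgroup/memory.max` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// Hypothetical names for this sketch; none of them exist in Mimir or automemlimit.
const (
	defaultRatio = 0.9
	ratioEnvVar  = "MIMIR_MEMLIMIT_RATIO"
	cgroupV2Max  = "/sys/fs/cgroup/memory.max" // assumes cgroup v2
)

// parseRatio returns the ratio encoded in raw, or def when raw is empty,
// unparsable, or outside the interval (0, 1].
func parseRatio(raw string, def float64) float64 {
	if raw == "" {
		return def
	}
	r, err := strconv.ParseFloat(raw, 64)
	if err != nil || !(r > 0 && r <= 1) { // the negated form also rejects NaN
		return def
	}
	return r
}

// parseCgroupLimit parses the content of memory.max. ok is false when the
// container is unlimited ("max") or the content is not a positive integer.
func parseCgroupLimit(content string) (limit int64, ok bool) {
	s := strings.TrimSpace(content)
	if s == "" || s == "max" {
		return 0, false
	}
	v, err := strconv.ParseInt(s, 10, 64)
	if err != nil || v <= 0 {
		return 0, false
	}
	return v, true
}

// goMemLimit scales the container limit by ratio, truncating to whole bytes.
func goMemLimit(containerLimit int64, ratio float64) int64 {
	return int64(float64(containerLimit) * ratio)
}

func main() {
	// An explicit GOMEMLIMIT wins: the Go runtime has already applied it.
	if os.Getenv("GOMEMLIMIT") != "" {
		fmt.Println("GOMEMLIMIT is set explicitly, leaving it untouched")
		return
	}
	data, err := os.ReadFile(cgroupV2Max)
	if err != nil {
		fmt.Println("could not read cgroup memory limit:", err)
		return
	}
	limit, ok := parseCgroupLimit(string(data))
	if !ok {
		fmt.Println("container has no memory limit, nothing to do")
		return
	}
	ratio := parseRatio(os.Getenv(ratioEnvVar), defaultRatio)
	target := goMemLimit(limit, ratio)
	debug.SetMemoryLimit(target) // runtime equivalent of setting GOMEMLIMIT
	fmt.Printf("soft memory limit set to %d bytes (ratio %.2f of %d)\n", target, ratio, limit)
}
```

In practice a library such as automemlimit is preferable to hand-rolled detection, since a real implementation also has to deal with cgroup v1 and nested cgroup hierarchies, which this sketch ignores.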
Have you considered any alternatives?
Pass in ENV via Helm
As noted above, manually setting GOMEMLIMIT via Helm is cumbersome in complex deployments where multiple stacks have different resource allocations. That said, improving the mimir-distributed Helm chart so that it sets sane GOMEMLIMIT values by default could be a valid solution.
Mutating Webhook
Use a Kyverno policy to mutate pods and inject the GOMEMLIMIT environment variable.
Any additional context to share?
No response
How long do you think this would take to be developed?
Small (<= 1 month dev)
What are the documentation dependencies?
No response
Proposer?
No response