Commit a4b10ca

Clamp wiggle room at 2GB / 2 CPU
Otherwise we were losing a lot of resources on large nodes
1 parent 8ecb07e commit a4b10ca

1 file changed

+8
-10
lines changed

deployer/commands/generate/resource_allocation/generate_choices.py

Lines changed: 8 additions & 10 deletions
@@ -48,16 +48,14 @@ def proportional_memory_strategy(
     # We operate on *available* memory, which already accounts for system components (like kubelet & systemd)
     # as well as daemonsets we run on every node. This represents the resources that are available
     # for user pods.
-
-    WIGGLE_ROOM = 0.02
-
-    available_node_mem = nodeinfo["available"]["memory"] * (1 - WIGGLE_ROOM)
-    available_node_cpu = nodeinfo["available"]["cpu"] * (1 - WIGGLE_ROOM)
-
-    # Only show one digit after . for CPU, but round *down* not up so we never
-    # say they are getting more CPU than our limit is set to. We multiply & divide
-    # with a floor, as otherwise 3.75 gets rounded to 3.8, not 3.7
-    cpu_display = math.floor(available_node_cpu * 10) / 10
+    # In addition, we provide some wiggle room to account for additional daemonset requests or other
+    # issues that may pop up due to changes outside our control (like k8s upgrades). This is either
+    # 2% of the available capacity, or 2GB / 1 CPU (whichever is smaller)
+    mem_overhead_wiggle = min(nodeinfo["available"]["memory"] * 0.02, 2 * 1024 * 1024 * 1024)
+    cpu_overhead_wiggle = min(nodeinfo["available"]["cpu"] * 0.02, 1)
+
+    available_node_mem = nodeinfo["available"]["memory"] - mem_overhead_wiggle
+    available_node_cpu = nodeinfo["available"]["cpu"] - cpu_overhead_wiggle
 
     # We always start from the top, and provide a choice that takes up the whole node.
     mem_limit = available_node_mem
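
To see why the clamp matters on large nodes, here is a minimal standalone sketch comparing the old flat 2% reserve with the clamped reserve from this change. The helper names (`flat_wiggle`, `clamped_wiggle`) and the node sizes are hypothetical, chosen only for illustration; they are not part of the deployer code. It assumes memory is expressed in bytes, as the `2 * 1024 * 1024 * 1024` constant in the diff suggests.

```python
GIB = 1024 * 1024 * 1024

# Values taken from the diff: 2% of capacity, capped at 2 GiB / 1 CPU.
WIGGLE_FRACTION = 0.02
MEM_WIGGLE_CAP = 2 * GIB
CPU_WIGGLE_CAP = 1


def flat_wiggle(available_mem, available_cpu):
    """Old behaviour: always reserve 2% of the available capacity."""
    return available_mem * WIGGLE_FRACTION, available_cpu * WIGGLE_FRACTION


def clamped_wiggle(available_mem, available_cpu):
    """New behaviour: reserve 2% of capacity, but never more than the cap."""
    mem = min(available_mem * WIGGLE_FRACTION, MEM_WIGGLE_CAP)
    cpu = min(available_cpu * WIGGLE_FRACTION, CPU_WIGGLE_CAP)
    return mem, cpu


if __name__ == "__main__":
    # Hypothetical node sizes: (available memory in bytes, available CPUs).
    nodes = {"small": (16 * GIB, 4), "large": (1024 * GIB, 128)}
    for name, (mem, cpu) in nodes.items():
        old_mem, old_cpu = flat_wiggle(mem, cpu)
        new_mem, new_cpu = clamped_wiggle(mem, cpu)
        print(
            f"{name}: old reserve {old_mem / GIB:.2f} GiB / {old_cpu:.2f} CPU, "
            f"new reserve {new_mem / GIB:.2f} GiB / {new_cpu:.2f} CPU"
        )
```

With these constants the memory cap takes effect once a node has more than 100 GiB available (2 GiB / 0.02), and the CPU cap once it has more than 50 CPUs (1 / 0.02). Below those sizes both strategies reserve the same amount.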
