Commit 277ae9d

up
1 parent df1d83c commit 277ae9d

2 files changed

Lines changed: 181 additions & 0 deletions

docs/vpa-resource-optimization.md

Lines changed: 54 additions & 0 deletions
@@ -2,6 +2,60 @@
How to use VPA, Goldilocks, and Kyverno to right-size Kubernetes resource requests based on actual workload behavior.

## TL;DR: Just Tell Me What To Do

**Everything is automatic.** VPA is already watching every workload in the cluster. You don't need to set anything up.

### Step 1: Open the dashboard

Go to **https://goldilocks.vanillax.me** in your browser (you must be on the LAN or VPN).

### Step 2: Pick a namespace

Click any namespace (e.g., `argocd`, `immich`, `home-assistant`). You'll see every workload with its current resource settings and what VPA recommends.

### Step 3: Look for problems

The dashboard shows color-coded recommendations. Look for:
- **Current request way below "Target"** = pod is under-provisioned; increase it
- **Current request way above "Target"** = pod is reserving resources it doesn't use; decrease it
- **Current request below "Lower Bound"** = request is under the minimum VPA recommends, so performance or availability is likely suffering; fix ASAP
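
The rules above can be sketched as a small decision function. This is only an illustration: `classify` is a hypothetical name, the 2x cut-off for "way above" is borrowed from the legend printed by `scripts/vpa-report.sh` in this commit, treating everything between the target and 2x the target as fine is an assumption, and the sample numbers are made up (the inputs are plain integers, e.g. millicores).

```python
def classify(current: int, lower_bound: int, target: int) -> str:
    """Map a current request to an action, following the rules listed above."""
    if current < lower_bound:
        return "INCREASE NOW"   # under the minimum VPA recommends
    if current < target:
        return "INCREASE"       # under-provisioned
    if current > 2 * target:    # assumed meaning of "way above"
        return "DECREASE"       # over-provisioned
    return "KEEP"


# Made-up values in millicores: lower bound 1000m, target 2000m
for current in (500, 1500, 2000, 5000):
    print(current, classify(current, lower_bound=1000, target=2000))
```

Memory can be run through the same function once both values are expressed in bytes.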

### Step 4: Apply changes

Edit the app's `values.yaml` in Git, update the `resources:` block, and push; ArgoCD then applies the change. Add a comment explaining why:

```yaml
# VPA-optimized (2026-02-24): target was 2000m, previous 500m
resources:
  requests:
    cpu: 2000m
    memory: 1Gi
```

### Step 5: Wait and re-check

VPA recommendations update continuously. Check back in a week to see whether the new request now sits close to the target. Don't change things daily.

### Quick script to see all recommendations

```bash
# Full report with human-readable values and action guidance
./scripts/vpa-report.sh

# Filter to one namespace
./scripts/vpa-report.sh argocd

# Or raw kubectl one-liner (shows only the first container of each workload)
kubectl get vpa -A -o custom-columns=\
NS:.metadata.namespace,\
NAME:.metadata.name,\
CPU:.status.recommendation.containerRecommendations[0].target.cpu,\
MEM:.status.recommendation.containerRecommendations[0].target.memory
```
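
The `custom-columns` one-liner reads only `containerRecommendations[0]`, so additional containers in a pod (sidecars, for instance) do not appear in its output. A minimal Python sketch of walking every container in the JSON from `kubectl get vpa -A -o json` might look like the following. The function name `iter_targets` and the sample document, including its VPA and container names, are hypothetical.

```python
import json


def iter_targets(vpa_list):
    """Yield (namespace, vpa, container, cpu, memory) for every container."""
    for vpa in vpa_list.get("items", []):
        meta = vpa["metadata"]
        recommendation = (vpa.get("status") or {}).get("recommendation") or {}
        for rec in recommendation.get("containerRecommendations") or []:
            target = rec.get("target", {})
            yield (
                meta["namespace"],
                meta["name"],
                rec.get("containerName", "?"),
                target.get("cpu", "?"),
                target.get("memory", "?"),
            )


# Made-up sample shaped like `kubectl get vpa -A -o json` output
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"namespace": "argocd", "name": "server"},
   "status": {"recommendation": {"containerRecommendations": [
     {"containerName": "server", "target": {"cpu": "25m", "memory": "262144k"}},
     {"containerName": "sidecar", "target": {"cpu": "15m", "memory": "104857600"}}
   ]}}},
  {"metadata": {"namespace": "immich", "name": "new-app"}}
]}
""")

rows = list(iter_targets(SAMPLE))
for row in rows:
    print(*row)
```

To run it against a live cluster, replace `SAMPLE` with `json.load(sys.stdin)` and pipe the `kubectl` output in. A VPA with no recommendation yet, like the second item in the sample, yields no rows.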

---

## The Toolchain

| Tool | What It Does | Location |

scripts/vpa-report.sh

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
#!/bin/bash
# vpa-report.sh: Show VPA recommendations (target and lower/upper bounds) per container.
# Current requests are not fetched; compare them against this output yourself.
# Usage: ./scripts/vpa-report.sh [namespace]
# If no namespace given, shows all namespaces

set -euo pipefail

# Build the namespace arguments as an array so a quoted expansion stays intact
NS_ARGS=(-A)
if [[ -n "${1:-}" ]]; then
  NS_ARGS=(-n "$1")
fi

echo "=========================================="
echo " VPA Resource Recommendations Report"
echo "=========================================="
echo ""

# Get all VPAs with recommendations
kubectl get vpa "${NS_ARGS[@]}" -o json 2>/dev/null | python3 -c "
import json, sys

def bytes_to_human(b):
    \"\"\"Convert a plain byte count to Ki/Mi/Gi; other values pass through unchanged.\"\"\"
    try:
        n = int(b)
    except (ValueError, TypeError):
        return str(b)
    if n >= 1073741824:
        return f'{n/1073741824:.1f}Gi'
    elif n >= 1048576:
        return f'{n/1048576:.0f}Mi'
    elif n >= 1024:
        return f'{n/1024:.0f}Ki'
    return str(n)

def cpu_to_milli(cpu):
    \"\"\"Normalize CPU to millicores string.\"\"\"
    if cpu is None:
        return '?'
    s = str(cpu)
    if s.endswith('m'):
        return s
    try:
        # round() rather than int(): float math can land just under a whole
        # number, which int() would truncate
        return f'{round(float(s) * 1000)}m'
    except ValueError:
        return s

data = json.load(sys.stdin)
items = data.get('items', [])

if not items:
    print('No VPA resources found.')
    sys.exit(0)

# Collect results
results = []
for vpa in items:
    ns = vpa['metadata']['namespace']
    name = vpa['metadata']['name']
    target_ref = vpa.get('spec', {}).get('targetRef', {})
    target_kind = target_ref.get('kind', '?')
    target_name = target_ref.get('name', '?')

    recs = vpa.get('status', {}).get('recommendation', {}).get('containerRecommendations', [])
    if not recs:
        # VPA exists but has not produced a recommendation yet
        results.append({
            'ns': ns, 'name': name, 'kind': target_kind,
            'container': '-', 'cpu_target': 'waiting...', 'mem_target': 'waiting...',
            'cpu_lower': '-', 'cpu_upper': '-',
            'mem_lower': '-', 'mem_upper': '-',
        })
        continue

    for rec in recs:
        target = rec.get('target', {})
        lower = rec.get('lowerBound', {})
        upper = rec.get('upperBound', {})
        results.append({
            'ns': ns,
            'name': name,
            'kind': target_kind,
            'container': rec.get('containerName', '?'),
            'cpu_target': cpu_to_milli(target.get('cpu')),
            'mem_target': bytes_to_human(target.get('memory', '?')),
            'cpu_lower': cpu_to_milli(lower.get('cpu')),
            'cpu_upper': cpu_to_milli(upper.get('cpu')),
            'mem_lower': bytes_to_human(lower.get('memory', '?')),
            'mem_upper': bytes_to_human(upper.get('memory', '?')),
        })

# Print table (range columns are 17 wide so typical ranges are not cut off)
fmt = '{:<20} {:<35} {:<25} {:>10} {:>17} {:>10} {:>17}'
print(fmt.format('NAMESPACE', 'WORKLOAD', 'CONTAINER', 'CPU TGT', 'CPU RANGE', 'MEM TGT', 'MEM RANGE'))
print('-' * 140)

# Sort by namespace then name
results.sort(key=lambda r: (r['ns'], r['name']))

for r in results:
    cpu_range = f'{r[\"cpu_lower\"]}-{r[\"cpu_upper\"]}'
    mem_range = f'{r[\"mem_lower\"]}-{r[\"mem_upper\"]}'
    print(fmt.format(
        r['ns'][:20],
        f'{r[\"kind\"]}/{r[\"name\"]}'[:35],
        r['container'][:25],
        r['cpu_target'],
        cpu_range[:17],
        r['mem_target'],
        mem_range[:17],
    ))

print()
print(f'Total: {len(results)} containers with VPA recommendations')
print()
print('Legend:')
print(' CPU TGT = recommended CPU request (millicores)')
print(' MEM TGT = recommended memory request')
print(' RANGE = lowerBound-upperBound')
print()
print('Action needed if your current request is:')
print(' < lowerBound → INCREASE NOW (below the minimum VPA recommends)')
print(' < target → INCREASE (under-provisioned)')
print(' ≈ target → KEEP (well-tuned)')
print(' > 2x target → DECREASE (over-provisioned)')
"
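
One limitation of `bytes_to_human` in the script above: it converts only plain integer byte counts. A memory value that carries a Kubernetes quantity suffix (for example `262144k` or `100Mi`) falls into the `except` branch and is printed unchanged. A possible extension, sketched here under assumptions, is to normalize such strings to bytes first. `memory_to_bytes` is a hypothetical helper that is not part of this commit, and it handles only the common binary and decimal suffixes (no exponent notation).

```python
# Multipliers for common Kubernetes memory suffixes (binary and decimal SI).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def memory_to_bytes(quantity) -> int:
    """Hypothetical helper: convert a quantity such as '262144k' to bytes."""
    q = str(quantity).strip()
    for suffix, factor in _SUFFIXES.items():
        if q.endswith(suffix):
            return round(float(q[: -len(suffix)]) * factor)
    return int(q)  # plain byte count; raises ValueError for anything else


def bytes_to_human(n: int) -> str:
    """Same thresholds as the script's formatter, for an integer byte count."""
    if n >= 2**30:
        return f"{n / 2**30:.1f}Gi"
    if n >= 2**20:
        return f"{n / 2**20:.0f}Mi"
    if n >= 2**10:
        return f"{n / 2**10:.0f}Ki"
    return str(n)


for raw in ("262144k", "104857600", "1.5Gi"):
    print(raw, "->", bytes_to_human(memory_to_bytes(raw)))
```

Chaining the two functions renders every supported spelling in the same units, so the `MEM TGT` and `MEM RANGE` columns stay comparable across rows.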
