Commit 6b1852b

update inferno metrics naming to wva
Signed-off-by: Mohammed Abdi <mohammed.munir.abdi@ibm.com>
1 parent 0bfd94d commit 6b1852b

27 files changed

Lines changed: 131 additions & 132 deletions

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -256,7 +256,7 @@ kubectl logs -n workload-variant-autoscaler-system \
 kubectl describe variantautoscaling <name> -n <namespace>
 
 # Check emitted metrics
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/wva_desired_replicas" | jq
 ```
 
 ### Cleaning Up

charts/workload-variant-autoscaler/README.md

Lines changed: 1 addition & 1 deletion
@@ -300,5 +300,5 @@ kubectl logs pod prometheus-adapter-xxxxx -n openshift-user-workload-monitoring
 ```
 3. Check, after a few minutes following installation, for metric collection
 ```
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/$NAMESPACE/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/$NAMESPACE/wva_desired_replicas" | jq
 ```

charts/workload-variant-autoscaler/templates/hpa.yaml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ spec:
   - type: External
     external:
       metric:
-        name: inferno_desired_replicas
+        name: wva_desired_replicas
         selector:
           matchLabels:
             variant_name: {{ printf "%s-decode" .Values.llmd.modelName }}

config/samples/hpa-integration.yaml

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ spec:
   - type: External
     external:
       metric:
-        name: inferno_desired_replicas
+        name: wva_desired_replicas
         selector:
           matchLabels:
             variant_name: vllme-deployment

config/samples/prometheus-adapter-values-ocp.yaml

Lines changed: 4 additions & 4 deletions
@@ -4,15 +4,15 @@ prometheus:
 
 rules:
   external:
-  - seriesQuery: 'inferno_desired_replicas{variant_name!="",exported_namespace!=""}'
+  - seriesQuery: 'wva_desired_replicas{variant_name!="",exported_namespace!=""}'
     resources:
       overrides:
         exported_namespace: {resource: "namespace"}
         variant_name: {resource: "deployment"}
     name:
-      matches: "^inferno_desired_replicas"
-      as: "inferno_desired_replicas"
-    metricsQuery: 'inferno_desired_replicas{<<.LabelMatchers>>}'
+      matches: "^wva_desired_replicas"
+      as: "wva_desired_replicas"
+    metricsQuery: 'wva_desired_replicas{<<.LabelMatchers>>}'
 
 replicas: 2
 logLevel: 4

config/samples/prometheus-adapter-values.yaml

Lines changed: 4 additions & 4 deletions
@@ -4,15 +4,15 @@ prometheus:
 
 rules:
   external:
-  - seriesQuery: 'inferno_desired_replicas{variant_name!="",exported_namespace!=""}'
+  - seriesQuery: 'wva_desired_replicas{variant_name!="",exported_namespace!=""}'
     resources:
       overrides:
         exported_namespace: {resource: "namespace"}
         variant_name: {resource: "deployment"}
     name:
-      matches: "^inferno_desired_replicas"
-      as: "inferno_desired_replicas"
-    metricsQuery: 'inferno_desired_replicas{<<.LabelMatchers>>}'
+      matches: "^wva_desired_replicas"
+      as: "wva_desired_replicas"
+    metricsQuery: 'wva_desired_replicas{<<.LabelMatchers>>}'
 
 replicas: 2
 logLevel: 4

deploy/README.md

Lines changed: 6 additions & 6 deletions
@@ -538,7 +538,7 @@ spec:
   - type: External
     external:
       metric:
-        name: inferno_desired_replicas
+        name: wva_desired_replicas
         selector:
           matchLabels:
             variant_name: my-vllm-deployment-decode
@@ -717,7 +717,7 @@ kubectl port-forward -n <monitoring-namespace> svc/prometheus-k8s 9090:9090
 kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler | grep "Collected metrics"
 
 # 4. Verify external metrics API (if using HPA)
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/wva_desired_replicas" | jq
 ```
 
 ### Monitoring WVA
@@ -807,7 +807,7 @@ watch kubectl get hpa -n <namespace>
 watch kubectl get pods -n <namespace>
 
 # Watch external metrics
-watch 'kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/inferno_desired_replicas" | jq'
+watch 'kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/wva_desired_replicas" | jq'
 ```
 
 ## Troubleshooting
@@ -913,14 +913,14 @@ my-hpa Deployment/vllm <unknown>/1(avg) 1 10 1
 kubectl describe hpa <name> -n <namespace>
 
 # Check external metrics API on the specified namespace
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<your-namespace>/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<your-namespace>/wva_desired_replicas" | jq
 
 # Check Prometheus Adapter logs
 kubectl logs -n <monitoring-namespace> deployment/prometheus-adapter
 
 # Check if WVA is emitting the metric
 kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler | \
-grep "inferno_desired_replicas"
+grep "wva_desired_replicas"
 ```
 
 **Common causes**:
@@ -941,7 +941,7 @@ kubectl get configmap prometheus-adapter -n <monitoring-namespace> -o yaml
 
 # Verify metric exists in Prometheus
 kubectl port-forward -n <monitoring-namespace> svc/prometheus-k8s 9090:9090
-# Query: inferno_desired_replicas{variant_name="<name>"}
+# Query: wva_desired_replicas{variant_name="<name>"}
 ```
 
 ### Getting Help

deploy/install.sh

Lines changed: 1 addition & 1 deletion
@@ -732,7 +732,7 @@ print_summary() {
 echo " kubectl logs -n $WVA_NS -l app.kubernetes.io/name=workload-variant-autoscaler -f"
 echo ""
 echo "4. Check external metrics API:"
-echo " kubectl get --raw \"/apis/external.metrics.k8s.io/v1beta1/namespaces/$LLMD_NS/inferno_desired_replicas\" | jq"
+echo " kubectl get --raw \"/apis/external.metrics.k8s.io/v1beta1/namespaces/$LLMD_NS/wva_desired_replicas\" | jq"
 echo ""
 echo "5. Port-forward Prometheus to view metrics:"
 echo " kubectl port-forward -n $MONITORING_NAMESPACE svc/${PROMETHEUS_SVC_NAME} ${PROMETHEUS_PORT}:${PROMETHEUS_PORT}"

deploy/kubernetes/README.md

Lines changed: 2 additions & 2 deletions
@@ -445,7 +445,7 @@ kubectl describe variantautoscaling ms-inference-scheduling-llm-d-modelservice-d
 kubectl get hpa -n llm-d-inference-scheduling
 
 # Check external metrics
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduling/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduling/wva_desired_replicas" | jq
 ```
 
 ### Monitor WVA Logs (See Metrics Validation!)
@@ -470,7 +470,7 @@ kubectl port-forward -n workload-variant-autoscaler-monitoring \
 
 # Visit http://localhost:9090
 # Query: vllm:request_success_total
-# Query: inferno_desired_replicas
+# Query: wva_desired_replicas
 ```
 
 ### Access Grafana Dashboards

deploy/openshift/README.md

Lines changed: 2 additions & 2 deletions
@@ -311,7 +311,7 @@ export HF_TOKEN="hf_xxxxxxxxxxxxxxxxxxxxx"
 
 ```bash
 kubectl get pods -n openshift-user-workload-monitoring | grep prometheus-adapter
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduler/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduler/wva_desired_replicas" | jq
 ```
 
 ### vLLM Pods Not Starting
@@ -344,7 +344,7 @@ kubectl get variantautoscaling -n llm-d-inference-scheduler
 kubectl get hpa -n llm-d-inference-scheduler
 
 # Check external metrics
-kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduler/inferno_desired_replicas" | jq
+kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-inference-scheduler/wva_desired_replicas" | jq
 ```
 
 ### Monitor WVA Logs
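
Because the rename touches docs, Helm templates, sample manifests, and Prometheus Adapter rules, a leftover reference to the old name could leave an HPA or adapter rule pointing at a metric that, assuming the controller now exposes only `wva_desired_replicas`, no longer exists. The following is a minimal sketch for scanning a checkout for such leftovers. The function name `find_old_metric` is hypothetical and not part of this repository, and the sketch assumes a `grep` that supports `-r` and `--exclude-dir`.

```shell
# Hypothetical helper (not part of this repository): list any remaining
# references to the old metric name under a directory (default: current dir).
find_old_metric() {
  # -r recurses, -n prints line numbers, --exclude-dir skips Git metadata.
  grep -rn --exclude-dir=.git "inferno_desired_replicas" "${1:-.}"
}
```

Run as `find_old_metric .` from the repository root. The function exits non-zero when nothing matches, which is the desired outcome after the rename.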
