diff --git a/skills/compare-llm-d-configurations/SKILL.md b/skills/compare-llm-d-configurations/SKILL.md
index 3721174..3cc492c 100644
--- a/skills/compare-llm-d-configurations/SKILL.md
+++ b/skills/compare-llm-d-configurations/SKILL.md
@@ -93,6 +93,44 @@ Only needed if at least one run is being deployed fresh. Both runs share the sam
 2. Check for an active `oc project`: `oc project -q 2>/dev/null`
 3. Otherwise ask the user.
 
+### 0.5 Check for Baseline Configuration
+
+If either run is labeled as "baseline" (doesn't use llm-d scheduling) or the user indicates one configuration is a baseline run, perform the following checks:
+
+1. **Verify the baseline service exists**:
+   ```bash
+   kubectl get svc llm-d-baseline-model-server -n $NAMESPACE
+   ```
+
+2. **If the service does not exist**, inform the user that you will create it, then apply the baseline service YAML:
+   ```bash
+   kubectl apply -f skills/compare-llm-d-configurations/llm-d-baseline-model-server-svc.yaml -n $NAMESPACE
+   ```
+
+   Or if that file is not available locally, create it inline:
+   ```bash
+   cat <<EOF | kubectl apply -n $NAMESPACE -f -
+   apiVersion: v1
+   kind: Service
+   metadata:
+     name: llm-d-baseline-model-server
+   spec:
+     selector:
+       llm-d.ai/inference-serving: "true"
+     ports:
+       - name: http
+         protocol: TCP
+         port: 8000
+         targetPort: 8000
+     type: ClusterIP
+   EOF
+   ```
+
+3. **Note the baseline endpoint**: the baseline run targets the service directly at:
+   ```
+   http://llm-d-baseline-model-server.<NAMESPACE>.svc.cluster.local:8000
+   ```
+
 ---
 
 ## Phase 1: Run A
diff --git a/skills/compare-llm-d-configurations/llm-d-baseline-model-server-svc.yaml b/skills/compare-llm-d-configurations/llm-d-baseline-model-server-svc.yaml
new file mode 100644
index 0000000..c887c66
--- /dev/null
+++ b/skills/compare-llm-d-configurations/llm-d-baseline-model-server-svc.yaml
@@ -0,0 +1,13 @@
+apiVersion: v1
+kind: Service
+metadata:
+  name: llm-d-baseline-model-server
+spec:
+  selector:
+    llm-d.ai/inference-serving: "true"
+  ports:
+    - name: http
+      protocol: TCP
+      port: 8000
+      targetPort: 8000
+  type: ClusterIP