Convert multi-model deploy script from bash to Go#1015
Conversation
Addresses review feedback from #1014 to move away from bash deployment scripts for readability, type safety, and concurrent model deployment. Key improvements: - Models 2..N deploy concurrently via goroutines (bash was sequential) - Connectivity verification uses kubectl port-forward from the Go process, eliminating the in-cluster curl Job and its Docker Hub image (curlimages/curl:latest) - Kubernetes resources (Gateway, HTTPRoute) created via dynamic client instead of heredoc YAML - Proper error handling and structured logging The Go tool is invoked via `go run ./deploy/multimodel` from the same Makefile targets (deploy-multi-model-infra, undeploy-multi-model-infra). Made-with: Cursor
GPU Pre-flight Check ✅GPUs are available for e2e-openshift tests. Proceeding with deployment.
|
- Add INSTALL_GATEWAY_CTRLPLANE to Makefile passthrough (default: false for shared clusters with existing Istio) - Set E2E_TESTS_ENABLED=true to prevent interactive prompts in install.sh - Guard modelArtifacts.labels yq call in infra_llmd.sh to avoid schema validation errors on chart versions that don't support custom labels - Remove unused import in main.go Made-with: Cursor
GPU Pre-flight Check ✅GPUs are available for e2e-openshift tests. Proceeding with deployment.
|
|
Thanks, please test this PR locally with command similar to: Please share a snippet of the test being passed. I know this is manual for now. |
|
Yes, tested locally on OpenShift cluster (wva-bench-test namespace). Full deploy + undeploy + benchmark run completed successfully: Both models deployed and reachable through Gateway |
|
@kahilam linter tests are failing, please fix. |
- Replace fmt.Sprintf with string concatenation (perfsprint) - Preallocate rules slice (prealloc) - Remove unused ctx param from detectInferencePoolAPIGroup (unparam) - Change verifyInferencePools to not return always-nil error (unparam) - Remove unused portStr method and strconv import (unused) Made-with: Cursor
- Use strconv.Itoa instead of fmt.Sprintf for int conversion (perfsprint) - Use string concatenation for svc/ prefix (perfsprint) - Remove trailing blank line in portforward.go (gofmt) Made-with: Cursor
@kahilam, with the new changes, could you please share the output of the command above? |
|
Re-ran the full test with the latest lint fix commits ( Results:
|
|
@asm582 FYI. Full benchmark output (click to expand) |
|
/lgtm |
|
/ok-to-test |
|
🚀 Kind E2E (full) triggered by |
|
🚀 OpenShift E2E — approve and run ( |
|
The VA=1 issue--> The WVA controller is stuck in a transition loop: HPA scales to 2, VA still desires 1, sees desired(1)!=current(2) as "in transition", blocks its own scale-up decision, and keeps trying to scale down to 1. Also seeing pod/pod_name label mismatch for dispatch rate metrics on all pods. This is an HPA/VA conflict in the test setup, not related to the Go conversion. |
|
Thanks for this PR, I do see scale up and scale down: |
|
/lgtm |
Remove the modelArtifacts.labels guard that was added as a workaround for older chart versions. This change is out of scope for the bash-to-Go conversion PR. Made-with: Cursor
|
Re-tested after reverting Results:
|
|
@asm582 Reverted deploy/lib/infra_llmd.sh to match main. Re-tested the full deploy + benchmark cycle — all passing. Results posted above. |
|
/lgtm |
|
@lionelvillard , would you please approve this PR? |
|
/ok-to-test |
|
🚀 Kind E2E (full) triggered by |
|
🚀 OpenShift E2E — approve and run ( |
|
/ok-to-test |
|
🚀 OpenShift E2E — approve and run ( |
|
🚀 Kind E2E (full) triggered by |
Summary
AI-assisted using Cursor IDE.
Converts
deploy/install-multi-model.sh(393-line bash script) into a Go tool atdeploy/multimodel/, addressing review feedback from #1014 (comment) to move away from bash deployment scripts for readability, performance, and enabling concurrent test execution within a single GH workflow run.Key improvements over the bash version:
kubectl port-forwardfrom the Go process, eliminating the in-clustercurlimages/curl:latestJobset -eFiles changed:
deploy/multimodel/main.go,deployer.go,portforward.godeploy/install-multi-model.shMakefile— updated targets to usego run ./deploy/multimodel, addedINSTALL_GATEWAY_CTRLPLANEpassthroughdeploy/lib/infra_llmd.sh— guardedmodelArtifacts.labelsyq call for chart compatibilityFunctional parity:
The Go tool accepts the same environment variables (
MODELS,LLMD_NS,ENVIRONMENT,DECODE_REPLICAS, etc.) and CLI flags (--undeploy) as the bash script. It still delegates todeploy/install.shfor per-model Helm deployments.Tested on OpenShift cluster:
Qwen/Qwen3-0.6B,unsloth/Meta-Llama-3.1-8B)SUCCESS! -- 1 Passed | 0 Failed)