Conversation
Collaborator
Author
|
/ok-to-test |
Contributor
|
🚀 Kind E2E (full) triggered by |
Contributor
|
🚀 OpenShift E2E — approve and run ( |
Contributor
GPU Pre-flight Check ✅ GPUs are available for e2e-openshift tests. Proceeding with deployment.
|
GPU Pre-flight Check ❌ Insufficient GPUs to run OpenShift E2E. Re-run with
|
kahilam reviewed Apr 15, 2026
lionelvillard approved these changes Apr 15, 2026
kahilam added a commit that referenced this pull request Apr 15, 2026
Addresses review feedback from #1014 to move away from bash deployment scripts for readability, type safety, and concurrent model deployment.

Key improvements:
- Models 2..N deploy concurrently via goroutines (bash was sequential)
- Connectivity verification uses kubectl port-forward from the Go process, eliminating the in-cluster curl Job and its Docker Hub image (curlimages/curl:latest)
- Kubernetes resources (Gateway, HTTPRoute) created via dynamic client instead of heredoc YAML
- Proper error handling and structured logging

The Go tool is invoked via `go run ./deploy/multimodel` from the same Makefile targets (deploy-multi-model-infra, undeploy-multi-model-infra).

Made-with: Cursor
This PR introduces a new design for installing multi-model infra via a wrapper around the current infra install scripts. This avoids changing the existing e2e infrastructure that runs several other tests. It also contains a separate make command to run multi-model tests that are in sync with existing single-model benchmark tests. Gemini was used to help with coding.
This is mostly dormant code at this point that has not been connected to CI. The following commands can help run this PR in a namespace: