Deploy OpenShell on an OpenShift cluster with a self-hosted vLLM model and GitHub authentication via Dex.
graph LR
User["User (browser)"] -- GitHub OAuth --> Dex["Dex (OIDC bridge)"]
Dex -- signed JWT --> Gateway["OpenShell Gateway"]
CLI["openshell CLI"] -- gRPC --> Gateway
Codex["Codex sandbox"] -- inference.local --> Proxy["Gateway Proxy"]
Proxy --> vLLM["vLLM (Gemma4-31B)"]
- vLLM serves a model (Gemma4-31B) with an OpenAI-compatible API
- OpenShell gateway manages sandboxes, inference routing, and authentication
- Dex bridges GitHub OAuth into standards-compliant OIDC (OpenShift and GitHub OAuth do not fully implement OIDC)
- Codex runs inside a sandbox and uses the self-hosted model through OpenShell's inference proxy
- OpenShift 4.x cluster with GPU nodes (NVIDIA GPU Operator installed)
ocCLI authenticated to your clusterhelmCLI installedopenshellCLI installed (brew install openshellor see OpenShell docs)- A Hugging Face token with access to the model
- A GitHub OAuth App (created in step 6)
Replace <CLUSTER_DOMAIN> with your OpenShift cluster domain throughout this guide (e.g., ocp.example.com — the apps domain becomes apps.ocp.example.com).
Create the Hugging Face token secret and apply the vLLM manifest:
oc apply -f manifests/vllm/deployment.yaml
oc create secret generic vllm-hf-token \
--from-literal=token=<YOUR_HF_TOKEN> \
-n vllmWait for the model to download and start (this can take 10+ minutes for large models):
oc rollout status deployment/gemma4 -n vllm --timeout=600sVerify the model is serving:
oc exec -n vllm deploy/gemma4 -- curl -s http://localhost:8000/v1/modelsYou should see gemma4-31b in the response.
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/latest/download/manifest.yamlCreate the namespace and set up security context constraints:
oc create namespace openshell
oc adm policy add-scc-to-user privileged -z openshell-sandbox -n openshellInstall OpenShell via Helm. Authentication is disabled initially — Dex is configured in step 7:
helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart \
--version 0.0.55 \
-n openshell \
--set pkiInitJob.enabled=false \
--set server.disableTls=true \
--set server.auth.allowUnauthenticatedUsers=true \
--set podSecurityContext.fsGroup=null \
--set securityContext.runAsUser=nullWait for the gateway to become ready:
oc rollout status statefulset/openshell -n openshellPort-forward the gateway to your local machine:
oc port-forward svc/openshell 8080:8080 -n openshell &Register and select the gateway:
openshell gateway add http://127.0.0.1:8080 --name openshift --local
openshell gateway select openshiftVerify connectivity (should return No providers found, not an auth error):
openshell provider listCreate a provider pointing at the vLLM service inside the cluster:
openshell provider create \
--name gemma4 \
--type openai \
--credential OPENAI_API_KEY=unused \
--config OPENAI_BASE_URL=http://gemma4-svc.vllm.svc.cluster.local:8000/v1Route inference to the vLLM model:
openshell inference set --provider gemma4 --model gemma4-31bThe gateway validates the endpoint. You should see:
Validated Endpoints:
- http://gemma4-svc.vllm.svc.cluster.local:8000/v1/chat/completions (openai_chat_completions)
Go to GitHub > Settings > Developer settings > OAuth Apps > New OAuth App:
| Field | Value |
|---|---|
| Application name | OpenShell Dex |
| Homepage URL | https://dex-openshell.apps.<CLUSTER_DOMAIN> |
| Authorization callback URL | https://dex-openshell.apps.<CLUSTER_DOMAIN>/callback |
Note the Client ID and generate a Client Secret.
oc create secret generic dex-github-client \
--from-literal=client-id=<YOUR_GITHUB_CLIENT_ID> \
--from-literal=client-secret=<YOUR_GITHUB_CLIENT_SECRET> \
-n openshellEdit manifests/dex/dex.yaml and replace <CLUSTER_DOMAIN> with your cluster domain, then apply:
oc apply -f manifests/dex/dex.yaml
oc rollout status deployment/dex -n openshellVerify Dex is serving OIDC discovery:
curl -sk https://dex-openshell.apps.<CLUSTER_DOMAIN>/.well-known/openid-configuration | python3 -m json.toolUpgrade the Helm release to point at Dex. Setting adminRole and userRole to empty strings enables authentication-only mode (any valid GitHub user is accepted):
helm upgrade openshell oci://ghcr.io/nvidia/openshell/helm-chart \
--version 0.0.55 \
-n openshell \
--reuse-values \
--set server.auth.allowUnauthenticatedUsers=false \
--set server.oidc.issuer="https://dex-openshell.apps.<CLUSTER_DOMAIN>" \
--set server.oidc.audience="openshell-cli" \
--set server.oidc.adminRole="" \
--set server.oidc.userRole=""The Helm template has a known issue where empty string values for adminRole and userRole are silently dropped, causing the gateway to fall back to defaults (openshell-admin/openshell-user) and reject all users with PermissionDenied.
After the Helm upgrade, verify the ConfigMap:
oc get configmap openshell-config -n openshell -o jsonpath='{.data.gateway\.toml}' | grep -E 'admin_role|user_role'If those lines are missing, patch the ConfigMap to add them under [openshell.gateway.oidc]:
oc get configmap openshell-config -n openshell -o yaml > /tmp/openshell-config.yamlEdit /tmp/openshell-config.yaml and add these lines inside the [openshell.gateway.oidc] section:
admin_role = ""
user_role = ""Apply and restart:
oc apply -f /tmp/openshell-config.yaml
oc delete pod openshell-0 -n openshell
oc rollout status statefulset/openshell -n openshellRemove the existing (unauthenticated) gateway registration and re-add with OIDC:
openshell gateway remove openshift
openshell gateway add http://127.0.0.1:8080 \
--name openshift \
--oidc-issuer "https://dex-openshell.apps.<CLUSTER_DOMAIN>" \
--oidc-client-id "openshell-cli"Make sure the port-forward is still running, then log in:
openshell gateway login openshiftA browser window opens > Dex login page > "Log in with GitHub" > GitHub authorization > redirected back to CLI.
On success:
✓ Authenticated as <your-github-username>
Verify access:
openshell provider list
openshell sandbox create -- echo helloCreate a Codex provider and sandbox:
openshell provider create \
--name codex \
--type codex \
--credential OPENAI_API_KEY=placeholder
openshell sandbox create --provider codex
openshell sandbox connect <SANDBOX_NAME>Inside the sandbox, copy the Codex config and launch:
mkdir -p ~/.codex
cp /path/to/codex-config.toml ~/.codex/config.toml
codex -c model_provider=vllm -m gemma4-31bThe sample config is in config/codex-config.toml. It points Codex at the inference.local endpoint, which the gateway proxy intercepts, injects credentials, and forwards to vLLM. The sandbox never sees real API keys.
To verify vLLM is processing requests, watch its logs in a separate terminal:
oc logs -f deploy/gemma4 -n vllm --tail=1To limit authentication to members of a specific GitHub org or team, add an orgs block to the GitHub connector in manifests/dex/dex.yaml:
connectors:
- type: github
id: github
name: GitHub
config:
clientID: $GITHUB_CLIENT_ID
clientSecret: $GITHUB_CLIENT_SECRET
redirectURI: https://dex-openshell.apps.<CLUSTER_DOMAIN>/callback
orgs:
- name: my-org
teams:
- my-teamRe-apply and restart Dex:
oc apply -f manifests/dex/dex.yaml
oc rollout restart deployment/dex -n openshell- vLLM version: v0.19+ is required for Responses API support (
/v1/responses), which Codex uses. - Codex version: v0.117.0+ supports custom model providers via
config.toml. - Dex version: v2.42.0+ is required for RFC 8252 loopback redirect matching (implicit localhost redirect URIs for public clients).
- GPU requirements: Gemma4-31B requires at least 2x NVIDIA GPUs with tensor parallelism. Adjust
tensor-parallel-sizeand resource requests for your hardware. - Inference routing: The
inference.localendpoint is only accessible from inside sandboxes. The gateway proxy intercepts requests, injects credentials, and forwards to vLLM. - Self-signed certificates: If your cluster uses self-signed ingress certs, see docs/troubleshooting.md.
See docs/troubleshooting.md for common errors and fixes.