Deploy Single Node OpenShift using the Agent-Based Installer with Cilium Enterprise as Day 1 CNI. The pipeline handles ISO generation, vMedia boot, installation monitoring, and post-install bootstrap (IDMS + ArgoCD).
- 03-intersight-configuration.md completed
- Server profile deployed and associated
- MAC addresses captured in `cluster-macs.yaml`
- Images mirrored to local registry (via `sync-images` in saif-sys-admin)
- DNS records configured and verified
```
┌─────────────────────────────────────────────────────────────────┐
│        OpenShift Pipeline Flow (openshift-pipeline.yaml)        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. Validate          2. Generate ISO       3. Upload & Boot    │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │ DNS, Registry│ ──► │install-config│ ──► │ File Server  │     │
│  │ Connectivity │     │ + Cilium Day1│     │ + vMedia     │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│                                                                 │
│  4. Installation      5. Post-Install      6. Day 2 Handoff     │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │ OCP installs │ ──► │IDMS Bootstrap│ ──► │ ArgoCD       │     │
│  │ Cilium CNI   │     │ ArgoCD Boot  │     │ Syncs All    │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
```
gh workflow run openshift-pipeline.yaml \
  -f cluster_name=ai-pod-1 \
  -f cni_type=Cilium
```

Pipeline stages:
- Validate - Check DNS, registry, UCS profile state
- Deploy - Generate ISO with Cilium manifests, upload, boot server, monitor installation
- Post-install - Apply bootstrap IDMS, install ArgoCD operator, deploy App-of-Apps
- Test - Verify cluster health, operator status
| Input | Options | Default | Purpose |
|---|---|---|---|
| `cluster_name` | ai-pod-1/2/3/4 | Required | Target cluster |
| `cni_type` | Cilium, OVN | Cilium | CNI type (Cilium recommended) |
| `validate` | true/false | true | Run validation stage |
| `deploy` | true/false | true | Run deployment stage |
| `post_install` | true/false | true | Run post-install (IDMS, ArgoCD) |
| `test` | true/false | true | Run tests |
| `force_deploy` | true/false | false | Bypass operational cluster safeguard |
For existing clusters needing ArgoCD re-bootstrap:
```
gh workflow run openshift-pipeline.yaml \
  -f cluster_name=ai-pod-1 \
  -f validate=false \
  -f deploy=false \
  -f post_install=true \
  -f test=true
```

The validate stage:
- Verifies DNS records (api, api-int, *.apps)
- Tests registry connectivity
- Checks UCS profile state (must be Associated)
- Validates cluster-mappings.yaml and cluster-macs.yaml
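The DNS portion of these checks can be sketched in Python. This is a hedged illustration, not the pipeline's actual implementation: the record names follow the agent-installer requirements listed above, and the wildcard probe hostname is an assumption.

```python
import socket

# Hypothetical sketch of the validate stage's DNS check.
def required_dns_records(cluster_name: str, base_domain: str) -> list[str]:
    return [
        f"api.{cluster_name}.{base_domain}",
        f"api-int.{cluster_name}.{base_domain}",
        # A wildcard record cannot be resolved directly, so probe an
        # arbitrary name under *.apps (assumed convention).
        f"x.apps.{cluster_name}.{base_domain}",
    ]

def failed_lookups(records: list[str]) -> list[str]:
    """Resolve each record; return the ones that fail."""
    failures = []
    for host in records:
        try:
            socket.gethostbyname(host)
        except socket.gaierror:
            failures.append(host)
    return failures
```

An empty result from `failed_lookups(required_dns_records("ai-pod-1", "example.com"))` would indicate DNS is ready for that cluster.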
The deploy stage:
- Render configs - `render-cluster-config.py` generates install-config.yaml and agent-config.yaml
- Generate ISO - `openshift-install agent create image` with Cilium Day 1 manifests
- Upload ISO - WebDAV PUT to file server
- Configure vMedia - Update UCS vMedia policy via isctl
- Power cycle - Boot server from ISO
- Monitor - Wait for installation complete (~60-90 min)
- Retrieve kubeconfig - SSH to node, get recovery kubeconfig
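The "Upload ISO" step is a plain WebDAV PUT, which can be sketched as below. The file-server URL is an assumption for illustration; the pipeline's real target comes from its own configuration.

```python
import urllib.request

# Hypothetical sketch of the WebDAV ISO upload (not the pipeline's code).
def build_put_request(iso_bytes: bytes, url: str) -> urllib.request.Request:
    req = urllib.request.Request(url, data=iso_bytes, method="PUT")
    req.add_header("Content-Type", "application/octet-stream")
    return req

def upload_iso(iso_path: str, url: str) -> int:
    with open(iso_path, "rb") as f:
        req = build_put_request(f.read(), url)
    with urllib.request.urlopen(req) as resp:
        # WebDAV servers typically answer 201 (created) or 204 (overwritten)
        return resp.status
```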
The post-install stage:
- Apply bootstrap IDMS - Minimal mirrors for ArgoCD installation
- Install ArgoCD - GitOps operator subscription
- Deploy App-of-Apps - Points to saif-gitops
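The bootstrap IDMS is an OpenShift `ImageDigestMirrorSet`. A minimal sketch, assuming the mirror registry shown in the sample install-config; the resource name and mirrored repository paths are illustrative, not the pipeline's actual manifest:

```yaml
apiVersion: config.openshift.io/v1
kind: ImageDigestMirrorSet
metadata:
  name: bootstrap-mirrors                          # hypothetical name
spec:
  imageDigestMirrors:
  - mirrors:
    - registry.example.com:5000/openshift-gitops   # assumed mirror path
    source: registry.redhat.io/openshift-gitops-1  # GitOps operator images
```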
The test stage:
- Verify all cluster operators healthy
- Check node Ready status
- Validate Cilium pods running
- Test API connectivity
```
# Dry run (view output)
python scripts/render-cluster-config.py ai-pod-1 --dry-run

# Render actual configs
python scripts/render-cluster-config.py ai-pod-1 --cilium
```

Generated files in openshift/data/ai-pod-1/:
- `install-config.yaml` - OpenShift installation configuration
- `agent-config.yaml` - Agent bootstrap configuration
```yaml
metadata:
  name: ai-pod-1
baseDomain: example.com
networking:
  networkType: Cilium  # Day 1 Cilium CNI
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
imageContentSources:
- mirrors:
  - registry.example.com:5000/openshift/release-images
  source: quay.io/openshift-release-dev/ocp-release
```

```
# Via workflow logs
gh run watch

# Or SSH to runner and monitor directly
ssh ubuntu@10.0.0.10
cd /data/runner-*/saif-ai-pod/saif-ai-pod/workdir-*/

# Watch bootstrap
openshift-install agent wait-for bootstrap-complete --dir=. --log-level=info

# Watch installation complete
openshift-install agent wait-for install-complete --dir=. --log-level=info
```

| Phase | Duration |
|---|---|
| ISO generation | ~1 minute (with credential stripping) |
| Bootstrap | ~30 minutes |
| Installation | ~45-60 minutes |
| Post-install | ~10 minutes |
| Total | ~90 minutes |
The test stage verifies:
- All 36 cluster operators healthy (insights may be degraded in air-gap)
- Node shows Ready status
- Cilium pods running in the `cilium` namespace
- ArgoCD deployed and syncing
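The operator-health check can be sketched as a small helper over the parsed output of `oc get clusteroperators -o json`. This is an illustrative sketch, not the pipeline's test code; the condition names (`Available`, `Degraded`) are the standard ClusterOperator status conditions.

```python
import json

# Hypothetical helper: report operators that are unavailable or degraded,
# given the parsed JSON from `oc get clusteroperators -o json`.
def unhealthy_operators(co_json: dict) -> list[str]:
    bad = []
    for item in co_json.get("items", []):
        conds = {c["type"]: c["status"]
                 for c in item.get("status", {}).get("conditions", [])}
        if conds.get("Available") != "True" or conds.get("Degraded") == "True":
            bad.append(item["metadata"]["name"])
    return bad
```

In an air-gapped cluster this would typically flag only `insights`, matching the caveat above.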
```
# Get kubeconfig (MCP tool recommended)
cluster_status_connect(cluster_name="ai-pod-1")
export KUBECONFIG=<returned_path>

# Check nodes
oc get nodes
# Expected: cluster-1.example.com Ready master

# Check cluster operators
oc get co
# Expected: All operators Available=True

# Check Cilium
oc get pods -n cilium
# Expected: cilium-xxxxx Running on each node

# Check ArgoCD
oc get applications -n openshift-gitops
# Expected: cluster-apps Synced Healthy
```

Cause: Pull secret contains quay.io credentials, causing direct registry access.
Solution: Pipeline automatically strips quay.io credentials. If running manually:
```
# Strip credentials before ISO generation
STRIPPED_PULL_SECRET=$(echo "$REDHAT_PULL_SECRET" | python3 -c '
import sys, json
ps = json.load(sys.stdin)
for registry in ["quay.io", "registry.redhat.io", "registry.connect.redhat.com"]:
    ps.get("auths", {}).pop(registry, None)
print(json.dumps(ps))
')
```

- Check agent console via KVM in Intersight
- Verify DNS resolves from server network
- Verify registry accessible from server
- Check network configuration in agent-config.yaml
- Check cluster operator status
- Review `.openshift_install.log`
- SSH to server and check journalctl: `ssh core@10.0.1.101 sudo journalctl -f`
CRITICAL: Server CIMC may cache ISO content. If regenerating ISO:
- Full power cycle server (not just reboot)
- Or update vMedia policy to different filename
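The filename workaround can be sketched as below: embedding a timestamp gives each regenerated ISO a vMedia filename the CIMC has never seen, so it cannot serve a stale cached copy. The naming convention is an assumption for illustration.

```python
import time

# Hypothetical cache-busting helper (naming convention is assumed).
def unique_iso_name(cluster_name: str) -> str:
    return f"{cluster_name}-agent-{int(time.time())}.iso"
```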
- Check ArgoCD pods: `oc get pods -n openshift-gitops`
- Verify GitHub credentials secret exists
- Check Application status: `oc describe application cluster-apps -n openshift-gitops`
After successful deployment:
- ArgoCD syncs automatically - All Day 2 components deploy via GitOps
- GPU Operator - Installs driver, enables nvidia.com/gpu resources
- Tetragon - Deploys TracingPolicies for security observability
- Hubble Timescape - Flow storage and visualization
- Splunk integration - OTEL collector for metrics
No manual Day 2 steps required. ArgoCD manages everything.
```
# Force redeploy (bypasses operational cluster check)
gh workflow run openshift-pipeline.yaml \
  -f cluster_name=ai-pod-1 \
  -f cni_type=Cilium \
  -f force_deploy=true
```

To undeploy a cluster:

```
gh workflow run openshift-undeploy.yaml \
  -f cluster_name=ai-pod-1 \
  -f confirm_destroy=ai-pod-1 \
  -f clean_kubeconfig=true
```

After deployment completes:
- Verify ArgoCD UI accessible at https://openshift-gitops-server-openshift-gitops.apps.cluster-1.example.com
- Monitor Day 2 component deployment in ArgoCD
- Wait for GPU Operator ClusterPolicy to reach "ready" state (~15 min)
- Verify Hubble Timescape collecting flows
For Day 2 operations, see saif-gitops.