A tool for creating and managing GPU-ready Cloud test environments.
- Multi-OS Support: Ubuntu, Rocky Linux 9, Amazon Linux 2023 with automatic AMI resolution (guide)
- Multi-Architecture: x86_64 and ARM64 with automatic architecture inference
- Custom Templates: Run user-provided scripts at any provisioning phase (guide)
- Multi-Node Clusters: HA Kubernetes clusters with kubeadm (guide)
- Flexible Sources: Install components from packages, git, runfiles, or latest branches (guide)
- Automatic IP Detection: No manual IP configuration needed for AWS (guide)
See docs/quick-start.md for a full walkthrough.
brew tap nvidia/holodeck https://github.com/NVIDIA/holodeckmacOS (arm64 / amd64) β installs via Cask:
brew install --cask nvidia/holodeck/holodeck
holodeck --helpLinux (arm64 / amd64) β installs via Formula:
brew install --formula nvidia/holodeck/holodeck
holodeck --helpThe --cask / --formula flags are required to disambiguate: the tap
ships both an holodeck Cask (macOS-only) and an holodeck Formula
(Linux-only, declares depends_on :linux), and brew defaults to the
Formula when both exist by the same name. Pre-built binaries for both
platforms are downloaded from the GitHub Releases
page. Run
brew upgrade --cask nvidia/holodeck/holodeck (macOS) or
brew upgrade --formula nvidia/holodeck/holodeck (Linux) to update.
Why the platform split? macOS uses a Cask because brew's Formula build-sandbox triggers a
PTY.openfailure on macOS Tahoe (26.x) + brew 5.1.x + portable-ruby 4.0.x. Casks skip that sandbox path, sobrew install --caskworks cleanly. Until Apple Developer code signing and notarization are wired up, the Cask'spostflighthook removes thecom.apple.quarantinexattr so Gatekeeper doesn't block the unsigned binary.
Platform Mechanism File macOS (arm64 / amd64) Homebrew Cask Casks/holodeck.rbLinux (arm64 / amd64) Homebrew Formula Formula/holodeck.rb
make build
sudo mv ./bin/holodeck /usr/local/bin/holodeck
holodeck --help- Go 1.20+
- (For AWS) Valid AWS credentials in your environment
- (For SSH) Reachable host and valid SSH key
See docs/prerequisites.md for details.
When installing NVIDIA drivers, Holodeck requires kernel headers matching your running kernel version. If exact headers are unavailable, Holodeck will attempt to find compatible ones, though this may cause driver compilation issues.
For kernel compatibility details and troubleshooting, see Kernel Compatibility in the prerequisites documentation.
See docs/contributing/ for full details.
make buildβ Build the holodeck binarymake testβ Run all testsmake lintβ Run lintersmake cleanβ Remove build artifacts
See docs/commands/ for detailed command documentation and examples.
holodeck --helpholodeck create -f ./examples/v1alpha1_environment.yamlholodeck listholodeck delete <instance-id>holodeck cleanup vpc-12345678holodeck status <instance-id>holodeck dryrun -f ./examples/v1alpha1_environment.yamlBy default, the kubeconfig holodeck produces is configured for in-VM
use: file mode 0600 owned by the holodeck process user, and the
server URL points at the cluster's internal IP. To run kubectl from
outside the VPC (e.g., a GitHub Actions runner that provisioned the
cluster), set kubernetes.remoteAccess: true:
spec:
kubernetes:
install: true
installer: kubeadm
remoteAccess: trueWhat changes when this is true:
- The kubeconfig server URL is rewritten to
https://<PublicDnsName>:6443. - The kubeconfig file is chowned to the bind-mounted workspace owner
so the runner user (not just the action container's root) can read
it. File mode stays
0600.
What does not change:
- The security group still opens
6443only to the auto-detected caller egress IP (utils.GetIPAddress()), not0.0.0.0/0. - The embedded cluster admin cert is owner-only.
Platform: Linux/Darwin. On Windows, the chown step is a no-op.
For downstream CI repos (gpu-operator, k8s-device-plugin): set
remoteAccess: true in your holodeck.yaml and replace any
rsync + ssh + remote-run blocks in your workflow with a direct
kubectl --kubeconfig=$GITHUB_WORKSPACE/kubeconfig β¦ step.
Holodeck ships an embedded catalog of agentic skills that teach an AI coding agent how to drive the CLI correctly. List the catalog:
holodeck skill listInstall a skill into your AI agent's native format:
# Claude Code (project-local: ./.claude/skills/<name>/SKILL.md)
holodeck skill add --claude using-holodeck
# Multiple agents at once
holodeck skill add --claude --cursor --codex --gemini using-holodeck
# Or install everything for every agent, user-wide
holodeck skill add --all --all-agents --globalSupported agents: Claude Code, Cursor, Codex CLI, Gemini CLI. Skills are short markdown guides authored against the actual CLI behavior; they version with the code so updates land alongside the features they describe.
For more information, see the documentation directory.