This section is for in-depth guides and tutorials related to Holodeck.
- OS Selection Guide: Learn how to use the simplified OS field for automatic AMI resolution, supported operating systems, and multi-architecture support.
- Source Selection Guide: Choose the right installation source for each component - packages, git, latest branches, or runfiles.
- NVIDIA Driver Sources: Configure NVIDIA driver installation from packages, runfile installers, or building from source.
- Container Runtime Sources: Configure container runtime (containerd, Docker, CRI-O) installation from packages, git, or latest.
- CTK Installation Sources: Configure NVIDIA Container Toolkit installation from packages, git, or tracking latest branches.
- IP Detection Guide: Learn about automatic IP detection for AWS environments, including configuration, troubleshooting, and best practices.
- Multi-Node Clusters: Deploy Kubernetes clusters with multiple control-plane and worker nodes, including HA configuration.
- RPM Distribution Guide: Setup and configuration for Rocky Linux 9 and Amazon Linux 2023, including RPM-specific considerations.
- Custom Templates: Run user-provided scripts at specific provisioning phases with inline, file, or URL sources.
- GitHub Action: Use holodeck as a GitHub Action to provision GPU test environments in CI workflows.
- AICR Integration: Preview walkthrough that provisions a single-node L40S cluster with Holodeck and captures an AICR snapshot + recipe. The Slurm and Dynamo deploy paths are pending upstream changes (see the guide's "What's coming" section).
To contribute a guide, simply add a new Markdown file to this folder and update this README with a link.