Bundler Development Guide

Learn how to add new components to AICR.

Overview

The bundler system converts RecipeInput objects into deployment artifacts. Artifacts include Helm values files, Kubernetes manifests, and optional custom manifests. Deployment documentation (READMEs) is generated at the deployer level, not by individual component bundlers.

Architecture:

Declarative Component Registry: Component configuration is defined in recipes/registry.yaml
No separate Go packages: Adding a new component only requires a registry entry and values files
DefaultBundler: The pkg/bundler package generates Helm per-component bundles from recipes
Recipe-driven: Components are selected based on recipe's componentRefs
Value overrides: CLI --set flag allows runtime customization via ApplyMapOverrides()
Node scheduling: Registry defines paths for injecting node selectors and tolerations
Structured errors: Uses pkg/errors for error codes and wrapping

Quick Start

Adding a New Component (Declarative Approach)

Adding a new component requires no Go code. Simply add an entry to the component registry:

Step 1: Add to recipes/registry.yaml

components:
  # ... existing components ...

  - name: my-operator
    displayName: My Operator
    valueOverrideKeys:
      - myoperator
    helm:
      defaultRepository: https://charts.example.com
      defaultChart: example/my-operator
      defaultVersion: v1.0.0
    nodeScheduling:
      system:
        nodeSelectorPaths:
          - operator.nodeSelector
        tolerationPaths:
          - operator.tolerations

Step 2: Add component values file

Create recipes/components/my-operator/values.yaml:

# My Operator Helm values
operator:
  replicas: 1
  image:
    repository: example/my-operator
    tag: v1.0.0

Step 3: Reference in recipe

Add the component to a recipe overlay in recipes/overlays/:

componentRefs:
  - name: my-operator
    type: Helm
    version: v1.0.0
    source: https://charts.example.com
    valuesFile: components/my-operator/values.yaml

That's it! The bundler system automatically:

Loads component configuration from the registry
Extracts values from the recipe's valuesFile
Applies user value overrides from CLI --set flags
Applies node selectors and tolerations to configured paths
Generates the per-component bundle with the component's values and manifests

Optional: Custom Manifests

For components that need additional Kubernetes manifests (beyond the Helm chart), add them to recipes/components/<name>/manifests/:

Step 1: Create manifest file

Create recipes/components/gpu-operator/manifests/dcgm-exporter.yaml:

# DCGM Exporter ConfigMap
{{- $gpuOp := index .Values "gpu-operator" }}
{{- if and $gpuOp $gpuOp.dcgmExporter $gpuOp.dcgmExporter.config $gpuOp.dcgmExporter.config.create }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ $gpuOp.dcgmExporter.config.name | default "dcgm-exporter" }}
  namespace: {{ .Release.Namespace }}
data:
  dcgm-metrics.csv: |
    # Metrics configuration
{{- end }}

Step 2: Reference in recipe

Add the manifest to the component's manifestFiles in the recipe:

componentRefs:
  - name: gpu-operator
    type: Helm
    version: v25.3.3
    manifestFiles:
      - components/gpu-operator/manifests/dcgm-exporter.yaml

The bundler automatically includes manifest files in the component's manifests/ directory.

Registry Configuration Reference

The component registry (recipes/registry.yaml) supports these fields:

Helm Component Configuration:

- name: component-name              # Required: Component identifier
  displayName: Component Name       # Required: Human-readable name
  valueOverrideKeys:               # Optional: Alternative --set prefixes
    - componentname
  helm:
    defaultRepository: https://...  # Optional: Default Helm repo URL
    defaultChart: repo/chart        # Optional: Default chart name
    defaultVersion: v1.0.0          # Optional: Default chart version
  nodeScheduling:
    system:                        # For system/control-plane components
      nodeSelectorPaths:
        - operator.nodeSelector
      tolerationPaths:
        - operator.tolerations
    accelerated:                   # For GPU workload components
      nodeSelectorPaths:
        - daemonsets.nodeSelector
      tolerationPaths:
        - daemonsets.tolerations

Kustomize Component Configuration:

- name: my-kustomize-app            # Required: Component identifier
  displayName: My Kustomize App     # Required: Human-readable name
  valueOverrideKeys:               # Optional: Alternative --set prefixes
    - mykustomize
  kustomize:
    defaultSource: https://github.com/example/my-app  # Required: Git repo or OCI reference
    defaultPath: deploy/production  # Optional: Path to kustomization
    defaultTag: v1.0.0              # Optional: Git tag, branch, or commit

Note: A component must have either helm OR kustomize configuration, not both. The system will detect the component type based on which configuration is present.

Note:

Values are written directly to values.yaml, not via templates
Deployment documentation (README) is generated at the deployer level (helm, argocd)
The pkg/component package provides helper utilities if custom bundler logic is needed

Best Practices

Adding Components

Prefer declarative configuration: Add entries to registry.yaml rather than writing Go code
Use consistent naming: component name should match the Helm chart name (e.g., gpu-operator)
Define valueOverrideKeys for user-friendly --set prefixes (e.g., gpuoperator allows --set gpuoperator:key=value)
Configure nodeScheduling paths only for components that need workload placement
Create values files under recipes/components/<name>/ for reusable configurations

Values Files

Keep base values (values.yaml) minimal and widely applicable
Create overlay values (values-<context>.yaml) for specific scenarios
Document non-obvious settings with comments
Use consistent formatting (2-space indent for YAML)
Override release name prefix: Use fullnameOverride to avoid the aicr-stack- prefix in resource names. This makes resource names cleaner and more predictable. For example, in kube-prometheus-stack/values.yaml:
```
# Override release name prefix to avoid aicr-stack- prefix in resource names
fullnameOverride: kube-prometheus
```
Without this override, resources would be named aicr-stack-kube-prometheus-* instead of kube-prometheus-*.

Custom Manifests

Only add custom manifests when the Helm chart doesn't provide needed functionality
Use Helm template syntax (not Go templates) for manifest files
Reference values via {{ index .Values "component-name" }}
Make manifests conditional with {{- if }} blocks

Testing

Run make test to validate all recipe data
Test recipe generation: aicr recipe --service eks --accelerator gb200
Test bundle generation: aicr bundle -r recipe.yaml -o ./test-bundle
Verify generated values.yaml contains expected settings

Testing in a Local Kind Cluster

Step 0: Create a local kind cluster

Create a local kind cluster for end-to-end testing.

make dev-env

This creates a kind cluster with two nodes and starts Tilt.

Step 1: Build the aicr binary

Build the CLI with embedded recipe data and install it:

make build && cp dist/aicr_darwin_all/aicr /usr/local/bin/

This compiles the Go code and embeds all files from recipes/ into the binary. The binary is copied to /usr/local/bin/ for global access.

Step 2: Generate the recipe

Generate a recipe optimized for Kind clusters:

aicr recipe --service kind -o recipe.yaml

This creates a recipe.yaml file with:

Components configured for local development (reduced resources, emptyDir storage)
GPU operator with driver installation disabled (uses host drivers via passthrough)
cert-manager with extended startupapicheck timeout
nvsentinel with network policy disabled

Step 3: Generate the Helm bundle

Convert the recipe into a Helm per-component bundle:

aicr bundle --recipe recipe.yaml --output bundle

This generates a bundle/ directory containing:

deploy.sh - One-command deployment script
README.md - Deployment guide with ordered steps
recipe.yaml - Copy of input recipe
<component>/values.yaml - Component-specific Helm values
<component>/README.md - Per-component install/upgrade/uninstall
<component>/manifests/ - Additional manifests (if any)

Step 4: Deploy the bundle

Run the deployment script:

cd bundle
chmod +x deploy.sh && ./deploy.sh

Step 6: Verify the deployment

Check that all pods are running:

kubectl get pods -n aicr-stack

All pods should show Running or Completed status. Common issues:

Pending pods: Check for resource constraints with kubectl describe pod <name> -n aicr-stack
CrashLoopBackOff: Check logs with kubectl logs <pod-name> -n aicr-stack
ImagePullBackOff: Verify network connectivity and image registry access

Cleanup:

To remove the deployment:

helm uninstall aicr-stack -n aicr-stack
kubectl delete namespace aicr-stack

Note: Some cluster-scoped resources (CRDs, ClusterRoles, Webhooks) may need manual cleanup:

# Delete leftover webhooks
kubectl delete mutatingwebhookconfiguration,validatingwebhookconfiguration -l app.kubernetes.io/instance=aicr-stack

# Delete leftover cluster roles
kubectl delete clusterrole,clusterrolebinding -l app.kubernetes.io/instance=aicr-stack

Documentation

Update recipes/README.md when adding new components
Document component-specific settings in values file comments
Add examples to examples/ directory for common use cases

Common Patterns

Component Registry Structure

Components are configured in recipes/registry.yaml. Here's an example entry:

- name: gpu-operator
  displayName: GPU Operator
  valueOverrideKeys:
    - gpuoperator
  helm:
    defaultRepository: https://helm.ngc.nvidia.com/nvidia
    defaultChart: nvidia/gpu-operator
    defaultVersion: v25.3.3
  nodeScheduling:
    system:
      nodeSelectorPaths:
        - operator.nodeSelector
        - node-feature-discovery.gc.nodeSelector
        - node-feature-discovery.master.nodeSelector
      tolerationPaths:
        - operator.tolerations
        - node-feature-discovery.gc.tolerations
        - node-feature-discovery.master.tolerations
    accelerated:
      nodeSelectorPaths:
        - daemonsets.nodeSelector
        - node-feature-discovery.worker.nodeSelector
      tolerationPaths:
        - daemonsets.tolerations
        - node-feature-discovery.worker.tolerations

Node Scheduling Configuration

The bundle command supports --system-node-selector, --system-node-toleration, --accelerated-node-selector, --accelerated-node-toleration, and --nodes flags.

How it works:

Paths are defined in registry.yaml under nodeScheduling (e.g. nodeSelectorPaths, tolerationPaths, nodeCountPaths)
The bundler automatically applies CLI flag values to those paths
Values are written to the component's values.yaml in its per-component bundle directory

The --nodes flag (bundle-time only) sets the estimated GPU node count. Components that declare nodeCountPaths in the registry receive this value at those paths in their generated Helm values.

Example CLI usage:

aicr bundle -r recipe.yaml \
  --system-node-selector nodeGroup=system-pool \
  --accelerated-node-selector nvidia.com/gpu.present=true \
  -o ./bundles

Value Overrides

Override component values at bundle generation time:

# Override GPU Operator driver version
aicr bundle -r recipe.yaml --set gpuoperator:driver.version=580.82.07 -o ./bundles

# Multiple overrides
aicr bundle -r recipe.yaml \
  --set gpuoperator:driver.version=580.82.07 \
  --set gpuoperator:gds.enabled=true \
  -o ./bundles

The prefix before : matches the component's valueOverrideKeys in the registry.

Deployer Integration

After bundlers generate deployment artifacts, deployers transform them into deployment-specific formats. The deployer framework is separate from bundlers but works with their output.

How Bundlers and Deployers Work Together

flowchart LR
    R[RecipeResult] --> B[Bundlers]
    B --> A[Artifacts]
    A --> D[Deployers]
    D --> O[Deployment Output]

    subgraph "Bundler Output"
        A1[values.yaml]
        A2[manifests/]
        A3[checksums.txt]
    end

    subgraph "Deployer Output"
        O1[ArgoCD Applications]
        O2[Helm Charts]
        O3[README.md]
    end

Deployment Order

Deployers respect the deploymentOrder field from the recipe to ensure components are deployed in the correct sequence:

Deployer	Ordering Mechanism
`helm`	Components listed in order in README
`argocd`	`sync-wave` annotations (0, 1, 2...)

Example Recipe with Deployment Order:

componentRefs:
  - name: cert-manager
    version: v1.17.2
  - name: gpu-operator
    version: v25.3.3
  - name: network-operator
    version: v25.4.0
deploymentOrder:
  - cert-manager
  - gpu-operator
  - network-operator

Bundler Output for Deployers

When the --deployer flag is set, bundlers generate standard artifacts that deployers then transform:

For Helm (--deployer helm, default):

Generates per-component bundle directories with individual values.yaml
Creates component-specific values with installation scripts
Includes manifests and deployment instructions per component

For ArgoCD (--deployer argocd):

Bundler generates values.yaml and manifests/
Deployer creates <component>/argocd/application.yaml with sync-wave annotations
Deployer creates app-of-apps.yaml at bundle root
Applications use multi-source to reference values.yaml and manifests from GitOps repo

Using Deployers with Bundlers

The deployer is specified at bundle generation time:

# Default: Helm per-component bundle
aicr bundle -r recipe.yaml -o ./bundles

# Generate bundles with ArgoCD deployer (use --repo to set Git repository URL)
aicr bundle -r recipe.yaml -o ./bundles --deployer argocd \
  --repo https://github.com/my-org/my-gitops-repo.git

See CLI Architecture for detailed deployer documentation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bundler Development Guide

Overview

Quick Start

Adding a New Component (Declarative Approach)

Optional: Custom Manifests

Registry Configuration Reference

Best Practices

Adding Components

Values Files

Custom Manifests

Testing

Testing in a Local Kind Cluster

Documentation

Common Patterns

Component Registry Structure

Node Scheduling Configuration

Value Overrides

Deployer Integration

How Bundlers and Deployers Work Together

Deployment Order

Bundler Output for Deployers

Using Deployers with Bundlers

See Also

FilesExpand file tree

component.md

Latest commit

History

component.md

File metadata and controls

Bundler Development Guide

Overview

Quick Start

Adding a New Component (Declarative Approach)

Optional: Custom Manifests

Registry Configuration Reference

Best Practices

Adding Components

Values Files

Custom Manifests

Testing

Testing in a Local Kind Cluster

Documentation

Common Patterns

Component Registry Structure

Node Scheduling Configuration

Value Overrides

Deployer Integration

How Bundlers and Deployers Work Together

Deployment Order

Bundler Output for Deployers

Using Deployers with Bundlers

See Also