A Kubernetes operator that abstracts HyperShift complexity for NVIDIA DPF (DOCA Platform Framework). It orchestrates the full lifecycle of HostedClusters for DPU (Data Processing Unit) environments, treating the hosted control plane as a "black box" for end users.
The operator maintains a 1:1:1 relationship: each DPFHCPProvisioner CR maps to exactly one DPUCluster and one HostedCluster.
- HostedCluster lifecycle management -- creates, updates, and deletes HostedClusters, NodePools, and associated secrets
- Automatic CSR approval -- approves Certificate Signing Requests from DPU worker nodes joining the hosted cluster
- BlueField OCP layer image lookup -- matches OCP release images to corresponding BlueField container images via container registry tag lookup (skipped when
machineOSURLis provided) - Kubeconfig injection -- extracts the HostedCluster kubeconfig and injects it into the DPUCluster CR
- MetalLB configuration -- deploys IPAddressPool and L2Advertisement resources for LoadBalancer service exposure
- Ignition generation -- generates BlueField-specific ignition configurations from HyperShift ignition for DPU node provisioning
- Status translation -- mirrors HostedCluster conditions to DPFHCPProvisioner status without exposing HyperShift internals
| Phase | Description |
|---|---|
Pending |
Initial state; validation of DPUCluster, secrets, and configuration in progress |
Provisioning |
HostedCluster and related resources are being created |
IgnitionGenerating |
BlueField-specific ignition configuration is being generated |
Ready |
HostedCluster is operational, kubeconfig injected, CSR auto-approval active |
Failed |
Permanent failure requiring user intervention |
Deleting |
Finalizer cleanup in progress (HostedCluster deletion, MetalLB cleanup) |
- DPFHCPProvisionerReconciler -- main controller that orchestrates the full provisioning lifecycle through modular feature handlers
- CSRApprovalReconciler -- secondary controller that watches for and auto-approves CSRs from DPU worker nodes in hosted clusters
The following operators and components must be installed on the management cluster:
| Component | Description |
|---|---|
| OpenShift | Management cluster platform |
| MCE Operator | Multi-Cluster Engine for HyperShift support |
| HyperShift Operator | Manages HostedCluster resources |
| DPF Operator | Manages DPUCluster and DPU resources |
| ODF Operator | Provides storage classes for etcd persistent volumes |
| MetalLB Operator | Provides LoadBalancer services for control plane exposure |
API Group: provisioning.dpu.hcp.io/v1alpha1
Scope: Namespaced
Short Name: dpfhcp
| Field | Type | Required | Immutable | Default | Description |
|---|---|---|---|---|---|
dpuClusterRef |
{name, namespace} |
Yes | Yes | -- | Cross-namespace reference to the target DPUCluster |
baseDomain |
string |
Yes | Yes | -- | Base domain for hosted cluster DNS (e.g., clusters.example.com) |
ocpReleaseImage |
string |
Yes | No | -- | Full pull-spec URL for the OCP release image |
sshKeySecretRef |
{name} |
Yes | Yes | -- | Secret containing SSH public key (id_rsa.pub key) |
pullSecretRef |
{name} |
Yes | Yes | -- | Secret containing container registry pull secret (.dockerconfigjson key) |
etcdStorageClass |
string |
No | Yes | -- | Storage class for etcd persistent volumes |
controlPlaneAvailabilityPolicy |
string |
No | Yes | HighlyAvailable |
SingleReplica or HighlyAvailable |
virtualIP |
string |
No | Yes | -- | Virtual IP for LoadBalancer (required when HighlyAvailable) |
nodeSelector |
map[string]string |
No | Yes | -- | Node selector for control plane pods (max 20 entries) |
flannelEnabled |
bool |
No | Yes | true |
Enable Flannel as the CNI plugin |
dpuDeploymentRef |
{name, namespace} |
Yes | Yes | -- | Cross-namespace reference to DPUDeployment for ignition generation |
machineOSURL |
string |
No | No | -- | Custom OS image URL for ignition configuration |
phase-- current lifecycle phase (see DPFHCPProvisioner Lifecycle Phases)conditions-- array of conditions tracking operator state (Ready, HostedClusterAvailable, KubeConfigInjected, CSRAutoApprovalActive, SecretsValid, BlueFieldOCPLayerImageFound, MetalLBConfigured, IgnitionConfigured, etc.)hostedClusterRef-- reference to the created HostedClusterkubeConfigSecretRef-- reference to the injected kubeconfig SecretblueFieldOCPLayerImage-- matched BlueField OCP layer image URL
apiVersion: provisioning.dpu.hcp.io/v1alpha1
kind: DPFHCPProvisioner
metadata:
name: dpfhcpprovisioner-sample
namespace: dpf-hcp-provisioner-system
spec:
dpuClusterRef:
name: dpu-cluster-sample
namespace: dpf-operator-system
baseDomain: clusters.example.com
ocpReleaseImage: quay.io/openshift-release-dev/ocp-release:4.19.0-ec.5-multi
sshKeySecretRef:
name: prod-ssh-key
pullSecretRef:
name: prod-pull-secret
etcdStorageClass: ceph-rbd-retain
dpuDeploymentRef:
name: dpu-deployment-sample
namespace: dpf-operator-system
controlPlaneAvailabilityPolicy: HighlyAvailable
virtualIP: 192.168.1.100API Group: provisioning.dpu.hcp.io/v1alpha1
Scope: Cluster
Short Name: dpfhcpconfig
A cluster-scoped singleton (must be named default) that provides operator-wide configuration.
| Field | Type | Default | Description |
|---|---|---|---|
blueFieldOCPLayerRepo |
string |
quay.io/eelgaev/rhcos-bfb |
Container registry for BlueField OCP layer images |
apiVersion: provisioning.dpu.hcp.io/v1alpha1
kind: DPFHCPProvisionerConfig
metadata:
name: default
spec:
blueFieldOCPLayerRepo: quay.io/eelgaev/rhcos-bfbhelm install dpf-hcp-provisioner ./helm/dpf-hcp-provisioner-operator \
--namespace dpf-hcp-provisioner-system \
--create-namespaceFor the full configuration reference, see docs/configuration.md.
kubectl create namespace dpf-hcp-provisioner-systemkubectl create secret generic prod-ssh-key \
--namespace dpf-hcp-provisioner-system \
--from-file=id_rsa.pub=$HOME/.ssh/id_rsa.pubkubectl create secret generic prod-pull-secret \
--namespace dpf-hcp-provisioner-system \
--from-file=.dockerconfigjson=$HOME/.docker/config.json \
--type=kubernetes.io/dockerconfigjsonkubectl apply -f config/samples/provisioning_v1alpha1_dpfhcpprovisioner.yaml# Watch the phase transitions
kubectl get dpfhcp -n dpf-hcp-provisioner-system -w
# Inspect detailed conditions
kubectl describe dpfhcp dpfhcpprovisioner-sample -n dpf-hcp-provisioner-systemOnce the phase reaches Ready, the kubeconfig is injected into the DPUCluster CR:
kubectl get secret -n <dpucluster-namespace> -l app.kubernetes.io/managed-by=dpf-hcp-provisionerCheck operator logs and CR conditions to diagnose issues:
# Operator logs
kubectl logs -n dpf-hcp-provisioner-system deployment/dpf-hcp-provisioner-operator -f
# CR conditions
kubectl describe dpfhcp <name> -n dpf-hcp-provisioner-systemFor a full guide on common issues and debugging, see docs/troubleshooting.md.
# Build the operator binary
make build
# Run unit tests
make test
# Run linter
make lint
# Build container image
make docker-build IMG=<registry/image:tag>
# Push container image
make docker-push IMG=<registry/image:tag>
# Generate CRDs and RBAC manifests
make manifests
# Generate DeepCopy methods
make generate
# Package and push Helm chart
make helm-package
make helm-pushFor developers looking to deploy, run, and test the operator in a full end-to-end environment, the openshift-dpf repository provides automation for setting up the complete DPF stack, including all prerequisites and the DPF HCP Provisioner Operator.
Copyright 2025. Licensed under the Apache License, Version 2.0.
