| 1 | +# OpenDataHub Installation and Configuration |
| 2 | + |
| 3 | +This directory contains the configuration files for installing OpenDataHub (ODH) with KServe support for the MaaS platform. |
| 4 | + |
| 5 | +## Key Features |
| 6 | + |
| 7 | +- **RawDeployment Mode**: Direct pod deployments without Knative/Serverless overhead |
| 8 | +- **NVIDIA NIM Support**: GPU-accelerated inference with NVIDIA Inference Microservices |
| 9 | +- **Headless Services**: Direct pod-to-pod communication for low latency |
| 10 | +- **OpenShift Integration**: Uses OpenShift's default ingress and certificates |
| 11 | + |
| 12 | +## Installation Methods |
| 13 | + |
| 14 | +### Method 1: Automated Installation (Recommended) |
| 15 | + |
| 16 | +Use the provided installation script, which handles all steps in the correct order: |
| 17 | + |
| 18 | +```bash |
| 19 | +# From the project root |
| 20 | +./deployment/scripts/installers/install-odh.sh |
| 21 | +``` |
| 22 | + |
| 23 | +### Method 2: Manual Installation |
| 24 | + |
| 25 | +1. **Install ODH Operator** from OperatorHub: |
| 26 | + ```bash |
| 27 | + # Via OpenShift Console: |
| 28 | + # 1. Navigate to Operators → OperatorHub |
| 29 | + # 2. Search for "OpenDataHub" |
| 30 | + # 3. Install with default settings |
| 31 | + |
| 32 | + # Or via CLI: |
| 33 | + oc create -f - <<EOF |
| 34 | + apiVersion: operators.coreos.com/v1alpha1 |
| 35 | + kind: Subscription |
| 36 | + metadata: |
| 37 | + name: opendatahub-operator |
| 38 | + namespace: openshift-operators |
| 39 | + spec: |
| 40 | + channel: fast |
| 41 | + name: opendatahub-operator |
| 42 | + source: community-operators |
| 43 | + sourceNamespace: openshift-marketplace |
| 44 | + EOF |
| 45 | + ``` |
| 46 | + |
| 47 | +2. **Create namespace**: |
| 48 | + ```bash |
| 49 | + kubectl create namespace opendatahub |
| 50 | + ``` |
| 51 | + |
| 52 | +3. **Wait for CRDs to be registered**: |
| 53 | + ```bash |
| 54 | + # Wait for the operator to create the CRDs |
| 55 | + kubectl wait --for condition=established --timeout=300s \ |
| 56 | + crd/dscinitializations.dscinitialization.opendatahub.io \ |
| 57 | + crd/datascienceclusters.datasciencecluster.opendatahub.io |
| 58 | + ``` |
| 59 | + |
| 60 | +4. **Apply the configuration**: |
| 61 | + ```bash |
| 62 | + # IMPORTANT: DSCInitialization MUST be created before DataScienceCluster |
| 63 | + kubectl apply -f dscinitialisation.yaml |
| 64 | + |
| 65 | + # Wait for DSCInitialization to be ready |
| 66 | + kubectl wait --for=jsonpath='{.status.phase}'=Ready \ |
| 67 | + dscinitializations.dscinitialization.opendatahub.io/default-dsci \ |
| 68 | + -n opendatahub --timeout=300s |
| 69 | + |
| 70 | + # Now create the DataScienceCluster |
| 71 | + kubectl apply -f datasciencecluster.yaml |
| 72 | + ``` |
| 73 | + |
| 74 | + Or use kustomize: |
| 75 | + ```bash |
| 76 | + kubectl apply -k deployment/components/odh/ |
| 77 | + ``` |
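
The manual steps above can be collapsed into one shell function. This is a minimal sketch, not the contents of `install-odh.sh`: the function name `apply_odh_config` and its directory argument are hypothetical, and it assumes a `kubectl` version that supports `--for=jsonpath` (1.23 or later). The commands themselves are the ones shown in steps 3 and 4.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the project's install-odh.sh): applies the ODH
# configuration in the order required by the manual steps.
apply_odh_config() {
  local dir="${1:-.}"   # directory holding the two manifests (assumption)

  # 1. Wait for the operator to register both CRDs.
  kubectl wait --for condition=established --timeout=300s \
    crd/dscinitializations.dscinitialization.opendatahub.io \
    crd/datascienceclusters.datasciencecluster.opendatahub.io || return 1

  # 2. DSCInitialization must exist and be Ready before the DataScienceCluster.
  kubectl apply -f "${dir}/dscinitialisation.yaml" || return 1
  kubectl wait --for=jsonpath='{.status.phase}'=Ready \
    dscinitializations.dscinitialization.opendatahub.io/default-dsci \
    -n opendatahub --timeout=300s || return 1

  # 3. Only now create the DataScienceCluster.
  kubectl apply -f "${dir}/datasciencecluster.yaml"
}
```

Calling `apply_odh_config deployment/components/odh` would stop at the first failing step, so the DataScienceCluster is never applied against a DSCInitialization that is not Ready.
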
| 78 | + |
| 79 | +## Troubleshooting |
| 80 | + |
| 81 | +### Error: "dscinitializations.dscinitialization.opendatahub.io not found" |
| 82 | + |
| 83 | +This is the most common error when creating a DataScienceCluster. It occurs when any of the following is true: |
| 84 | +1. The ODH operator is not installed |
| 85 | +2. The DSCInitialization resource hasn't been created yet |
| 86 | +3. The CRDs haven't been registered yet |
| 87 | +
|
| 88 | +**Solution**: Run the fix script: |
| 89 | +```bash |
| 90 | +./deployment/scripts/installers/fix-odh-dsci.sh |
| 91 | +``` |
| 92 | + |
| 93 | +This script will: |
| 94 | +- Check if the ODH operator is installed |
| 95 | +- Wait for CRDs to be registered |
| 96 | +- Create the DSCInitialization if missing |
| 97 | +- Provide next steps for creating the DataScienceCluster |
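
The checks listed above could look roughly like the following. This is a hedged sketch of the described flow, not the actual `fix-odh-dsci.sh`: the function name `ensure_dsci` and the default manifest path are assumptions, and only `kubectl` commands already used in this README appear in it.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the listed checks; NOT the actual fix-odh-dsci.sh.
ensure_dsci() {
  local manifest="${1:-dscinitialisation.yaml}"   # path is an assumption

  # Check that the ODH operator is installed (a CSV exists for it).
  if ! kubectl get csv -n openshift-operators 2>/dev/null | grep -q opendatahub; then
    echo "ODH operator not found in openshift-operators" >&2
    return 1
  fi

  # Wait for the DSCInitialization CRD to be registered.
  kubectl wait --for condition=established --timeout=300s \
    crd/dscinitializations.dscinitialization.opendatahub.io || return 1

  # Create the DSCInitialization only if none exists yet.
  if [ -z "$(kubectl get dscinitializations -o name 2>/dev/null)" ]; then
    kubectl apply -f "$manifest" || return 1
  fi

  # Point the user at the next step.
  echo "Next: kubectl apply -f datasciencecluster.yaml"
}
```

The function returns non-zero as soon as the operator or the CRD is missing, which maps onto the first and third causes of the error described above.
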
| 98 | + |
| 99 | +### Manual Troubleshooting Steps |
| 100 | + |
| 101 | +1. **Check operator status**: |
| 102 | + ```bash |
| 103 | + kubectl get csv -n openshift-operators | grep opendatahub |
| 104 | + kubectl logs -n openshift-operators deployment/opendatahub-operator-controller-manager |
| 105 | + ``` |
| 106 | + |
| 107 | +2. **Check CRDs**: |
| 108 | + ```bash |
| 109 | + kubectl get crd | grep opendatahub |
| 110 | + ``` |
| 111 | + |
| 112 | +3. **Check existing resources**: |
| 113 | + ```bash |
| 114 | + kubectl get dscinitializations -A |
| 115 | + kubectl get datasciencecluster -A |
| 116 | + ``` |
| 117 | + |
| 118 | +4. **Check pod status**: |
| 119 | + ```bash |
| 120 | + kubectl get pods -n opendatahub |
| 121 | + kubectl get pods -n kserve |
| 122 | + ``` |
| 123 | + |
| 124 | +## Configuration Details |
| 125 | + |
| 126 | +### DSCInitialization |
| 127 | +- Configures the foundational settings for ODH |
| 128 | +- Sets up Service Mesh integration |
| 129 | +- Configures monitoring and trusted CA bundles |
| 130 | +- **MUST be created before DataScienceCluster** |
| 131 | + |
| 132 | +### DataScienceCluster |
| 133 | +- Deploys the actual ODH components |
| 134 | +- Configured for KServe with: |
| 135 | + - **RawDeployment mode**: No Knative/Serverless overhead |
| 136 | + - **NIM support**: For NVIDIA GPU inference |
| 137 | + - **Headless services**: For direct pod communication |
| 138 | + - **OpenShift ingress**: Native OpenShift routing |
| 139 | + |
| 140 | +### Components Status |
| 141 | +- ✅ **Enabled**: Dashboard, Workbenches, KServe (with NIM) |
| 142 | +- ❌ **Disabled**: ModelMesh, Pipelines, Ray, Kueue, Model Registry, TrustyAI, Training Operator |
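
To confirm which components are actually enabled on a cluster, the requested state can be read back from the DataScienceCluster resource. The sketch below assumes the field path `spec.components.<name>.managementState` and component keys such as `kserve` or `ray`; neither is shown in this README, so check them against the applied `datasciencecluster.yaml`. The function name `component_state` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the managementState requested for one component
# of the first DataScienceCluster. The jsonpath is an assumption about the
# resource layout, not taken from this repository.
component_state() {
  local component="$1"
  kubectl get datasciencecluster \
    -o jsonpath="{.items[0].spec.components.${component}.managementState}"
}
```

For example, `component_state kserve` would be expected to print the state configured for KServe, and `component_state ray` the state configured for Ray.
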
| 143 | + |
| 144 | +## Verification |
| 145 | + |
| 146 | +After installation, verify the deployment: |
| 147 | + |
| 148 | +```bash |
| 149 | +# Check DSCInitialization status |
| 150 | +kubectl get dscinitializations -n opendatahub |
| 151 | + |
| 152 | +# Check DataScienceCluster status |
| 153 | +kubectl get datasciencecluster -n opendatahub |
| 154 | + |
| 155 | +# Check KServe components |
| 156 | +kubectl get pods -n kserve |
| 157 | + |
| 158 | +# Check if InferenceService CRD is available |
| 159 | +kubectl get crd inferenceservices.serving.kserve.io |
| 160 | +``` |
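
The verification commands above can be wrapped so that a single exit code reports whether every query succeeded. This is a minimal sketch; the function name `verify_odh` is hypothetical and the checks are exactly the four commands shown.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the four verification commands.
# Note: a zero exit code only means the query succeeded; it does not prove
# that the returned resources are Ready.
verify_odh() {
  local failed=0 check
  # Each entry is one kubectl argument list with space-separated fields.
  for check in \
    "get dscinitializations -n opendatahub" \
    "get datasciencecluster -n opendatahub" \
    "get pods -n kserve" \
    "get crd inferenceservices.serving.kserve.io"
  do
    # Word splitting of $check is intentional here.
    if kubectl $check >/dev/null 2>&1; then
      echo "ok:   kubectl $check"
    else
      echo "FAIL: kubectl $check"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `verify_odh` prints one line per check and returns non-zero if any command fails, which makes it usable as a gate in a CI job or a post-install hook.
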
| 161 | + |
| 162 | +## Integration with MaaS |
| 163 | + |
| 164 | +Once ODH is installed with KServe, you can: |
| 165 | +1. Deploy models using KServe InferenceService |
| 166 | +2. Use the MaaS API for model management |
| 167 | +3. Apply rate limiting and authentication policies |
| 168 | +4. Monitor model performance through the ODH dashboard |
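
As an illustration of the first item, the sketch below prints a minimal KServe `InferenceService` manifest that requests RawDeployment mode through the `serving.kserve.io/deploymentMode` annotation. The helper name `render_isvc` and all three arguments (service name, model format, storage URI) are placeholders, not values from this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: renders a minimal InferenceService manifest.
# Name, model format, and storage URI are caller-supplied placeholders.
render_isvc() {
  local name="$1" format="$2" storage_uri="$3"
  cat <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: ${name}
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    model:
      modelFormat:
        name: ${format}
      storageUri: ${storage_uri}
EOF
}
```

The output can be piped to `kubectl apply -f -` in the target namespace, or written to a file and reviewed first.
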
| 169 | + |
| 170 | +## Additional Resources |
| 171 | + |
| 172 | +- [OpenDataHub Documentation](https://opendatahub.io/docs/) |
| 173 | +- [KServe Documentation](https://kserve.github.io/website/) |
| 174 | +- [NVIDIA NIM Documentation](https://docs.nvidia.com/nim/) |