Commit f820d4b

Trying to move VLLMInferenceService to deployment/; not everything works, but the simulator should work
1 parent b6d21e5 · commit f820d4b

27 files changed: +1203 −420 lines

.gitignore

Lines changed: 1 addition & 2 deletions

@@ -37,5 +37,4 @@ pip-delete-this-directory.txt
 htmlcov/
 apps/frontend/.env.local
 apps/backend/.env
-CLAUDE.md
-temp/
+CLAUDE.md

deployment/README.md

Lines changed: 24 additions & 5 deletions

@@ -31,7 +31,9 @@ deployment/
 │   └── kubernetes/          # Standard Kubernetes deployment (experimental)
 ├── samples/                 # Example model deployments
 │   └── models/
-│       ├── simulator/       # CPU-based test model
+│       ├── rbac/            # Shared RBAC for LLMInferenceService
+│       ├── simulator/       # CPU-based test model (llm-d-inference-sim)
+│       ├── facebook-opt-125m-cpu/  # CPU-based OPT-125M model
 │       └── qwen3/           # GPU-based Qwen3 model
 └── scripts/                 # Installation utilities
 ```

@@ -103,9 +105,17 @@ kustomize build deployment/overlays/kubernetes | envsubst | kubectl apply -f -

 ### Step 4: Deploy Sample Models (Optional)

+> [!NOTE]
+> These models use KServe's `LLMInferenceService` custom resource, which requires ODH/RHOAI with KServe enabled.
+
 #### Simulator Model (CPU)
 ```bash
-kustomize build deployment/samples/models/simulator | kubectl apply -f -
+kubectl apply -k deployment/samples/models/simulator/
+```
+
+#### Facebook OPT-125M Model (CPU)
+```bash
+kubectl apply -k deployment/samples/models/facebook-opt-125m-cpu/
 ```

 #### Qwen3 Model (GPU Required)

@@ -114,7 +124,16 @@ kustomize build deployment/samples/models/simulator | kubectl apply -f -
 > This model requires GPU nodes with `nvidia.com/gpu` resources available in your cluster.

 ```bash
-kustomize build deployment/samples/models/qwen3 | kubectl apply -f -
+kubectl apply -k deployment/samples/models/qwen3/
+```
+
+#### Verify Model Deployment
+```bash
+# Check LLMInferenceService status
+kubectl get llminferenceservices -n llm
+
+# Check pods
+kubectl get pods -n llm
 ```

 ## Platform-Specific Configuration

@@ -327,8 +346,8 @@ Check that policies are enforced:
 kubectl get authpolicy -A
 kubectl get tokenratelimitpolicy -A

-# Check InferenceServices are ready
-kubectl get inferenceservice -n llm
+# Check LLMInferenceServices are ready
+kubectl get llminferenceservices -n llm
 ```

 ## Services Exposed
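The deploy-then-verify steps added to the README can be folded into a small readiness check. This is a sketch; `count_not_running` is a hypothetical helper, not part of the repo's scripts, and the `llm` namespace follows the README.

```shell
#!/usr/bin/env sh
# count_not_running: read `kubectl get pods` output on stdin and print how
# many pods are not yet Running (or Completed). Skips the header row.
count_not_running() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Intended usage against a live cluster (requires kubectl access):
#   kubectl apply -k deployment/samples/models/simulator/
#   kubectl get pods -n llm | count_not_running   # 0 once everything is up
```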
Lines changed: 174 additions & 0 deletions
# OpenDataHub Installation and Configuration

This directory contains the configuration files for installing OpenDataHub (ODH) with KServe support for the MaaS platform.

## Key Features

- **RawDeployment Mode**: Direct pod deployments without Knative/Serverless overhead
- **NVIDIA NIM Support**: GPU-accelerated inference with NVIDIA Inference Microservices
- **Headless Services**: Direct pod-to-pod communication for low latency
- **OpenShift Integration**: Uses OpenShift's default ingress and certificates

## Installation Methods

### Method 1: Automated Installation (Recommended)

Use the provided installation script, which handles all steps in the correct order:

```bash
# From the project root
./deployment/scripts/installers/install-odh.sh
```

### Method 2: Manual Installation

1. **Install ODH Operator** from OperatorHub:

   ```bash
   # Via OpenShift Console:
   # 1. Navigate to Operators → OperatorHub
   # 2. Search for "OpenDataHub"
   # 3. Install with default settings

   # Or via CLI:
   oc create -f - <<EOF
   apiVersion: operators.coreos.com/v1alpha1
   kind: Subscription
   metadata:
     name: opendatahub-operator
     namespace: openshift-operators
   spec:
     channel: fast
     name: opendatahub-operator
     source: community-operators
     sourceNamespace: openshift-marketplace
   EOF
   ```

2. **Create namespace**:

   ```bash
   kubectl create namespace opendatahub
   ```

3. **Wait for CRDs to be registered**:

   ```bash
   # Wait for the operator to create the CRDs
   kubectl wait --for condition=established --timeout=300s \
     crd/dscinitializations.dscinitialization.opendatahub.io \
     crd/datascienceclusters.datasciencecluster.opendatahub.io
   ```

4. **Apply the configuration**:

   ```bash
   # IMPORTANT: DSCInitialization MUST be created before DataScienceCluster
   kubectl apply -f dscinitialisation.yaml

   # Wait for DSCInitialization to be ready
   kubectl wait --for=jsonpath='{.status.phase}'=Ready \
     dscinitializations.dscinitialization.opendatahub.io/default-dsci \
     -n opendatahub --timeout=300s

   # Now create the DataScienceCluster
   kubectl apply -f datasciencecluster.yaml
   ```

Or use kustomize:

```bash
kubectl apply -k deployment/components/odh/
```

## Troubleshooting

### Error: "dscinitializations.dscinitialization.opendatahub.io not found"

This is the most common error when creating a DataScienceCluster. It occurs when:

1. The ODH operator is not installed
2. The DSCInitialization resource hasn't been created yet
3. The CRDs haven't been registered yet

**Solution**: Run the fix script:

```bash
./deployment/scripts/installers/fix-odh-dsci.sh
```

This script will:
- Check if the ODH operator is installed
- Wait for CRDs to be registered
- Create the DSCInitialization if missing
- Provide next steps for creating the DataScienceCluster

### Manual Troubleshooting Steps

1. **Check operator status**:

   ```bash
   kubectl get csv -n openshift-operators | grep opendatahub
   kubectl logs -n openshift-operators deployment/opendatahub-operator-controller-manager
   ```

2. **Check CRDs**:

   ```bash
   kubectl get crd | grep opendatahub
   ```

3. **Check existing resources**:

   ```bash
   kubectl get dscinitializations -A
   kubectl get datasciencecluster -A
   ```

4. **Check pod status**:

   ```bash
   kubectl get pods -n opendatahub
   kubectl get pods -n kserve
   ```

## Configuration Details

### DSCInitialization

- Configures the foundational settings for ODH
- Sets up Service Mesh integration
- Configures monitoring and trusted CA bundles
- **MUST be created before DataScienceCluster**

### DataScienceCluster

- Deploys the actual ODH components
- Configured for KServe with:
  - **RawDeployment mode**: No Knative/Serverless overhead
  - **NIM support**: For NVIDIA GPU inference
  - **Headless services**: For direct pod communication
  - **OpenShift ingress**: Native OpenShift routing

### Components Status

- ✅ **Enabled**: Dashboard, Workbenches, KServe (with NIM)
- ❌ **Disabled**: ModelMesh, Pipelines, Ray, Kueue, Model Registry, TrustyAI, Training Operator

## Verification

After installation, verify the deployment:

```bash
# Check DSCInitialization status
kubectl get dscinitializations -n opendatahub

# Check DataScienceCluster status
kubectl get datasciencecluster -n opendatahub

# Check KServe components
kubectl get pods -n kserve

# Check if InferenceService CRD is available
kubectl get crd inferenceservices.serving.kserve.io
```

## Integration with MaaS

Once ODH is installed with KServe, you can:
1. Deploy models using KServe InferenceService
2. Use the MaaS API for model management
3. Apply rate limiting and authentication policies
4. Monitor model performance through the ODH dashboard

## Additional Resources

- [OpenDataHub Documentation](https://opendatahub.io/docs/)
- [KServe Documentation](https://kserve.github.io/website/)
- [NVIDIA NIM Documentation](https://docs.nvidia.com/nim/)
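The manual steps above depend on the operator registering its CRDs asynchronously, so scripted installs need a retry loop around the early checks. A generic sketch (`retry` is a hypothetical helper, not one of the repo's scripts):

```shell
#!/usr/bin/env sh
# retry: run a command until it succeeds or the attempt budget runs out.
retry() {
  attempts="$1"; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
}

# Intended usage (requires cluster access):
#   retry 30 kubectl get crd dscinitializations.dscinitialization.opendatahub.io
#   kubectl apply -f dscinitialisation.yaml
```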
Lines changed: 55 additions & 0 deletions
```yaml
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
  namespace: opendatahub
spec:
  components:
    # Core dashboard component for managing notebooks and models
    dashboard:
      managementState: Managed

    # Notebook controller for JupyterLab workbenches
    workbenches:
      managementState: Managed

    # Model serving with KServe configured for RawDeployment mode
    # This configuration:
    # - Uses RawDeployment mode instead of Knative/Serverless
    # - Enables NVIDIA NIM support for GPU-accelerated inference
    # - Uses headless services for direct pod communication
    # - Integrates with OpenShift's default ingress
    kserve:
      managementState: Managed
      defaultDeploymentMode: RawDeployment  # Direct pod deployments without Knative
      nim:
        managementState: Managed  # Enable NVIDIA NIM (NVIDIA Inference Microservices) support
      rawDeploymentServiceConfig: Headless  # Use headless services for raw deployments
      serving:
        ingressGateway:
          certificate:
            type: OpenshiftDefaultIngress  # Use OpenShift's default ingress certificate
        managementState: Removed  # Disable Knative serving (incompatible with RawDeployment)
        name: knative-serving

    # Other components disabled for MaaS-focused deployment
    modelmeshserving:
      managementState: Removed  # Use KServe instead

    datasciencepipelines:
      managementState: Removed  # Not needed for MaaS

    ray:
      managementState: Removed  # Not needed for MaaS

    kueue:
      managementState: Removed  # Not needed for MaaS

    modelregistry:
      managementState: Removed  # Not needed for MaaS

    trustyai:
      managementState: Removed  # Not needed for MaaS

    trainingoperator:
      managementState: Removed  # Not needed for MaaS
```
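With this many components toggled, a quick audit of which are Managed vs Removed is handy before applying. A sketch that scans the manifest text; `list_components` is a hypothetical helper (against a live cluster, `kubectl get dsc -o yaml` or yq would be the usual route), and it assumes the two-space YAML indentation used above:

```shell
#!/usr/bin/env sh
# list_components: read a DataScienceCluster manifest on stdin and print each
# top-level component (4-space indent) with its managementState (6-space indent).
list_components() {
  awk '
    /^    [a-z]/                            { comp = $1; sub(/:$/, "", comp) }
    comp != "" && /^      managementState:/ { print comp, $2; comp = "" }
  '
}

# Intended usage:
#   list_components < datasciencecluster.yaml
```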
Lines changed: 27 additions & 0 deletions
```yaml
apiVersion: dscinitialization.opendatahub.io/v1
kind: DSCInitialization
metadata:
  name: default-dsci
  namespace: opendatahub
spec:
  # Namespace where ODH applications will be deployed
  applicationsNamespace: opendatahub

  # Monitoring configuration for ODH components
  monitoring:
    managementState: Managed
    namespace: opendatahub

  # Service Mesh configuration for secure communication
  serviceMesh:
    managementState: Managed
    auth:
      audiences:
        - "https://kubernetes.default.svc"
    controlPlane:
      name: data-science-smcp
      namespace: istio-system

  # Trusted CA bundle for TLS certificates
  trustedCABundle:
    managementState: Managed
```
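Since the DataScienceCluster must not be applied until this DSCInitialization reports `Ready`, a script can gate on the reported phase. A minimal sketch; `gate_on_phase` is a hypothetical helper name:

```shell
#!/usr/bin/env sh
# gate_on_phase: succeed only when the phase string on stdin is exactly
# "Ready", so the DataScienceCluster apply can be chained behind it.
gate_on_phase() {
  phase="$(cat)"
  [ "$phase" = "Ready" ]
}

# Intended usage (requires cluster access):
#   kubectl get dscinitializations default-dsci -n opendatahub \
#     -o jsonpath='{.status.phase}' | gate_on_phase \
#     && kubectl apply -f datasciencecluster.yaml
```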
Lines changed: 12 additions & 0 deletions
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

metadata:
  name: odh-installation

namespace: opendatahub

resources:
  # DSCInitialization must be created before DataScienceCluster
  - dscinitialisation.yaml
  - datasciencecluster.yaml
```
