- Implements the KServe single-model serving platform as a pluggable extension of
@odh-dashboard/model-serving. - Provides UI logic and Kubernetes API usage to deploy, configure, monitor, and delete models using KServe
InferenceServiceresources (one dedicated model server per deployment).
- No Go BFF: all Kubernetes work (
InferenceService,ServingRuntime,Pod,Secret,ServiceAccount,Role,RoleBinding) goes through the main dashboard backend proxy at/api/k8s/using@openshift/dynamic-plugin-sdk-utilsfrom the browser. - Deployment flow:
model-servingcollects wizard data and callsdeployKServeDeployment(via themodel-serving.deployment/deployextension).- Orchestration creates or patches
ServingRuntimeandInferenceService, then optional token-auth resources. - After deploy,
useWatchDeploymentswatchesInferenceService,ServingRuntime, and labeledPodresources and merges them intoKServeDeploymentobjects.
- No package Module Federation remote:
model-servingconsumes this package as a TypeScript dependency;extensions.tsexports the extension manifest registered by the parent.- Notable exports:
@odh-dashboard/kserve/extensions,@odh-dashboard/kserve/deployUtils(shared helpers for other packages).
- Notable exports:
- Feature gating:
OdhDashboardConfig(disableKServe,disableKServeMetrics) andSupportedAreavalues (K_SERVE,K_SERVE_METRICS) inextensions.tsviaflags.required.
| Term | Definition |
|---|---|
| InferenceService | Core KServe CRD (serving.kserve.io/v1beta1): model, runtime, storage, replicas, scaling for one deployment. |
| ServingRuntime | Namespace-scoped KServe CRD; template for the model server container. Each KServe deployment creates its own alongside the InferenceService. |
| KServeDeployment | TypeScript type combining InferenceService, optional ServingRuntime, status, and endpoints. |
| Predictor | spec.predictor on InferenceService: model URI, runtime ref, resources, replicas, deployment strategy. |
| Token auth | Optional bearer protection via ServiceAccount, Role, RoleBinding, Secret; security.opendatahub.io/enable-auth annotation. |
| External route | Inference exposed outside cluster via networking.kserve.io/visibility: exposed on InferenceService. |
| Deployment strategy | RollingUpdate or Recreate in spec.predictor.deploymentStrategy. |
| AI asset | InferenceService tagged opendatahub.io/genai-asset: true for Gen AI; optional opendatahub.io/genai-use-case. |
| KSERVE_ID | Platform id 'kserve' in extension registrations and modelServingPlatformId. |
| Dependency | Type | Details |
|---|---|---|
@odh-dashboard/model-serving |
Parent package | Registers KServe extensions; wizard, types, auth utilities |
@odh-dashboard/internal |
Shared library | K8s types, API helpers, feature flags, RBAC |
Main dashboard backend (/api/k8s/) |
HTTP proxy | All cluster API traffic from the browser |
@odh-dashboard/model-registry |
Sibling package | Pre-filled deploy wizard data for registered model versions |
InferenceService, ServingRuntime, Pod, Secret, ServiceAccount, Role, RoleBinding |
Kubernetes API | CRUD / watch as described in Design Intent |
- ServingRuntime coupling: Deleting a deployment removes both CRs; if
ServingRuntimedelete fails, an orphan may remain and be reused on redeploy with the same name. - Dry-run RoleBinding 404: With
dryRunand non-admin users,RoleBindingcreation can 404 when theRoleis absent in dry-run;createRoleBindingIfMissingindeployUtils.tsswallows that so validation can proceed. - No standalone mode: Must run main dashboard +
model-serving. modelmesh-enabled: false: Project needs this label for KServe to be active; otherwise ModelMesh is default (set when choosing single-model serving in project settings).patchInferenceServiceno-op: Zero JSON patches skips the network call and returns the new object.