- > **Stability: experimental** — This component is under active development and may change.
+ > ⚠️ **Stability: experimental** — This asset is not yet stable and may change.

- ## Overview
+ ## Overview 🧾

- A KFP component that evaluates a fine-tuned model using the
- [Eval Hub](https://github.com/opendatahub-io/eval-hub) service with a
- KServe InferenceService for model serving.
+ Evaluate a model via Eval Hub with a KServe InferenceService.

- The component:
+ Creates a KServe ServingRuntime + InferenceService (matching the RHOAI dashboard deployment pattern) to serve the fine-tuned model from the workspace PVC. The InferenceService URL is submitted to Eval Hub for benchmark evaluation. Both resources are cleaned up after completion.

- 1. Creates a KServe **ServingRuntime** + **InferenceService** (matching the RHOAI
- dashboard deployment pattern) to serve the fine-tuned model from the workspace PVC.
- 2. Submits benchmark evaluation jobs to Eval Hub, pointing at the InferenceService URL.
- 3. Polls for evaluation completion and collects results/metrics.
- 4. Optionally logs metrics to **MLflow** (when `mlflow_experiment_name` is provided).
- 5. Cleans up both KServe resources after evaluation (or on failure).
-
- Both KServe resources (ServingRuntime and InferenceService) are explicitly deleted
- in a `finally` block after evaluation completes or on failure.
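The lifecycle described in the removed overview (create the serving resources, submit to Eval Hub, poll, then clean up in a `finally` block) can be sketched roughly as follows. This is a minimal illustration, not the component's actual code: the four callables, the `"state"` field, and its `"completed"`/`"failed"` values are hypothetical stand-ins for the real KServe and Eval Hub calls, and `delete_serving` is assumed to tolerate resources that were never created.

```python
import time
from typing import Callable, Dict


def run_evaluation(
    create_serving: Callable[[], str],      # hypothetical: creates both KServe resources, returns the URL
    submit_job: Callable[[str], str],       # hypothetical: submits to Eval Hub, returns a job id
    get_status: Callable[[str], Dict],      # hypothetical: returns {"state": ..., ...}
    delete_serving: Callable[[], None],     # hypothetical: deletes both KServe resources
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Create serving resources, evaluate, and always clean up."""
    try:
        # Creation happens inside the try block so that a partially created
        # ServingRuntime or InferenceService is still deleted on failure.
        isvc_url = create_serving()
        job_id = submit_job(isvc_url)
        deadline = time.monotonic() + timeout
        while True:
            status = get_status(job_id)
            state = status.get("state")
            if state == "completed":
                return status
            if state == "failed":
                raise RuntimeError(f"evaluation job {job_id} failed")
            if time.monotonic() >= deadline:
                raise TimeoutError(f"evaluation job {job_id} timed out")
            sleep(poll_interval)
    finally:
        # Runs on success, on failure, and on timeout.
        delete_serving()
```

Passing the operations in as callables keeps the sketch independent of any particular client library; the point is only that cleanup sits in `finally` and therefore runs whether the evaluation returns or raises.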
  |`mlflow_experiment_name`|`str`|`""`| MLflow experiment name (non-empty enables MLflow). |
  |`gpu_count`|`int`|`1`| Number of GPUs for the InferenceService predictor. |
- |`memory`|`str`|`"8Gi"`| Pod memory request/limit for the predictor. |
- |`cpu`|`str`|`"2"`| CPU request/limit for the predictor. |
- |`runtime_image`|`str`| RHOAI vLLM image | Container image for the ServingRuntime. |
+ |`memory`|`str`|`8Gi`| Pod memory request/limit for the predictor (e.g. "8Gi", "32Gi"). |
+ |`cpu`|`str`|`2`| CPU request/limit for the predictor (e.g. "2"). |
+ |`runtime_image`|`str`|`registry.redhat.io/rhaii/vllm-cuda-rhel9@sha256:ad06abf3bb5235ebb5b2df84cd1b9fd09e823f0ff2eebfc82bb4590275ccfe0b`| Container image for the ServingRuntime (RHOAI vLLM default). |
+ |`trust_remote_code`|`bool`|`False`| Pass --trust-remote-code to vLLM (enables arbitrary code from model repos). |
+ |`verify_tls`|`bool`|`False`| Verify TLS certificates for Eval Hub API calls (False for self-signed certs). |
  |`isvc_ready_timeout`|`int`|`600`| Max seconds to wait for InferenceService readiness. |
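As a rough illustration of how the `gpu_count`, `memory`, and `cpu` parameters in the table could translate into a predictor's Kubernetes resource block, here is a small sketch. The defaults are taken from the table; the `PredictorParams` class and `predictor_resources` helper are hypothetical names, and the `nvidia.com/gpu` resource key is an assumption based on the CUDA runtime image, not something the diff states.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PredictorParams:
    """Defaults copied from the parameter table (hypothetical container type)."""
    gpu_count: int = 1
    memory: str = "8Gi"
    cpu: str = "2"
    isvc_ready_timeout: int = 600


def predictor_resources(p: PredictorParams) -> Dict[str, Dict[str, str]]:
    """Build a requests/limits mapping with identical values on both sides."""
    values = {"cpu": p.cpu, "memory": p.memory}
    if p.gpu_count > 0:
        # Assumption: GPUs are requested through the NVIDIA device plugin key.
        values["nvidia.com/gpu"] = str(p.gpu_count)
    # The table describes each value as "request/limit", so the same numbers
    # are used for both; separate copies avoid sharing one mutable dict.
    return {"requests": dict(values), "limits": dict(values)}
```

Calling `predictor_resources(PredictorParams(memory="32Gi"))` would yield matching `requests` and `limits` entries with `memory` set to `"32Gi"`.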
- ## Outputs
-
- | Artifact | Type | Description |
- | -------- | ---- | ----------- |
- |`output_metrics`|`dsl.Metrics`| Evaluation scores as KFP metrics (logged per benchmark). |
- |`output_results`|`dsl.Artifact`| Full evaluation results JSON from Eval Hub. |
-
- ## Prerequisites
-
- 1. **Eval Hub** installed on the cluster (operator + CR in the target namespace).
-
- 2. **KServe** available (included with RHOAI by default).
-
- 3. **RBAC** — the pipeline ServiceAccount needs permissions for KServe resources: