| 1 | +## 🚀 Deploying `inference-perf` via Helm Chart |
| 2 | + |
| 3 | +This guide explains how to deploy `inference-perf` to a Kubernetes cluster with Helm. |
| 4 | + |
| 5 | +Note: This is a temporary chart, provided until a remote chart is available. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +### 1. Prerequisites |
| 10 | + |
| 11 | +Make sure you have the following tools installed and configured: |
| 12 | + |
| 13 | +* **Kubernetes Cluster:** Access to a functional cluster (e.g., GKE). |
| 14 | +* **Helm:** The Helm CLI installed locally. |
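A quick way to confirm the prerequisites is a small shell check. This is a minimal sketch: it assumes `kubectl` is the CLI you use to reach the cluster (the guide itself only names Helm), and `missing_tools` is a hypothetical helper, not part of the chart.

```bash
# Print the names of any given CLI tools that are not found on PATH.
missing_tools() {
  not_found=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Append with a separating space only when the list is non-empty.
      not_found="${not_found:+$not_found }$tool"
    fi
  done
  printf '%s\n' "$not_found"
}

# Assumption: kubectl and helm are the two tools needed for this guide.
result="$(missing_tools kubectl helm)"
if [ -n "$result" ]; then
  echo "Missing required tools: $result" >&2
else
  echo "All required tools found."
fi
```

If a tool is reported missing, install it before continuing with the configuration step.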
| 15 | + |
| 16 | +--- |
| 17 | + |
| 18 | +### 2. Configuration (`values.yaml`) |
| 19 | + |
| 20 | +Before deployment, navigate to the **`deploy/inference-perf`** directory and edit the **`values.yaml`** file to customize your deployment and the benchmark parameters. |
| 21 | + |
| 22 | +#### Optional Parameters |
| 23 | + |
| 24 | +| Key | Description | Default | |
| 25 | +| :--- | :--- | :--- | |
| 26 | +| `hfToken` | Hugging Face API token. If provided, a Kubernetes `Secret` named `hf-token-secret` will be created for authentication. | `""` | |
| 27 | +| `serviceAccountName` | Standard Kubernetes `serviceAccountName`. If not provided, the default service account is used. | `""` | |
| 28 | +| `nodeSelector` | Standard Kubernetes `nodeSelector` map to constrain pod placement to nodes with matching labels. | `{}` | |
| 29 | +| `resources` | Standard Kubernetes resource requests and limits for the main `inference-perf` container. | `{}` | |
| 30 | +--- |
| 31 | + |
| 32 | +> **Example Resource Block:** |
| 33 | +> ```yaml |
| 34 | +> # resources: |
| 35 | +> #   requests: |
| 36 | +> #     cpu: "1" |
| 37 | +> #     memory: "4Gi" |
| 38 | +> #   limits: |
| 39 | +> #     cpu: "2" |
| 40 | +> #     memory: "8Gi" |
| 41 | +> ``` |
| 42 | + |
| 43 | +#### GKE Specific Parameters |
| 44 | + |
| 45 | +This section covers the configuration and permissions required when your dataset is stored at a Google Cloud Storage (GCS) path, as is typical for deployments on GKE. |
| 46 | + |
| 47 | +##### Required IAM Permissions |
| 48 | + |
| 49 | +The identity running the workload (e.g., the associated Kubernetes Service Account, often configured via **Workload Identity**) must have the following IAM roles on the target GCS bucket: |
| 50 | + |
| 51 | +* **`roles/storage.objectViewer`** (Required to read/download the input dataset from GCS). |
| 52 | +* **`roles/storage.objectCreator`** (Required to write/push benchmark results back to GCS). |
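The two role grants listed above could be applied with `gcloud storage buckets add-iam-policy-binding`. The sketch below only prints the commands rather than running them; the bucket name and service account are hypothetical placeholders, the correct `--member` value depends on how your identity is set up, and `print_bucket_grants` is an illustrative helper, not part of the chart.

```bash
# Print (do not run) the gcloud commands that would grant both roles on a bucket.
print_bucket_grants() {
  bucket="$1"   # bucket name without the gs:// prefix (placeholder)
  member="$2"   # IAM principal for the workload (placeholder, setup-specific)
  for role in roles/storage.objectViewer roles/storage.objectCreator; do
    printf 'gcloud storage buckets add-iam-policy-binding gs://%s --member=%s --role=%s\n' \
      "$bucket" "$member" "$role"
  done
}

# Hypothetical values for illustration only.
print_bucket_grants "my-bucket" "serviceAccount:perf@my-project.iam.gserviceaccount.com"
```

Review the printed commands, substitute your own bucket and principal, and run them yourself once they look right.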
| 53 | + |
| 54 | + |
| 55 | +| Key | Description | Default | |
| 56 | +| :--- | :--- | :--- | |
| 57 | +| `gcsPath` | A GCS URI pointing to the dataset file (e.g., `gs://my-bucket/dataset.json`). The file will be automatically copied to the running pod during initialization. | `""` | |
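Before setting `gcsPath`, it can help to check that the value has the `gs://bucket/object` shape shown in the example. This is a minimal sketch, assuming the chart expects exactly that form; `is_gcs_uri` is a hypothetical helper and does not verify that the object exists or is readable.

```bash
# Return success if the argument looks like gs://<bucket>/<object>.
is_gcs_uri() {
  case "$1" in
    gs://?*/?*) return 0 ;;  # non-empty bucket and non-empty object path
    *) return 1 ;;
  esac
}

# Hypothetical dataset path for illustration.
if is_gcs_uri "gs://my-bucket/dataset.json"; then
  echo "gcsPath looks like a GCS object URI"
else
  echo "gcsPath is not of the form gs://bucket/object" >&2
fi
```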
| 58 | + |
| 59 | +--- |
| 60 | + |
| 61 | +### 3. Run Deployment |
| 62 | + |
| 63 | +Use the **`helm install`** command from the **`deploy/inference-perf`** directory to deploy the chart. |
| 64 | + |
| 65 | +* **Standard Install:** Deploy using the default `values.yaml`. |
| 66 | + ```bash |
| 67 | + helm install test . |
| 68 | + ``` |
| 69 | + |
| 70 | +* **Set `hfToken` Override:** Pass the Hugging Face token directly. |
| 71 | + ```bash |
| 72 | + helm install test . --set hfToken="<TOKEN>" |
| 73 | + ``` |
| 74 | + |
| 75 | +* **Custom Config Override:** Edit the values file with your custom settings and pass it with `-f`. |
| 76 | + ```bash |
| 77 | + helm install test . -f values.yaml |
| 78 | + ``` |
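The overrides above can also be combined in a single command. The sketch below only prints the resulting command; it assumes `gcsPath` can be passed with `--set` in the same way as `hfToken` (both appear as top-level keys in the tables above) and reuses the release name `test` from the examples. `helm_install_cmd` is a hypothetical helper, not something the chart provides.

```bash
# Build (and print) a helm install command with optional --set overrides.
helm_install_cmd() {
  release="$1"    # Helm release name
  token="$2"      # Hugging Face token, may be empty
  gcs_path="$3"   # GCS dataset URI, may be empty
  cmd="helm install $release ."
  if [ -n "$token" ]; then
    cmd="$cmd --set hfToken=\"$token\""
  fi
  if [ -n "$gcs_path" ]; then
    cmd="$cmd --set gcsPath=\"$gcs_path\""
  fi
  printf '%s\n' "$cmd"
}

# Placeholder values for illustration only.
helm_install_cmd test "<TOKEN>" "gs://my-bucket/dataset.json"
```

Printing the command first makes it easy to review the overrides before running the install from the `deploy/inference-perf` directory.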
| 79 | + |
| 80 | +### 4. Cleanup |
| 81 | + |
| 82 | +To remove the benchmark deployment, uninstall the release: |
| 83 | + ```bash |
| 84 | + helm uninstall test |
| 85 | + ``` |