Commit fad75e6
Add benchmarking folder with common config set ups
1 parent 831a919

File tree

5 files changed: +812 additions, -15 deletions

benchmarking/README.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# Benchmarking Helm Chart

This Helm chart deploys the `inference-perf` benchmarking tool. This guide walks you through deploying a basic benchmarking job. By default, the `shareGPT` dataset is used for configuration.

## Prerequisites

Before you begin, ensure you have the following:

* **Helm 3+**: [Installation Guide](https://helm.sh/docs/intro/install/)
* **Kubernetes Cluster**: Access to a Kubernetes cluster.
* **Gateway Deployed**: Your inference server/gateway must be deployed and accessible within the cluster.
* **Hugging Face Token**: A Hugging Face token to pull tokenizers (the chart creates a Secret from it).

## Deployment

To deploy the benchmarking chart:

```bash
export IP='<YOUR_IP>'
export PORT='<YOUR_PORT>'
export HF_TOKEN='<YOUR_HUGGING_FACE_TOKEN>'
export CHART_VERSION=v0.2.0
helm install benchmark -f benchmark-values.yaml \
  --set hfToken=${HF_TOKEN} \
  --set "config.server.base_url=http://${IP}:${PORT}" \
  oci://quay.io/inference-perf/charts/inference-perf:${CHART_VERSION}
```

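Once installed, you can check on the benchmark Job with standard `kubectl` commands. The label selector and Job name below are assumptions based on the release name `benchmark` and common chart labeling conventions; adjust them to match the resources `helm install` actually created:

```shell
# List Pods created by the release (the label selector is an assumption).
kubectl get pods -l app.kubernetes.io/instance=benchmark

# Follow the benchmark logs; replace the Job name with the one shown
# by `kubectl get jobs` if it differs.
kubectl logs -f job/benchmark-inference-perf
```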
**Parameters to customize:**

* `benchmark`: The Helm release name; choose a unique name for this deployment.
* `hfToken`: Your Hugging Face token.
* `config.server.base_url`: The base URL (IP and port) of your inference server.

For more parameter customization, refer to the inference-perf [configuration guide](https://github.com/kubernetes-sigs/inference-perf/blob/main/docs/config.md).

### Storage Parameters

#### 1. Local Storage (Default)

By default, reports are saved locally but **lost when the Pod terminates**.

```yaml
storage:
  local_storage:
    path: "reports-{timestamp}"     # Local directory path
    report_file_prefix: null        # Optional filename prefix
```

#### 2. Google Cloud Storage (GCS)

Use the `google_cloud_storage` block to save reports to a GCS bucket.

```yaml
storage:
  google_cloud_storage:             # Optional GCS configuration
    bucket_name: "your-bucket-name" # Required GCS bucket
    path: "reports-{timestamp}"     # Optional path prefix
    report_file_prefix: null        # Optional filename prefix
```

##### 🚨 GCS Permissions Checklist (Required for Write Access)

1. **IAM Role (Service Account):** Bound to the target bucket.
   * **Minimum:** **Storage Object Creator** (`roles/storage.objectCreator`)
   * **Full:** **Storage Object Admin** (`roles/storage.objectAdmin`)

2. **Node Access Scope (GKE Node Pool):** Set during node pool creation.
   * **Required Scope:** `devstorage.read_write` or `cloud-platform`

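A binding for the minimum role above can be granted with `gcloud`. This is a sketch: the bucket name and service-account email are placeholders you must replace with your own values.

```shell
# Grant the benchmark's service account write access to the bucket.
# BUCKET and SA_EMAIL are placeholders; substitute your own values.
BUCKET=your-bucket-name
SA_EMAIL=benchmark-sa@your-project.iam.gserviceaccount.com

gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/storage.objectCreator"
```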
#### 3. Simple Storage Service (S3)

Use the `simple_storage_service` block for S3-compatible storage. This requires appropriate AWS credentials to be configured in the runtime environment.

```yaml
storage:
  simple_storage_service:
    bucket_name: "your-bucket-name" # Required S3 bucket
    path: "reports-{timestamp}"     # Optional path prefix
    report_file_prefix: null        # Optional filename prefix
```

## Uninstalling the Chart

To uninstall the deployed chart, use the release name from the install step:

```bash
helm uninstall benchmark
```

benchmarking/benchmark-values.yaml

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
job:
  image:
    repository: quay.io/inference-perf/inference-perf
    tag: ""  # Defaults to .Chart.AppVersion
  nodeSelector: {}
  # Example resources:
  # resources:
  #   requests:
  #     cpu: "1"
  #     memory: "4Gi"
  #   limits:
  #     cpu: "2"
  #     memory: "8Gi"
  resources: {}

logLevel: INFO

# A GCS bucket path that points to the dataset file.
# The file will be copied from this path to the local file system
# at /dataset/dataset.json for use during the run.
# NOTE: For this dataset to be used, config.data.path must also be
# explicitly set to /dataset/dataset.json.
gcsPath: ""

# hfToken optionally creates a Secret with the specified token.
# Can be set using: helm install --set hfToken=<token>
hfToken: ""

config:
  load:
    type: constant
    interval: 15
    stages:
      - rate: 10
        duration: 20
      - rate: 20
        duration: 20
      - rate: 30
        duration: 20
  api:
    type: completion
    streaming: true
  server:
    type: vllm
    model_name: meta-llama/Llama-3.1-8B-Instruct
    base_url: http://0.0.0.0:8000
    ignore_eos: true
  tokenizer:
    pretrained_model_name_or_path: meta-llama/Llama-3.1-8B-Instruct
  data:
    type: shareGPT
  metrics:
    type: prometheus
    prometheus:
      google_managed: true
  report:
    request_lifecycle:
      summary: true
      per_stage: true
      per_request: true
    prometheus:
      summary: true
      per_stage: true
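As a rough sanity check on the staged load above, you can estimate how many requests each stage will issue, assuming `rate` is in requests per second and `duration` in seconds (the variable names below are illustrative, not part of the chart):

```python
# Estimate the request volume implied by the staged load config above,
# assuming `rate` is requests/second and `duration` is seconds.
stages = [
    {"rate": 10, "duration": 20},
    {"rate": 20, "duration": 20},
    {"rate": 30, "duration": 20},
]

per_stage = [s["rate"] * s["duration"] for s in stages]
total = sum(per_stage)
print(per_stage, total)  # [200, 400, 600] 1200
```

So this default configuration sends roughly 1200 requests in about a minute of load, which is useful for sizing the target server before a run.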
