Commit e99c3b5

Markdown lint fix
1 parent a08a110 commit e99c3b5

1 file changed

Lines changed: 13 additions & 13 deletions

docs/loadtesting.md

@@ -102,19 +102,19 @@ The model-ready health check waits up to 1000000 seconds by default. Override it

 Built-in scenarios are packaged under `src/swiss_ai_model_launch/assets/scenarios`.

-| Scenario | Pattern | Duration | Think time | Max tokens | Prompt labels | Use case |
-| ------------ | ------------------------------- | -------- | ---------- | ---------- | ------------------------------------- | --------------------------------------------------------- |
-| `throughput` | 20 constant VUs | 15m | 2s | 2048 | all | Baseline sustained throughput. |
-| `ramp` | 0 -> 10 -> 25 -> 50 VUs | 16m | 2s | 2048 | all | Gradual capacity ramp with plateaus. |
-| `stress` | 0 -> 20 -> 50 -> 100 -> 150 VUs | 16m | 2s | 2048 | all | Push the service past normal operating load. |
-| `spike` | 10 -> 100 -> 10 VUs | 8m30s | 0s | 4096 | all | Sudden traffic surge and recovery behavior. |
-| `soak` | 20 constant VUs | 30m | 2s | 2048 | all | Longer stability run for drift, leaks, and tail latency. |
-| `decode` | 50 constant VUs | 15m | 0s | 4096 | `short`, `medium` | Decode-heavy run with shorter prompts and longer outputs. |
-| `kv_stress` | 0 -> 30 -> 0 VUs | 15m | 0s | 4096 | `long_input`, `xl_input`, `conv_long` | KV-cache pressure with long inputs and long outputs. |
-| `open_loop` | 20 arrivals/s | 15m | 0s | 2048 | all | Fixed request-rate latency test with EOS ignored. |
-| `open_loop_ramp` | 2 -> 30 arrivals/s | 15m | 0s | 2048 | all | Open-loop capacity sweep with EOS ignored. |
-| `open_loop_decode` | 2 -> 5 arrivals/s | 12m | 0s | 512 | `short`, `medium` | Open-loop decode-focused A/B benchmark. |
-| `realistic` | 20 constant VUs | 15m | 30s | 2048 | all | Lower-pressure interactive traffic shape. |
+| Scenario | Pattern | Duration | Think time | Max tokens | Prompt labels | Use case |
+| ------------------ | ------------------------------- | -------- | ---------- | ---------- | ------------------------------------- | --------------------------------------------------------- |
+| `throughput` | 20 constant VUs | 15m | 2s | 2048 | all | Baseline sustained throughput. |
+| `ramp` | 0 -> 10 -> 25 -> 50 VUs | 16m | 2s | 2048 | all | Gradual capacity ramp with plateaus. |
+| `stress` | 0 -> 20 -> 50 -> 100 -> 150 VUs | 16m | 2s | 2048 | all | Push the service past normal operating load. |
+| `spike` | 10 -> 100 -> 10 VUs | 8m30s | 0s | 4096 | all | Sudden traffic surge and recovery behavior. |
+| `soak` | 20 constant VUs | 30m | 2s | 2048 | all | Longer stability run for drift, leaks, and tail latency. |
+| `decode` | 50 constant VUs | 15m | 0s | 4096 | `short`, `medium` | Decode-heavy run with shorter prompts and longer outputs. |
+| `kv_stress` | 0 -> 30 -> 0 VUs | 15m | 0s | 4096 | `long_input`, `xl_input`, `conv_long` | KV-cache pressure with long inputs and long outputs. |
+| `open_loop` | 20 arrivals/s | 15m | 0s | 2048 | all | Fixed request-rate latency test with EOS ignored. |
+| `open_loop_ramp` | 2 -> 30 arrivals/s | 15m | 0s | 2048 | all | Open-loop capacity sweep with EOS ignored. |
+| `open_loop_decode` | 2 -> 5 arrivals/s | 12m | 0s | 512 | `short`, `medium` | Open-loop decode-focused A/B benchmark. |
+| `realistic` | 20 constant VUs | 15m | 30s | 2048 | all | Lower-pressure interactive traffic shape. |

 Custom scenarios can be placed in `./scenarios/` where you run `sml`. Use YAML, YML, or JSON. A custom scenario with the same name overrides the built-in one.

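The override rule in the docs text (a custom scenario with the same name wins over the built-in one) can be illustrated with a minimal sketch. This is not `sml`'s actual lookup code, which the diff does not show. The function name `resolve_scenario`, its parameters, and the order in which extensions are tried are all assumptions made for illustration; only the two locations and the accepted formats come from the documentation.

```python
from pathlib import Path
from typing import Optional

# Accepted formats per the docs; the order they are tried in is an assumption.
SCENARIO_EXTENSIONS = (".yaml", ".yml", ".json")


def resolve_scenario(name: str, custom_dir: Path, builtin_dir: Path) -> Optional[Path]:
    """Hypothetical helper: find the file for scenario `name`.

    The custom directory is searched first, so a custom scenario with the
    same name shadows the built-in one. Returns None if neither has it.
    """
    for directory in (custom_dir, builtin_dir):
        for ext in SCENARIO_EXTENSIONS:
            candidate = directory / f"{name}{ext}"
            if candidate.is_file():
                return candidate
    return None
```

Under these assumptions, `resolve_scenario("spike", Path("./scenarios"), builtin_dir)` would return `./scenarios/spike.json` if that file exists, and fall back to the packaged `spike` scenario otherwise.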
