- `testlist`: test-db YAML filename without `.yml` extension
- `testCount`: must equal the number of **active (uncommented)** tests in the test-db file (each disagg test gets its own CI stage)
- `gpuCount`: total GPUs allocated per stage = `total_nodes * gpus_per_node`
- `nodeCount`: total SLURM nodes per stage

When adding a test, either increment `testCount` on an existing entry or add a new `buildStageConfigs` block. Stages are grouped by node count (2 Nodes, 3 Nodes, 4 Nodes, etc.).

For the full step-by-step guide including how to derive test-db filenames and GPU/node counts from disaggregated config YAMLs, see [`tests/scripts/perf-sanity/README.md`](../../tests/scripts/perf-sanity/README.md) ("Step-by-Step: Adding or Re-enabling Disaggregated Perf Sanity Tests").
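
The rules for these four fields can be restated as a small consistency check. This is only a sketch: the `StageConfig` class and the `check_stage` helper are hypothetical names introduced here for illustration and do not exist in the repository. The check assumes you already know the number of active tests and the GPUs per node for the stage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StageConfig:
    """Hypothetical mirror of the fields of one buildStageConfigs entry."""
    testlist: str    # test-db YAML filename without the .yml extension
    test_count: int  # must equal the number of active tests
    gpu_count: int   # total GPUs per stage
    node_count: int  # total SLURM nodes per stage


def check_stage(cfg: StageConfig, active_tests: int, gpus_per_node: int) -> List[str]:
    """Return a list of problems; an empty list means the entry is consistent."""
    problems = []
    if cfg.testlist.endswith(".yml"):
        problems.append("testlist should omit the .yml extension")
    if cfg.test_count != active_tests:
        problems.append(f"testCount is {cfg.test_count}, expected {active_tests}")
    # gpuCount = total_nodes * gpus_per_node
    expected_gpus = cfg.node_count * gpus_per_node
    if cfg.gpu_count != expected_gpus:
        problems.append(f"gpuCount is {cfg.gpu_count}, expected {expected_gpus}")
    return problems
```

For instance, with illustrative values, `check_stage(StageConfig("example_testlist", 3, 8, 2), active_tests=4, gpus_per_node=4)` reports only the stale `testCount`, which is the case that arises after uncommenting one test.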

File: `tests/scripts/perf-sanity/README.md` (120 additions, 0 deletions)

@@ -233,3 +233,123 @@ When working with perf sanity tests, use these paths:
| Local submit (all) | `jenkins/scripts/perf/local/submit.py` |
| Jenkins pipeline | `jenkins/L0_Test.groovy` |
| Test database | `tests/integration/test_lists/test-db/` |
| Test waives | `tests/integration/test_lists/waives.txt` |

## Step-by-Step: Adding or Re-enabling Disaggregated Perf Sanity Tests

When adding a new disaggregated perf sanity test (or uncommenting an existing one), you must update **two files**: the test-db YAML and `jenkins/L0_Test.groovy`. This section describes how to locate and edit each one.

### Step 1: Identify the Disaggregated Config YAML

Config files live in `tests/scripts/perf-sanity/disaggregated/`. The filename encodes the GPU type and test parameters:
- If the test line already exists but is **commented out** (prefixed with `# `), remove the `# ` prefix.
- If the test line does not exist, add it to the `tests` list.
- Count the total number of **active (uncommented) tests** in the file; you will need this count for Step 5.
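
Counting active tests by hand is error-prone, so a rough line-based sketch may help. The `count_active_tests` function is a hypothetical helper, not part of the repository, and it assumes each test is a single list item containing a pytest node ID (recognized by `::`) and that disabled tests are commented out with a leading `#`.

```python
def count_active_tests(test_db_text: str) -> int:
    """Count uncommented list items that look like pytest node IDs."""
    count = 0
    for line in test_db_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out test or plain comment
        # Assumption: every active test is a "- path::test[...]" list item.
        if stripped.startswith("- ") and "::" in stripped:
            count += 1
    return count


# Illustrative content only; real test-db files contain more keys.
sample = """\
tests:
- perf/test_perf_sanity.py::test_e2e[example-a] TIMEOUT (120)
# - perf/test_perf_sanity.py::test_e2e[example-b] TIMEOUT (120)
- perf/test_perf_sanity.py::test_e2e[example-c] TIMEOUT (120)
"""
print(count_active_tests(sample))  # prints 2
```

A line-based scan is used here instead of a YAML parser because a parser discards comments, and the commented-out entries are exactly what you want to see when reviewing the file.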

### Step 5: Update `jenkins/L0_Test.groovy`

Open `jenkins/L0_Test.groovy` and search for the `multiNodesSBSAConfigs` section inside `launchTestJobs()`. Disaggregated perf sanity stages are added via `buildStageConfigs()`:
| `testCount` | Number of **active (uncommented)** tests in the test-db file. Each disagg test gets its own stage, so `testCount` must equal the number of active tests. |
| `gpuCount` | Total GPUs from Step 2 (= `total_nodes * gpus_per_node`) |

**If a `buildStageConfigs` entry already exists** for the test-db file: update `testCount` to match the new total number of active tests.

**If no entry exists** for the test-db file: add a new `buildStageConfigs` block. Insert it in the correct section sorted by node count (2 Nodes, 3 Nodes, 4 Nodes, etc.).

### Step 6: Check Waives

Search `tests/integration/test_lists/waives.txt` for the exact test case string. If the test is listed there with a `SKIP` directive, remove that line (otherwise the test will be skipped even if present in the test-db).
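
The waive lookup can be sketched as a plain substring search. The `find_skip_waives` helper below is hypothetical, and it assumes one waive per line with the directive written as the literal word `SKIP`; the sample waive text in the usage note is invented for illustration.

```python
from typing import List, Tuple


def find_skip_waives(waives_text: str, test_case: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that waive test_case with SKIP."""
    hits = []
    for lineno, line in enumerate(waives_text.splitlines(), start=1):
        # Match the exact test case string, as the step above requires.
        if test_case in line and "SKIP" in line:
            hits.append((lineno, line.strip()))
    return hits
```

Calling it with the contents of `waives.txt` and the full test case string returns the line numbers to delete; an empty list means the test is not waived.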

### Worked Example

Adding back `qwen3-235b-fp4_8k1k_con64_ctx1_tp1_gen1_tep4_eplb0_mtp0_ccb-UCX` as a gen_only test:

5. Uncomment the line: `- perf/test_perf_sanity.py::test_e2e[disagg_upload-gen_only-gb200_qwen3-235b-fp4_8k1k_con64_ctx1_tp1_gen1_tep4_eplb0_mtp0_ccb-UCX] TIMEOUT (120)`
6. Count active tests in that file (now 4)
7. In `L0_Test.groovy`, find the existing `buildStageConfigs` for `l0_gb200_multi_nodes_perf_sanity_ctx1_node1_gpu1_gen1_node1_gpu4`, update `testCount` from 3 to 4
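
The uncommenting step in this example can also be scripted. The sketch below uses a hypothetical `uncomment_test` helper (not part of the repository) and assumes the disabled line is commented with a `#` followed by at most one space; the indentation used in the real test-db file is not shown here, so the helper simply preserves whatever indentation it finds.

```python
def uncomment_test(test_db_text: str, test_id: str) -> str:
    """Remove the comment prefix from lines that mention test_id."""
    out = []
    for line in test_db_text.splitlines(keepends=True):
        body = line.lstrip()
        if body.startswith("#") and test_id in body:
            indent = line[: len(line) - len(body)]
            # Drop the leading '#' and at most one following space.
            uncommented = body[1:]
            if uncommented.startswith(" "):
                uncommented = uncommented[1:]
            line = indent + uncommented
        out.append(line)
    return "".join(out)


test_id = "qwen3-235b-fp4_8k1k_con64_ctx1_tp1_gen1_tep4_eplb0_mtp0_ccb-UCX"
before = (
    "# - perf/test_perf_sanity.py::test_e2e[disagg_upload-gen_only-gb200_"
    + test_id
    + "] TIMEOUT (120)\n"
)
after = uncomment_test(before, test_id)
```

Here `after` holds the same line without the `# ` prefix; lines that do not mention `test_id` pass through unchanged.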