[skills] gres: defer to #1599's internal/slurm/<cluster>, keep one-line fallback

Edwardf0t1 · claude · Edwardf0t1 · commit 9ccf032d1e02 · 2026-06-02T23:09:16.000-07:00
Reduce the gres guidance to a single fallback note for the slurm/default case, and defer the predefined per-cluster config path to PR #1599, which pre-fills gres/hostname/partition. This fills a gap in #1599's fallback branch, which does not mention gres for the no-internal-package case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
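For reviewers who want to sanity-check the fallback value, here is a minimal sketch (not part of this change) of turning the text printed by `sinfo -o '%P %G'` into a per-partition `gres` string. The helper names `gpus_per_node` and `gres_for` are hypothetical, and the parser assumes GRES entries look like `gpu:4`, `gpu:<type>:4`, or `gpu:<type>:4(S:0-1)` with a type name that starts with a letter; real clusters may report other shapes.

```python
import re

# Assumed GRES shapes: "gpu:4", "gpu:gb300:4", "gpu:gb300:4(S:0-1)".
# The optional type segment must start with a letter (an assumption).
_GPU_GRES = re.compile(r"\bgpu(?::[A-Za-z][\w.-]*)?:(\d+)")


def gpus_per_node(sinfo_output: str) -> dict[str, int]:
    """Map partition name -> GPUs per node, from `sinfo -o '%P %G'` text."""
    counts: dict[str, int] = {}
    for line in sinfo_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        # sinfo marks the default partition with a trailing "*".
        partition, gres = parts[0].rstrip("*"), parts[1]
        for match in _GPU_GRES.finditer(gres):
            # Keep the largest count if a partition appears on several lines.
            counts[partition] = max(counts.get(partition, 0), int(match.group(1)))
    return counts


def gres_for(partition: str, sinfo_output: str) -> str:
    """Return the value to put in execution.gres for the given partition."""
    return f"gpu:{gpus_per_node(sinfo_output)[partition]}"


if __name__ == "__main__":
    # Illustrative sample text, not output captured from a real cluster.
    sample = "PARTITION GRES\nbatch* gpu:gb300:4(S:0-1)\ncpu (null)\n"
    print(gres_for("batch", sample))
```

Partitions without GPUs (for example a `(null)` GRES column) are simply absent from the returned mapping, so `gres_for` raises `KeyError` for them rather than guessing a count.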
diff --git a/.claude/skills/evaluation/SKILL.md b/.claude/skills/evaluation/SKILL.md
@@ -200,7 +200,7 @@ Reasoning models: prefer reasoning mode (highest scores). For lower variance / c
 - Find every `???` left. Ask the user only for what can't be inferred (SLURM hostname/account/output_dir, MLflow tracking URI, etc.). Don't propose defaults; let them give plain text.
 - **`parallelism`** — size it yourself from the run shape (total requests = `dataset_size × repeats` vs GPU serving capacity), and set `--max-num-seqs` to match. Read `references/parallelism.md` for the decision rule and worked examples; only ask the user if a non-GPU cap (e.g. judge rate limit) is unknown.
 - Ask about other defaults they may want to change (partition, walltime, MLflow tags).
-- **`execution.gres`** — if your NEL install ships an `internal/slurm/<cluster>` execution config, prefer it (it pre-fills `gres`/hostname/partition/node-exclusivity). Otherwise NEL defaults to `gpu:8`; set it to the node's GPU count (and match `--data-parallel-size`/`--tensor-parallel-size`), or `sbatch` rejects the job with *"Requested node configuration is not available"* (e.g. `gpu:8` on 4-GPU GB300 nodes → `gres: gpu:4`; check with `sinfo -o '%P %G'`).
+- **`execution.gres`** — auto-set if you used a predefined `internal/slurm/<cluster>` config (above). On the `slurm/default` fallback it's `gpu:8`, so set it to the node's GPU count (and match `--data-parallel-size`/`--tensor-parallel-size`) or `sbatch` rejects the job with *"Requested node configuration is not available"* (e.g. 4-GPU GB300 → `gres: gpu:4`; check with `sinfo -o '%P %G'`).
 
 **Walltime cap: 4 hours.** Always `execution.walltime: "04:00:00"`. The cluster does not schedule jobs longer than 4h — this is a hard limit, not a preference.
 
diff --git a/.claude/skills/evaluation/recipes/examples/example_eval.yaml b/.claude/skills/evaluation/recipes/examples/example_eval.yaml
@@ -50,10 +50,9 @@ execution:
   account: ???
   output_dir: ???
   walltime: "04:00:00"
-  # gres defaults to gpu:8. If your NEL install ships an internal/slurm/<cluster>
-  # execution config, prefer it (it pre-fills gres/hostname/partition). Otherwise
-  # set gres to the node's GPU count or sbatch fails "Requested node configuration
-  # is not available"; match --tensor/--data-parallel-size to it (references/parallelism.md).
+  # gres: a predefined internal/slurm/<cluster> config (see SKILL Step 4) sets this.
+  # On the slurm/default fallback it's gpu:8 — set to the node's GPU count or sbatch
+  # fails "Requested node configuration is not available" (4-GPU GB300 -> gpu:4).
   mounts:
     mount_home: false
   auto_export:          # REQUIRED trigger for auto-export. Without this, the