Support additional slurm features (priority, email notifications, dependency), update primerl pin (#6)
* Add SLURM priority/mail/resume/account options, switch to gpus-per-task with 16 CPU/GPU defaults, and sync docs/tests.
* Add dependency and test-only flags to medarc_slurm
* formatting fixes
* group arguments, support passing primerl config as arg or option
* update primerl, now uses IPO loss
* update templates
* match upstream config override behavior
# Validate an RL submission (including dependency syntax) without creating a job
+medarc_slurm rl --config config.toml \
+  --output-dir runs/my-rl \
+  --train-gpus 1 \
+  --infer-gpus 2 \
+  --dependency afterok:123456 \
+  --test-only
```
Generated artifacts are written to `--output-dir`:
-`sft.sh` or `rl.sh` — the SLURM batch script
-`configs/` — resolved TOML subconfigs passed to each component
-You can pass PRIME-RL config overrides directly as extra flags (for example `--wandb.project my-proj --wandb.name my-run`). You may also insert `--` before passthrough overrides for readability, but it is optional.
+You can pass PRIME-RL config overrides directly as extra flags (for example `--wandb.project my-proj --wandb.name my-run`). You may also insert `--` before passthrough overrides for readability, but it is optional. To layer multiple PRIME-RL configs, repeat `--config` with later files overriding earlier ones.
+
+`medarc_slurm` now defaults `--account` to `training`. You can override it with `--account <name>`.
+Email mode is `--mail all` or `--mail begin_end` (with `--mail-user`).
+Use `--dependency "<expr>"` to pass SLURM dependencies and `--test-only` to run `sbatch` validation without submitting.
Run `medarc_slurm sft --help` or `medarc_slurm rl --help` for more details on available options.
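The README lines above describe `--account`, `--mail`, `--dependency`, and `--test-only`. As a rough sketch of how such launcher options could be turned into `sbatch` arguments: the helper name `build_sbatch_flags` and the mapping of `begin_end` to `BEGIN,END` are assumptions for illustration, not code from this commit. The `sbatch` flags themselves (`--account`, `--mail-type`, `--mail-user`, `--dependency`, `--test-only`) are standard SLURM options.

```python
from typing import Optional


def build_sbatch_flags(
    account: str = "training",
    mail: Optional[str] = None,
    mail_user: Optional[str] = None,
    dependency: Optional[str] = None,
    test_only: bool = False,
) -> list[str]:
    """Hypothetical helper: translate launcher options into sbatch flags."""
    # Assumed mapping from the launcher's mail modes to sbatch --mail-type values.
    mail_types = {"all": "ALL", "begin_end": "BEGIN,END"}

    flags = [f"--account={account}"]
    if mail is not None:
        if mail not in mail_types:
            raise ValueError(f"unknown mail mode: {mail!r}")
        flags.append(f"--mail-type={mail_types[mail]}")
        if mail_user:
            flags.append(f"--mail-user={mail_user}")
    if dependency:
        # Passed through verbatim, e.g. "afterok:123456".
        flags.append(f"--dependency={dependency}")
    if test_only:
        # sbatch validates the request and estimates a start time without submitting.
        flags.append("--test-only")
    return flags


flags = build_sbatch_flags(dependency="afterok:123456", test_only=True)
print(flags)
```

Keeping the translation in one pure function makes the flag set easy to unit test without a SLURM cluster.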
config_toml: Annotated[Path, Argument(metavar="CONFIG_TOML", help="Path to the PRIME-RL SFT trainer TOML.")],
     output_dir: Annotated[Path, Option("--output-dir", file_okay=False, dir_okay=True, help="Directory to write resolved configs and checkpoints.")],
+    config: Annotated[list[Path] | None, Option("--config", "--config-toml", help="One or more PRIME-RL SFT trainer TOMLs. Repeat `--config` to layer files with later files overriding earlier ones.")] = None,
     gpus: Annotated[int, Option("--gpus", min=1, max=8, help="Number of GPUs for SFT.")] = 1,
+    resume: Annotated[bool, Option("--resume/--no-resume", help="Resume from the latest checkpoint (sets ckpt.resume_step=-1).")] = False,
 ) -> None:  # fmt: skip
     from prime_rl.configs.sft import SFTConfig

+    config_tomls = list(config or [])
+    if not config_tomls:
+        raise typer.BadParameter("Missing config path. Pass one or more --config values.", param_hint="--config")
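The `--config` help text says later files override earlier ones. A minimal sketch of that layering, assuming a recursive (table-by-table) merge over already-parsed TOML dictionaries: the helpers `deep_merge` and `layer_configs` are hypothetical, and the actual PRIME-RL merge semantics may differ.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where `override` wins; nested tables merge recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def layer_configs(configs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed configs in order, so later ones override earlier ones."""
    result: dict[str, Any] = {}
    for cfg in configs:
        result = deep_merge(result, cfg)
    return result


# Stand-ins for two parsed TOML files passed as `--config base.toml --config exp.toml`.
base = {"max_steps": 100, "wandb": {"project": "base-proj", "name": "run-a"}}
experiment = {"wandb": {"name": "run-b"}}

layered = layer_configs([base, experiment])
print(layered)
```

With a recursive merge, the second file only needs to list the keys it changes; a shallow merge would instead replace the whole `wandb` table.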
config_toml: Annotated[Path, Argument(metavar="CONFIG_TOML", help="Path to the PRIME-RL RL TOML.")],
     output_dir: Annotated[Path, Option("--output-dir", file_okay=False, dir_okay=True, help="Directory to write resolved configs and checkpoints.")],
+    config: Annotated[list[Path] | None, Option("--config", "--config-toml", help="One or more PRIME-RL RL TOMLs. Repeat `--config` to layer files with later files overriding earlier ones.")] = None,
     train_gpus: Annotated[int, Option("--train-gpus", min=1, max=4, help="Number of GPUs for training.")] = 1,
     infer_gpus: Annotated[int, Option("--infer-gpus", min=1, max=7, help="Number of GPUs for inference.")] = 1,
     single_gpu: Annotated[bool, Option("--single-gpu", help="Share a single GPU between trainer and inference.")] = False,
+    resume: Annotated[bool, Option("--resume/--no-resume", help="Resume from the latest checkpoint (sets ckpt.resume_step=-1).")] = False,
 ) -> None:  # fmt: skip
     from prime_rl.configs.rl import RLConfig

     from medarc_rl.launchers.rl_local import rl_local

+    config_tomls = list(config or [])
+    if not config_tomls:
+        raise typer.BadParameter("Missing config path. Pass one or more --config values.", param_hint="--config")
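The `--resume` help text states that it sets `ckpt.resume_step=-1`, and the README describes dotted passthrough overrides such as `--wandb.project my-proj`. A minimal sketch of applying dotted keys to a nested config dictionary follows; `set_dotted` and `parse_overrides` are hypothetical helpers, and the real launcher may hand overrides to PRIME-RL in a different way.

```python
from typing import Any


def set_dotted(config: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set config["a"]["b"] = value for dotted_key "a.b", creating tables as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


def parse_overrides(extra_args: list[str]) -> dict[str, str]:
    """Parse ["--wandb.project", "my-proj"] into {"wandb.project": "my-proj"}."""
    # The optional "--" separator carries no meaning here, so it is dropped.
    args = [arg for arg in extra_args if arg != "--"]
    if len(args) % 2 != 0:
        raise ValueError("expected '--key value' pairs")
    overrides: dict[str, str] = {}
    for flag, value in zip(args[::2], args[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"expected a --flag, got {flag!r}")
        overrides[flag[2:]] = value  # values stay strings in this sketch
    return overrides


config: dict[str, Any] = {"ckpt": {"interval": 50}}
for key, value in parse_overrides(["--", "--wandb.project", "my-proj"]).items():
    set_dotted(config, key, value)

resume = True
if resume:
    # Mirrors the --resume help text: -1 means "latest checkpoint".
    set_dotted(config, "ckpt.resume_step", -1)
print(config)
```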