Commit 43f7d65
[multi-lora] Restore v1 sampling guards + add SEQ-vs-ALT min repro test
Two fixes:

1. Restore the v1 single-tenant sampling guards in skyrl_train_backend.py
   that the merge from origin/main accidentally dropped:
   - sample() returns ErrorResponse when LoRA is active and >1 adapter
     is registered.
   - save_sampler_checkpoint raises ValueError under the same condition.
   Multi-tenant inference is the RL follow-up (NovaSky-AI#1621); SFT v1 must
   refuse it explicitly rather than silently corrupting state.
   test_sample_with_two_adapters_errors had been passing in earlier runs
   only by accident; restore the actual guarantee.

2. Add test_seq_vs_alt_per_adapter_step_isolation: min repro of the
   SEQ-vs-ALT divergence flagged in ~/skyrl-seq-vs-alt-repro (against
   Qwen3-4B + PPO on a real pod). Two fresh adapters, ALT-style
   sequence, identical data, asserts pre-update losses match within
   1e-2 at every step. With AdapterStore snapshotting state['step'] per
   slot, this passes on the tiny model: step 0 is bit-exact, step 1
   diverges by 1.7e-4 (three orders of magnitude below the user's
   Qwen3-4B observation). If a future change leaks a global step
   counter across adapters, this test will fail loudly; the assertion
   message points at the SEQ-vs-ALT diagnosis.
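
For readers without the diff at hand, the guards from fix 1 can be sketched roughly as below. This is a minimal illustration, not the code in skyrl_train_backend.py: the class `SingleTenantBackend`, its `lora_active` and `adapters` fields, the stand-in `ErrorResponse` dataclass, and the error strings are all hypothetical names. Only the behaviour (an error response from sample(), a ValueError from save_sampler_checkpoint, both when LoRA is active and more than one adapter is registered) comes from the message above.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorResponse:
    """Hypothetical stand-in for the backend's error type."""
    error: str


@dataclass
class SingleTenantBackend:
    """Toy backend; field names are assumptions, not the real attributes."""
    lora_active: bool = False
    adapters: dict = field(default_factory=dict)

    def _multi_adapter(self) -> bool:
        # The condition both guards share: LoRA on and >1 adapter registered.
        return self.lora_active and len(self.adapters) > 1

    def sample(self, prompt: str):
        if self._multi_adapter():
            # Refuse explicitly instead of sampling from ambiguous adapter state.
            return ErrorResponse(error="sampling with >1 LoRA adapter is unsupported")
        return f"sampled:{prompt}"

    def save_sampler_checkpoint(self, path: str) -> str:
        if self._multi_adapter():
            raise ValueError("sampler checkpoint with >1 LoRA adapter is unsupported")
        return path
```

With one adapter registered both calls go through; registering a second one flips sample() to an ErrorResponse and makes save_sampler_checkpoint raise.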
Local: 5/5 pass in ~2m on 1x B200.
1 parent 76dc375 commit 43f7d65
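
Why a leaked global step counter would show up as a SEQ-vs-ALT divergence can be illustrated with a scalar toy model. The sketch below is not the test added in this commit and does not use AdapterStore; it assumes an Adam-style optimizer whose bias correction depends on `state["step"]`, and the `shared_step` flag is a hypothetical switch that models the leak. All function names here are made up for illustration.

```python
import math


def adam_step(w, g, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One scalar Adam update; `state` holds m, v and the step counter."""
    state["step"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["v"] = b2 * state["v"] + (1 - b2) * g * g
    m_hat = state["m"] / (1 - b1 ** state["step"])  # bias correction uses step
    v_hat = state["v"] / (1 - b2 ** state["step"])
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)


def fresh_state():
    return {"step": 0, "m": 0.0, "v": 0.0}


def loss_and_grad(w, target=1.0):
    return (w - target) ** 2, 2 * (w - target)


def train(order, shared_step):
    """Train adapters in `order`; return each adapter's pre-update losses."""
    weights = {"A": 0.0, "B": 0.0}
    states = {"A": fresh_state(), "B": fresh_state()}
    losses = {"A": [], "B": []}
    global_step = 0
    for name in order:
        state = states[name]
        if shared_step:
            # Modelled bug: overwrite the per-slot step with a global counter.
            state["step"] = global_step
        loss, grad = loss_and_grad(weights[name])
        losses[name].append(loss)
        weights[name] = adam_step(weights[name], grad, state)
        global_step += 1
    return losses


SEQ = ["A", "A", "A", "B", "B", "B"]  # finish A, then train B
ALT = ["A", "B", "A", "B", "A", "B"]  # interleave A and B
```

In this toy, `train(SEQ, shared_step=False)` and `train(ALT, shared_step=False)` return identical losses because each adapter only ever sees its own step count. With `shared_step=True`, adapter B's first update uses a different bias correction under SEQ than under ALT, so its losses drift apart from the second step onward.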
2 files changed
Lines changed: 78 additions & 13 deletions