Commit f9cf81b
[feat][inf] Multi-LoRA serving for RemoteInferenceClient (NovaSky-AI#1579)
## Summary
Adds dynamic multi-LoRA serving to the new inference path:
`RemoteInferenceClient` can now load/unload arbitrary LoRA adapters at
runtime, route every data-plane call to a specific adapter (or the base
model) via an explicit `model` parameter, and serve from multiple
adapters concurrently.
## Key design decision: `model` is always explicit
Every data-plane method requires the caller to identify the target model
— there is no implicit fallback to a client-side default.
| Method | How `model` is supplied |
|---|---|
| `generate(input_batch, model)` | Required keyword argument |
| `sample(payload)`, `chat_completion(payload)`, `completion(payload)`, `render_chat_completion(payload)` | Required field in the request body; missing/empty raises `ValueError` |
Callers know (or can resolve) which adapter they are addressing — there
is no client-side guessing about which LoRA is "active".
`RemoteInferenceClient.model_name` is now *only* the base model the vLLM
server was started with; it is consumed internally by
`tokenize`/`detokenize` (LoRA-agnostic) and is no longer auto-injected
into data-plane requests.
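The contract above can be illustrated with a minimal sketch. It uses a hypothetical stand-in class (`SketchClient`) rather than the real `RemoteInferenceClient`; the helper name `_require_model`, the synchronous methods, the return values, and the choice to make `model` keyword-only are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def _require_model(payload: Dict[str, Any]) -> str:
    """Return payload["model"], raising ValueError if it is missing or empty."""
    model = payload.get("model")
    if not model:
        raise ValueError("request body must include a non-empty 'model' field")
    return model


class SketchClient:
    """Hypothetical stand-in; not the real RemoteInferenceClient."""

    def __init__(self, model_name: str) -> None:
        # Base model only; never injected into data-plane requests.
        self.model_name = model_name

    def generate(self, input_batch: List[str], *, model: str) -> Dict[str, Any]:
        # `model` has no default, so omitting it raises TypeError at call time.
        return {"model": model, "num_prompts": len(input_batch)}

    def chat_completion(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Body-style methods read the target from the payload itself.
        return {"model": _require_model(payload)}
```

Note that `SketchClient.model_name` is never consulted by either method, mirroring the statement that the base model name is no longer auto-injected.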
## What's new
- **Control plane** on `RemoteInferenceClient`:
  - `load_lora_adapter(name, path, load_inplace=False)`: fans out to all backend servers via `POST /v1/load_lora_adapter`.
  - `unload_lora_adapter(name)`: symmetric `POST /v1/unload_lora_adapter`.
- **Config knobs** plumbed to vLLM:
  - `trainer.policy.model.lora.max_loras` (concurrent adapters per batch)
  - `trainer.policy.model.lora.max_cpu_loras` (CPU LRU cache size)
- **`SKYRL_LORA_ADAPTER_NAME`** promoted to a public module-level constant; FSDP and Megatron (unmerged) workers register their trained adapter under this name.
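The control-plane fan-out can be sketched as building one request per backend server. This is only an illustration: the function names are hypothetical, the JSON field names (`lora_name`, `lora_path`, `load_inplace`) are assumptions modeled on vLLM's dynamic LoRA endpoints, and the actual HTTP `POST` is left out so the sketch stays self-contained.

```python
from typing import Any, Dict, List, Tuple

Request = Tuple[str, Dict[str, Any]]  # (url, json_body)


def build_load_requests(
    server_urls: List[str], name: str, path: str, load_inplace: bool = False
) -> List[Request]:
    """Build one load request per backend server (fan-out)."""
    # Field names are assumed, not taken from the commit.
    body = {"lora_name": name, "lora_path": path, "load_inplace": load_inplace}
    return [
        (f"{url.rstrip('/')}/v1/load_lora_adapter", dict(body))
        for url in server_urls
    ]


def build_unload_requests(server_urls: List[str], name: str) -> List[Request]:
    """Build one unload request per backend server (symmetric to load)."""
    body = {"lora_name": name}
    return [
        (f"{url.rstrip('/')}/v1/unload_lora_adapter", dict(body))
        for url in server_urls
    ]
```

An HTTP client would then `POST` each pair, and the caller would decide how to surface a failure on any single server.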
## Single source of truth for "what does the policy resolve to?"
Added `resolve_policy_model_name(cfg)` in
`skyrl/backends/skyrl_train/inference_servers/utils.py`. It returns:
- `SKYRL_LORA_ADAPTER_NAME` when the worker registers a LoRA adapter on
the inference engine (FSDP + LoRA, or Megatron + LoRA with
`merge_lora=False`).
- `cfg.trainer.policy.model.path` otherwise (including Megatron + LoRA
with `merge_lora=True`, where the engine receives merged base weights).
This is called **once at wiring time** and threaded through:
- `skyrl_train_backend._sample_with_remote_client`
- `SkyRLGymGenerator` / `SkyRLVLMGymGenerator` (new required
`policy_model_name: str` constructor arg)
- `main_base.get_generator`
- `SkyRLBackend` in `skyrl-agent`
The `is_single_lora` ternary that previously decided `client.model_name`
at construction is gone from both `skyrl_train_backend` and `main_base`.
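The resolution rule described in this section can be sketched as follows. The flattened `PolicySketchConfig` and its fields (`strategy`, `lora_enabled`, `merge_lora`) are hypothetical stand-ins for the real nested `cfg`, and the constant's value here is a placeholder, since the commit message does not show the actual value.

```python
from dataclasses import dataclass

# Placeholder value; the real constant's value is not shown in the commit message.
SKYRL_LORA_ADAPTER_NAME = "placeholder-lora-adapter"


@dataclass
class PolicySketchConfig:
    """Hypothetical flattened config; the real cfg is nested under cfg.trainer."""

    model_path: str
    strategy: str  # "fsdp" or "megatron"
    lora_enabled: bool
    merge_lora: bool = False  # only meaningful for Megatron in this sketch


def resolve_policy_model_name_sketch(cfg: PolicySketchConfig) -> str:
    """Return the name data-plane calls should pass as `model`."""
    if not cfg.lora_enabled:
        return cfg.model_path
    if cfg.strategy == "megatron" and cfg.merge_lora:
        # The engine receives merged base weights, so address the base model.
        return cfg.model_path
    # FSDP + LoRA, or Megatron + LoRA without merging: adapter is registered.
    return SKYRL_LORA_ADAPTER_NAME
```

Resolving the name once and passing the resulting string around keeps generators and backends from re-deriving it from the config independently.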
## Breaking change: legacy inference path
The legacy `InferenceEngineClient` (Ray-wrapped) path is **not updated**
for multi-LoRA. We assume `_SKYRL_USE_NEW_INFERENCE=1` (the current
default) is in use everywhere. The legacy `LoraLoadRequest` smuggling
path still works for the single-LoRA legacy flow, but anything calling
the legacy client's `generate`/`sample`/etc. without `model=` plumbing
will need updating before that path can serve LoRAs.
## Tests
- **Mock tests** (`tests/backends/skyrl_train/inference_servers/test_remote_inference_client.py`):
  - `TestLoRAControlPlane`: fan-out, in-place reload, conflict on `load_inplace=False`, fan-out unload, 404 on unknown unload.
  - `TestExplicitModelRequired`: every body-style method raises `ValueError` when `model` is missing; `generate` raises `TypeError` without the kwarg.
  - All existing + new mock tests pass.
- **Real-GPU multi-LoRA tests** (`tests/backends/skyrl_train/gpu/gpu_ci/inference_servers/test_multi_lora_serving.py`):
  - `test_multi_lora_interleaved_generation`: load `lora-meow` + `lora-woof`, interleave per-call routing, verify each adapter's signature output.
  - `test_lora_inplace_reload_isolated`: in-place reload of one adapter must not perturb the other.
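The shape of the two multi-LoRA checks can be sketched against an in-memory fake. `FakeMultiLoRAClient` is entirely hypothetical: it tags each output with the adapter's path as a stand-in for a "signature", and it raises `RuntimeError` where the real server would report a conflict. It says nothing about how the actual GPU tests are written.

```python
from typing import Dict, List


class FakeMultiLoRAClient:
    """Hypothetical in-memory stand-in for a multi-LoRA inference server."""

    def __init__(self) -> None:
        self._adapters: Dict[str, str] = {}  # adapter name -> path

    def load_lora_adapter(self, name: str, path: str, load_inplace: bool = False) -> None:
        if name in self._adapters and not load_inplace:
            raise RuntimeError(f"adapter {name!r} is already loaded")
        self._adapters[name] = path

    def generate(self, input_batch: List[str], *, model: str) -> List[str]:
        # Prefix each prompt with the adapter path to act as its signature.
        path = self._adapters[model]
        return [f"{path}:{prompt}" for prompt in input_batch]


def check_interleaved_routing() -> None:
    client = FakeMultiLoRAClient()
    client.load_lora_adapter("lora-meow", "/adapters/meow")
    client.load_lora_adapter("lora-woof", "/adapters/woof")
    # Alternate targets call by call; each result must carry its own signature.
    for model, path in [("lora-meow", "/adapters/meow"), ("lora-woof", "/adapters/woof")] * 2:
        assert client.generate(["hi"], model=model) == [f"{path}:hi"]


def check_inplace_reload_isolated() -> None:
    client = FakeMultiLoRAClient()
    client.load_lora_adapter("lora-meow", "/adapters/meow")
    client.load_lora_adapter("lora-woof", "/adapters/woof")
    before = client.generate(["hi"], model="lora-woof")
    client.load_lora_adapter("lora-meow", "/adapters/meow-v2", load_inplace=True)
    # The reloaded adapter changes; the untouched adapter does not.
    assert client.generate(["hi"], model="lora-meow") == ["/adapters/meow-v2:hi"]
    assert client.generate(["hi"], model="lora-woof") == before
```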
---------
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: hao-aaron <ahao@anyscale.com>
1 parent a502e48 commit f9cf81b
19 files changed
Lines changed: 883 additions & 86 deletions
File tree
- skyrl-agent/skyrl_agent/integrations/skyrl_train
- skyrl
  - backends
    - skyrl_train
      - inference_servers
      - workers
        - fsdp
        - megatron
  - train
    - config
    - entrypoints
    - generators
- tests
  - backends/skyrl_train
    - gpu
      - gpu_ci
        - inference_servers
    - inference_servers
  - train/generators
Lines changed: 10 additions & 2 deletions
Lines changed: 116 additions & 39 deletions