SkyRL-Train: v0.3.0

@SumanthRH released this 03 Dec 17:07

Highlights

Asynchronous training: SkyRL now supports fully asynchronous training, enabling higher throughput for agentic RL. See the tutorial: https://skyrl.readthedocs.io/en/latest/tutorials/fully_async.html
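
For intuition, here is a minimal conceptual sketch of the fully-async pattern, in which generation and training overlap via a bounded trajectory queue. All names here are illustrative assumptions, not SkyRL's actual API; see the tutorial above for the real configuration.

```python
import asyncio

# Conceptual sketch only: generation and training run concurrently,
# coupled by a bounded queue. Names are illustrative, not SkyRL's API.
async def generator_loop(queue: asyncio.Queue, rollout_fn):
    while True:
        traj = await rollout_fn()  # sample with current (possibly stale) weights
        await queue.put(traj)      # blocks when full, bounding staleness

async def trainer_loop(queue: asyncio.Queue, train_step_fn, sync_weights_fn,
                       batch_size: int = 8):
    while True:
        batch = [await queue.get() for _ in range(batch_size)]
        train_step_fn(batch)       # one optimizer step on slightly off-policy data
        await sync_weights_fn()    # push fresh weights without pausing generation
```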

Dependency Upgrades:

  • Upgraded vLLM to 0.11.0 and Ray to 2.51.1
  • Megatron: Migrated from mbridge to the newer Megatron-Bridge library, which is expected to receive more active development and support from NVIDIA. Note that this is a breaking change (#453).

The updated installation instructions can be found here.
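
As a quick sanity check after installing, the upgraded versions can be verified from Python. A minimal sketch; the version strings are the ones stated above, and the linked instructions remain authoritative:

```python
# Sanity-check the upgraded dependencies from this release's notes.
import ray
import vllm

assert vllm.__version__ == "0.11.0", f"unexpected vLLM: {vllm.__version__}"
assert ray.__version__ == "2.51.1", f"unexpected Ray: {ray.__version__}"
print("vLLM and Ray match the v0.3.0 release pins")
```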

Recipes: We've consolidated a list of end-to-end SkyRL recipes here, with reference runs on math, Text2SQL, and search tasks.

SkyRL on Managed Platforms: Guides for running SkyRL on managed platforms such as Anyscale, Runpod, and SkyPilot can be found here.

Miscellaneous: Support for GPT-OSS, integration with PyTorch's OpenEnv, support for IPv6 clusters, and more!
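
On IPv6 support (#612): IPv6 literals must be bracketed when embedded in URLs, per RFC 3986. A minimal sketch of that rule, not the exact helper from the PR:

```python
import ipaddress

def tcp_url(host: str, port: int) -> str:
    # IPv6 literals must be wrapped in brackets inside URLs (RFC 3986),
    # e.g. tcp://[::1]:1234; IPv4 addresses and hostnames pass through.
    try:
        if isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address):
            return f"tcp://[{host}]:{port}"
    except ValueError:
        pass  # not an IP literal (e.g. a hostname)
    return f"tcp://{host}:{port}"

assert tcp_url("::1", 1234) == "tcp://[::1]:1234"
assert tcp_url("10.0.0.1", 1234) == "tcp://10.0.0.1:1234"
```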

What's Changed

  • [Examples][Step wise] Support thinking models like Qwen 3 by @SumanthRH in #468
  • Modal Integration by @benji-cannot-code in #444
  • [fix] abort all requests before sleep by @vutrung96 in #458
  • TerminalBenchGenerator: logprobs + session ID by @li-boxuan in #448
  • Divide-by-Zero when setting NUMA affinity patch by @matthambrecht in #457
  • [bug] run linter for t-bench generator by @erictang000 in #476
  • Bump vLLM version to 0.11.0 by @tyler-griggs in #481
  • [Sequence parallel][train] Support sequence parallelism without sample packing by @SumanthRH in #480
  • [fix] Resolve timeout and cleanup issues in GPU CI pipeline by @tyler-griggs in #483
  • Increase timeout for GPU CI by @tyler-griggs in #485
  • Skypilot: Update Doc by @lynnliu030 in #484
  • Fix GPU CI Test Failures: Migrating Tests, NCCL P2P Access Errors, and Test Fixture Issues by @devpatelio in #477
  • [Fix] Fix entropy calculation without sample packing by @SumanthRH in #490
  • Skypilot: Multi-Node Test by @lynnliu030 in #493
  • Support exporting environment-specific metrics by @vibha-ctrl in #386
  • Fix broken import by @tyler-griggs in #500
  • Revert "Bump vLLM version to 0.11.0" by @erictang000 in #501
  • Fix broken entropy metric by @tyler-griggs in #504
  • [fix] Resolve double ray.init() call by @tyler-griggs in #506
  • [lora] fix lora with vllm offline engine by @erictang000 in #513
  • Increase GPU CI Timeout to Pass All Tests by @devpatelio in #512
  • [train] Increase default timeout for placement groups to 180s by @SumanthRH in #525
  • [dependencies] fix some flash-rl dependency issues by @erictang000 in #530
  • Add implementation of CISPO loss by @vutrung96 in #523 (a sketch of the objective appears after this list)
  • [skyrl-train] assert that the policy loss type is regular/dual clip for tis by @erictang000 in #546
  • [Fix] Fix fsdp2_load_state_dict with HSDP by @SumanthRH in #554
  • [skyrl-train] update defaults for CISPO by @erictang000 in #553
  • [GPTOSS] Integrate Unsloth's flex attention implementation for attention sink by @SumanthRH in #515
  • [skyrl-train][logging] rename loss/avg_raw_rewards to loss/avg_final_rewards for clarity by @erictang000 in #544
  • [Integrations] Support PyTorch OpenEnv by @lynnliu030 in #543
  • [Docs] Fix image in OpenEnv doc by @SumanthRH in #562
  • Remove truncation logic, fix corresponding tests by @devpatelio in #508
  • [megatron][bug fix] reset dist checkpointing asynccallsqueue to allow freeing memory by @erictang000 in #565
  • [dependencies] separate vllm + megatron + bump vllm back to 0.11.0 + pin minimum uv version for extra-build-dependencies by @erictang000 in #528
  • [skyrl-train] Enable Inference Engine pipeline parallelism by @pandyamarut in #555
  • [fix] Broken method call in test by @tyler-griggs in #571
  • [AsyncRL][1/N] Add abort_generation to vllm engine and pause/continue generation to client by @CharlieFRuan in #537
  • Update README.md about SkyRL-v0 reproduction by @caoshiyi in #573
  • [AsyncRL][2/N] Implement /chat/completion with retry on aborted sub requests by @CharlieFRuan in #557
  • [train][Logging] Set loguru default to INFO, and customizable by LOG_LEVEL by @CharlieFRuan in #578
  • [skyrl-train][Fix] Fix epoch counter after resuming from checkpoint by @SumanthRH in #589
  • [skyrl-train] Enforce eager by default by @SumanthRH in #569
  • [skyrl-train][Fix] sleep only if colocated by @SumanthRH in #595
  • Fix: Megatron Autograd Warning for Broadcast Kernel by @devpatelio in #588
  • Comment by @devpatelio in #596
  • Comment update by @devpatelio in #597
  • Cleanup stray doc by @SumanthRH in #599
  • [skyrl-train] Make libnuma optional for training by @SumanthRH in #601
  • [skyrl-train][Examples] Support truncated importance sampling for StepWiseGenerator by @SumanthRH in #570
  • Add YaRN support for VLLM and HF by @sergeypastukhov-ddog in #561
  • [Docs] Refactor documentation for running SkyRL on managed platforms by @SumanthRH in #608
  • [train] Remove train_batch_size from fsdp/deepspeed strategy by @CharlieFRuan in #617
  • [skyrl-train] add option to specify ref model path by @erictang000 in #623
  • [skyrl-train] Add DAPO 7B recipe, and 32B training script by @erictang000 in #532
  • [skyrl-train][recipes] add dapo qwen3 1.7b and 4b scripts by @erictang000 in #625
  • Fix table formatting in DAPO README by @erictang000 in #631
  • [train][utils] Aggregate rollout metrics and validate output in concat GeneratorOutput by @CharlieFRuan in #620
  • [skyrl-train] Add example for on-policy distillation by @erictang000 in #585
  • Support IPv6 addresses in TCP URL construction by @mayavkrishnan25 in #612
  • [train][TBench] Cherrypick Terminus integration and use Harbor by @CharlieFRuan in #637
  • [megatron] Added non-CUDA-IPC weight sync to Megatron workers by @nikhilbarhate99 in #635
  • [docs] Add build instructions to README.md by @CharlieFRuan in #648
  • Fix in README.md by @nrghosh in #653
  • [skyrl-train][Fix] Fix FSDP1 module wrap policy for HFModelWrapper by @SumanthRH in #654
  • Return init_prompts in generate_batched by @ebronstein in #652
  • [Docs] Fix model placement docs by @SumanthRH in #663
  • [skyrl-train] Support older vllm versions till 0.9.2 by @SumanthRH in #671
  • [lora] enforce_eager=true slows down generation time dramatically with LoRA by @devpatelio in #665
  • Conditionally add the generation prompt to the multi-turn chat template by @ebronstein in #676
  • Add entropy loss by @pbokc in #622
  • [skyrl-train] Upgrade Ray to 2.51.1 by @SumanthRH in #633
  • [Docs] Add a recipes page consolidating all E2E recipes by @SumanthRH in #679
  • [skyrl-train][docs] Add commit for dapo to recipes and add megatron search-r1 results by @erictang000 in #689
  • [megatron] upgrade from mbridge -> Megatron-Bridge (breaking change) by @erictang000 in #453
  • [update] Updated RoPE Configuration for HF Models (transformers) w. backward-compatible support for vLLM by @devpatelio in #690
  • Revert "[skyrl-train] Updated RoPE Configuration for HF Models transformers) w. backward-compatible support for vLLM (#690)" by @SumanthRH in #695
  • [skyrl-train][megatron] Remove use of PYTHONPATH for getting around transformer-engine installation by @erictang000 in #697
  • [megatron] improving weight syncing - bucketed param gather + cuda ipc flattening by @erictang000 in #487
  • [megatron] separate offloading gradients from offloading params for megatron by @erictang000 in #563
  • Update trainer docstrings that values has shape batch_size x seqlen by @ebronstein in #687
  • [skyrl-train][step-wise] 1/N - Support step-wise training with step_wise_training flag by @SumanthRH in #694
  • Revert "[skyrl-train][step-wise] 1/N - Support step-wise training with step_wise_training flag" by @CharlieFRuan in #706
  • [AsyncRL][3/N] Support fully async training for any generator by @CharlieFRuan in #579
  • [AsyncRL][4/N] Support in-flight weight update for generate() by @CharlieFRuan in #656
  • [train][TBench][MiniSwe] Fix custom generator loss masking by @CharlieFRuan in #710
  • [skyrl-train] fix off-by-one in docs link for async by @erictang000 in #712
  • SkyRL-Agent Release Part 1 by @caoshiyi in #713
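
On the CISPO loss added in #523: CISPO clips the (stop-gradiented) importance-sampling weight inside a REINFORCE-style objective, rather than clipping the PPO surrogate, so every token retains a gradient through its log-probability. Below is a minimal sketch of that formulation; the function name, defaults, and masking are illustrative assumptions, not SkyRL's actual implementation:

```python
import torch

def cispo_loss(logprobs, old_logprobs, advantages,
               eps_low: float = 1.0, eps_high: float = 0.2, mask=None):
    # Importance-sampling ratio between the current and behavior policies.
    ratio = torch.exp(logprobs - old_logprobs)
    # CISPO clips the IS weight itself and stops its gradient, so the update
    # is clipped_weight * A * grad(logprob) for every token (no dead tokens).
    weight = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
    loss = -(weight * advantages * logprobs)
    if mask is not None:
        return (loss * mask).sum() / mask.sum().clamp(min=1.0)
    return loss.mean()
```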

New Contributors

Full Changelog: skyrl_train-v0.2.0...skyrl_train-v0.3.0