## Highlights
New unified package: This is the first release of the unified `skyrl` package, combining the `skyrl-train` and `skyrl-tx` packages. The unified package brings together the FSDP, Megatron, and JAX backends under the Tinker API, while still retaining the user-facing "frontend" interfaces (e.g., `BasePPOTrainer`, `SkyRLGymGenerator`) from the `skyrl-train` and `skyrl-tx` packages. For details about the migration, please refer to #1145.
Improved API documentation: We've revamped the API documentation pages for SkyRL. The new API documentation pages can be found here: https://docs.skyrl.ai/api-ref
Pythonic Configs: The `skyrl-train` backend has now fully migrated to pythonic dataclasses, replacing the older YAML-based interface. The configuration hierarchy has also been updated, and the CLI no longer relies on Hydra. Please refer to the documentation for the new configuration hierarchy: https://docs.skyrl.ai/docs/api-ref/skyrl/config
SGLang is no longer supported: SkyRL no longer supports the SGLang inference engine, unifying on vLLM.
vLLM 0.16.0 upgrade: This release updates vLLM to 0.16.0.
Qwen 3.5 experimental support: SkyRL now has experimental support for Qwen 3.5 models. This is currently limited to the JAX backend.
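To illustrate the shape of the pythonic config migration, here is a minimal, self-contained sketch. The `AlgorithmConfig` and `TrainerConfig` names, fields, and defaults below are hypothetical stand-ins, not SkyRL's actual classes; see https://docs.skyrl.ai/docs/api-ref/skyrl/config for the real hierarchy.

```python
from dataclasses import dataclass, field, replace

# Hypothetical sketch only: these dataclasses mimic the style of the new
# pythonic config hierarchy; they are NOT SkyRL's actual config classes.

@dataclass
class AlgorithmConfig:
    max_seq_len: int = 4096  # illustrative default, not SkyRL's

@dataclass
class TrainerConfig:
    algorithm: AlgorithmConfig = field(default_factory=AlgorithmConfig)
    train_batch_size: int = 128  # illustrative default, not SkyRL's

# Configs are constructed and overridden directly in Python, rather than
# through YAML files and Hydra CLI overrides:
cfg = TrainerConfig()
cfg = replace(cfg, algorithm=replace(cfg.algorithm, max_seq_len=8192))
print(cfg.algorithm.max_seq_len)  # prints 8192
```

Because overrides are plain dataclass operations, typos in field names fail immediately with a `TypeError` instead of silently creating an unused YAML key.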
## What's Changed
- [tx] Load test weights from HF cache instead of save_pretrained by @raulchen in #1095
- [tx] Fix stacked sharding for flax >= 0.12.4 by @pcmoritz in #1120
- [tx] Stack weights — Llama3 by @raulchen in #1081
- [Harbor] Rename TerminalBenchGenerator -> HarborGenerator by @CharlieFRuan in #1122
- [tx] Stack weights — DeepSeek by @raulchen in #1082
- [Harbor][Docs] Add Harbor docs and make data preparation streamlined by @CharlieFRuan in #1124
- [tx] Per-layer gradient checkpointing by @raulchen in #1083
- [tx] Make apply_lora work with flattened states by @tanmaysachan in #1111
- Port #1079 to skyrl folder by @pcmoritz in #1127
- Fix linting on main branch by @pcmoritz in #1128
- Port #1095 to skyrl folder by @pcmoritz in #1129
- Port #1120 to skyrl folder by @pcmoritz in #1130
- [migration] copy old docs, examples, integrations, scripts by @erictang000 in #1133
- [1/N] Remove `skyrl` folder on main by @erictang000 in #1138
- [2/N] Add back skyrl code to top level and delete skyrl-train and skyrl-tx to get git history by @erictang000 in #1139
- [3/N] Add back skyrl-train and skyrl-tx code by @erictang000 in #1140
- [4/N] Fix CI paths for new `skyrl` code now that we have removed one level of nesting by @erictang000 in #1141
- fix linting error in skyrl-cpu CI (ignore agent and examples) and change CI scope by @erictang000 in #1143
- Update README with repo re-organization notice by @CharlieFRuan in #1146
- Update README about reorganizing by @CharlieFRuan in #1147
- [Reorg] Update examples/train scripts to use new skyrl package by @CharlieFRuan in #1148
- [Harbor] Make the data preparation scripts simpler and use `examples/train` as main entrypoint by @CharlieFRuan in #1150
- [Harbor] Move examples/train/harbor to examples/train_integrations/harbor by @CharlieFRuan in #1151
- Redirect to new Harbor integration documentation by @CharlieFRuan in #1152
- [train][logs] Separate infrastructure logs from training progress by @CharlieFRuan in #1088
- [train][Config] Make `trainer.algorithm.max_seq_len` configurable, for seq_mean_token_sum_norm by @CharlieFRuan in #1153
- [Harbor] Add overlong filtering support and add Dr. GRPO params to script by @CharlieFRuan in #1157
- [Harbor] Expose configs override_timeout_sec and env-specific auto stop by @CharlieFRuan in #1158
- [tx] Pass through the loss function config by @pcmoritz in #1155
- Port #1081 to skyrl folder by @pcmoritz in #1131
- Port #1082 to skyrl folder by @pcmoritz in #1161
- Port #1083 to skyrl folder by @pcmoritz in #1162
- Port #1111 to skyrl folder by @pcmoritz in #1164
- [tx] Fix loss function config keys and add validation by @pcmoritz in #1159
- Add CISPO loss function to skyrl-tx by @tamoghnokandar in #1144
- [Harbor] Move rate limit out of TrialConfig to +generator.rate_limit by @CharlieFRuan in #1165
- [Harbor][Doc] Update Harbor docs by @CharlieFRuan in #1163
- [Docs] Small fix on doc linking by @CharlieFRuan in #1166
- Port #1155 to skyrl folder by @pcmoritz in #1167
- Port #1159 to skyrl folder by @pcmoritz in #1168
- Port #1144 to skyrl folder by @pcmoritz in #1169
- Update readme for blogposts by @CharlieFRuan in #1171
- Finish skyrl-tx migration to skyrl folder by @pcmoritz in #1170
- [tx] Add metrics to optim_step by @pcmoritz in #1142
- [train] Support logprobs, fix generation config defaults and add more generation tests for the new HTTP inference pathway by @SumanthRH in #1038
- [Harbor] Bump harbor version by @CharlieFRuan in #1174
- [ci][megatron] fix megatron CI by including numpy in override-dependencies by @erictang000 in #1175
- [ci] Migrate nightly e2e CI tests to run against `skyrl` directory by @erictang000 in #1176
- [deps] remove numpy from override dependencies by @erictang000 in #1178
- [docs] Add API reference to new docs site and cleanup old docs by @SumanthRH in #1177
- [CI][Megatron] Use `ray_init_fixture` in `test_megatron_policy_weight_sync` by @SumanthRH in #1183
- [Docs] Fix Runpod example to use the new `skyrl` package by @SumanthRH in #1185
- [tests] Fix template path in `test_inference_engine_http_endpoint` by @SumanthRH in #1186
- [tests] Fix remote server startup command and sleep level for engine generation tests by @SumanthRH in #1190
- [tx] Introduce optimization step metrics dataclass by @pcmoritz in #1191
- [Docs] Fix API reference docs for SkyRLTrain backend by @SumanthRH in #1192
- [Harbor] Replace asyncio.gather with TaskGroup for proper cancellation by @CharlieFRuan in #1193
- [tx] General implementation of trainable Hyper Connections by @tanmaysachan in #1008
- Port #1191 to skyrl folder by @pcmoritz in #1197
- [train] Add explicit `finish` calls for tracker when training ends by @SumanthRH in #1198
- [train][CI] Add regression thresholds for E2E CI runs by @SumanthRH in #1199
- [train][tests] Fix hanging test `test_abort_generation_vllm_engine` by @SumanthRH in #1202
- [tests][train] Fix repeated arg in `test_lora` and cleanup logic in `test_skyrl_gym_generator` by @SumanthRH in #1209
- [train] Pythonic Configs 2/N: Switch to new dataclasses in entrypoint scripts; Change instantiation signatures by @SumanthRH in #1187
- [tx] Port #1008 to skyrl folder by @pcmoritz in #1217
- [tx] Update README.md by @pcmoritz in #1223
- [tx] Fix checkpoint loading for MoE models by @pcmoritz in #1224
- [Docs] Migrate to Griffe2md for fumadocs-compatible API reference by @SumanthRH in #1225
- [tx] Fix engine command on TPUs by @pcmoritz in #1181
- [train] Remove deprecated SGLang backend from `skyrl/` by @SumanthRH in #1220
- Deepcopy env_extras dicts for each sample when preparing generator input by @ebronstein in #1189
- [tx] Update README.md and move tx over to skyrl/ folder by @pcmoritz in #1226
- [CI] Fix failing docs build; Fix GPU OOM for new inference codepath; Improve post-processing in Search env by @SumanthRH in #1221
- [Docs] Add source code blocks for methods by @SumanthRH in #1229
- [train] Fix issue with unset `pad_token_id` by @SumanthRH in #1232
- [train][tests] Add tests for `RemoteInferenceClient` by @SumanthRH in #1211
- [agent] Migrate `skyrl-agent` to use the new `skyrl` package by @SumanthRH in #1235
- [agent][Fix] Fix `SkyRLAgentPPOTrainer` after switch to `async` by @SumanthRH in #1237
- [tx] Fix tied embedding closure for TPUs by @pcmoritz in #1233
- Add Daytona to acknowledgements in README by @CharlieFRuan in #1239
- Fix Megatron backend correctness: grad_scale_func, PP seed variation, weight sync pause by @tyler-griggs in #1212
- Add configurable MoE runtime flags to MegatronConfig by @tyler-griggs in #1213
- Register GLM-4.7-Flash bridge and bump megatron-bridge by @tyler-griggs in #1214
- Bump vLLM to 0.16.0 with required dep updates by @tyler-griggs in #1240
- Add GLM-4.7-Flash (30B MoE) training example by @tyler-griggs in #1215
- [train] Cleanup docs references to `skyrl-train` by @SumanthRH in #1236
- [train][examples] Fix 8 broken example scripts from skyrl-train migration by @CharlieFRuan in #1230
- [train][docs] Fix stale test docstrings and remaining docs references from skyrl-train migration by @CharlieFRuan in #1246
- [deps] bump megatron core to 0.16.0 by @erictang000 in #1247
- [train] Fix maximum context length handling and generation tests after 0.16.0 upgrade by @SumanthRH in #1248
- [train][fullyAsync] Fix abort/pause broken after vllm 0.16.0 bump by @CharlieFRuan in #1250
- [train][docs] Remove old docs in `skyrl/train/old_docs` by @SumanthRH in #1253
- [train] Remove deprecated `skyrl-train` package by @SumanthRH in #1249
- [tx] Implement Qwen 3.5 model architecture by @pcmoritz in #1228
- [train][fix] Temporarily rename `/update_weights` endpoint to prevent conflict with native vllm endpoint by @SumanthRH in #1265
- [megatron] refactor megatron param/grad offload to directly use new megatron built in functions by @erictang000 in #1266
- [megatron] replace deprecated megatron checkpointing default, set "distrib_optim_sharding_type" to "dp_reshardable" by @erictang000 in #1268
- [docs] Fix `examples/` paths in the docs after `skyrl-train` -> `skyrl` migration by @SumanthRH in #1269
- [train] Fix `There is no current event loop in thread 'MainThread'.` by @SumanthRH in #1270
- [train] Add per-token hard masking for off-policy correction by @tyler-griggs in #1264
- [tx] replace calls to jax.process_index() to resolve rank ordering issue with multi-host TPUs by @andrewsykim in #1252
- [ci] remove incorrect key check in megatron ci test by @erictang000 in #1275
- [train] Make env vars file common by @SumanthRH in #1276
- [tinker] Propagate all `uv run` args from server startup command to backend workers by @SumanthRH in #1255
- [tx] Implement chunked delta rule for DeltaNet by @pcmoritz in #1272
## New Contributors
- @tamoghnokandar made their first contribution in #1144
- @andrewsykim made their first contribution in #1252
Full Changelog: skyrl_train-v0.4.0...skyrl-v0.1.0