Skip to content

Latest commit

 

History

History
19 lines (11 loc) · 602 Bytes

File metadata and controls

19 lines (11 loc) · 602 Bytes

Fully Async Training Example

Fully asynchronous (PipelineRL / AReal style) GRPO for Qwen2.5-1.5B-Instruct on GSM8K.

Usage

# prepare the dataset
uv run -- python examples/train/gsm8k/gsm8k_dataset.py --output_dir $HOME/data/gsm8k

export WANDB_API_KEY=<your_key_here>

bash examples/train/fully_async/fully_async_run_gsm8k.sh

For more details, refer to the documentation: https://docs.skyrl.ai/docs/tutorials/fully_async

Especially, refer to the section on what knobs to tune: https://docs.skyrl.ai/docs/tutorials/fully_async#step-2-config-knobs-to-tune-for-fully-async-training