@guyueh1 guyueh1 commented Oct 31, 2025

What does this PR do?

Adds a random dataset whose samples follow a specified input sequence length (ISL) and output sequence length (OSL), either fixed or drawn from a distribution.

Issues

closes #1302

Usage

Use the following flags for fixed ISL/OSL eval

uv run examples/run_eval_random_dataset.py \
+data.input_len_or_input_len_generator=1000 \
generation.ignore_eos=true \
generation.vllm_cfg.max_model_len=3000

Use the following flags for fixed ISL/OSL GRPO

uv run examples/run_grpo_random_dataset.py \
+data.input_len_or_input_len_generator=1000 \
policy.generation.ignore_eos=true \
policy.generation.output_len_or_output_len_generator=2000 
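
As a rough illustration of what a fixed-ISL random dataset sample could look like, the sketch below builds a prompt of exactly `input_len` random token IDs. The function name `random_prompt_token_ids` and the `vocab_size` value are hypothetical and not taken from this PR; the actual dataset implementation may differ.

```python
import random


def random_prompt_token_ids(input_len, vocab_size, seed=None):
    """Return `input_len` uniformly random token IDs in [0, vocab_size).

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if input_len <= 0:
        raise ValueError("input_len must be positive")
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [rng.randrange(vocab_size) for _ in range(input_len)]


# Matches the fixed ISL of 1000 used in the command above.
# vocab_size=32000 is an assumed placeholder value.
prompt = random_prompt_token_ids(input_len=1000, vocab_size=32000, seed=0)
print(len(prompt))
```

Because the content is random, only the length of the prompt is meaningful, which is what a throughput benchmark needs.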

Use the following flags for GRPO with random ISL/OSL, specified by a mean and a standard deviation

uv run examples/run_grpo_random_dataset.py \
grpo.val_at_start=false \
grpo.val_period=0 \
policy.max_total_sequence_length=8000 \
+data.input_len_or_input_len_generator.mean=1000 \
+data.input_len_or_input_len_generator.std=100 \
+policy.generation.output_len_or_output_len_generator.mean=2000 \
+policy.generation.output_len_or_output_len_generator.std=1000 \
policy.generation.ignore_eos=True
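
The `*_len_or_*_len_generator` options above accept either a single integer or a `mean`/`std` pair. A minimal sketch of how such a setting could be turned into a length sampler follows. It assumes a normal distribution and clamping to a maximum length; neither is confirmed by this PR, and `make_length_sampler` is a hypothetical name.

```python
import random


def make_length_sampler(spec, max_len, seed=None):
    """Build a zero-argument function that returns a sequence length.

    `spec` is either an int (fixed length) or a mapping with "mean" and
    "std" keys. Assumption: lengths are normally distributed and clamped
    to [1, max_len]; the PR itself may sample differently.
    """
    rng = random.Random(seed)
    if isinstance(spec, int):
        fixed = min(max(1, spec), max_len)
        return lambda: fixed

    mean, std = spec["mean"], spec["std"]

    def sample():
        # Round to a whole number of tokens, then clamp into range.
        return min(max(1, round(rng.gauss(mean, std))), max_len)

    return sample


# Values mirror the command above (max_total_sequence_length=8000).
max_total = 8000
sample_isl = make_length_sampler({"mean": 1000, "std": 100}, max_len=max_total - 1, seed=0)
isl = sample_isl()
# Leave room for the prompt when drawing the output length.
sample_osl = make_length_sampler({"mean": 2000, "std": 1000}, max_len=max_total - isl, seed=1)
osl = sample_osl()
print(isl, osl, isl + osl <= max_total)
```

Clamping matters here because a standard deviation of 1000 around a mean of 2000 can produce non-positive or very long draws.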

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

Release Notes

  • New Features

    • Added example scripts for GRPO training and evaluation workflows with support for random datasets
    • Introduced ignore_eos flag in generation configurations for flexible EOS token handling
    • Added output length configuration options for generation control
    • Implemented dummy environment for testing and evaluation scenarios
    • Added FP8 quantization support for MoE (Mixture of Experts) modules
  • Refactor

    • Extended generation configuration to support configurable sequence length generation
    • Enhanced model initialization with improved timing instrumentation for evaluation workflows
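
To see why `ignore_eos` is paired with a target output length in the usage examples, consider the toy decoding loop below. It is a simplified sketch, not the generation backend's code: `generate` and `toy_model` are hypothetical names, and the loop only illustrates that ignoring EOS lets generation run to the requested token count.

```python
def generate(next_token, max_new_tokens, eos_id, ignore_eos=False):
    """Toy greedy decoding loop.

    Stops at the first EOS token unless `ignore_eos` is set, in which
    case it always emits exactly `max_new_tokens` tokens.
    """
    out = []
    for step in range(max_new_tokens):
        tok = next_token(step)
        out.append(tok)
        if tok == eos_id and not ignore_eos:
            break
    return out


EOS = 0


def toy_model(step):
    # Stand-in model that emits EOS at step 3 and token 7 otherwise.
    return EOS if step == 3 else 7


short = generate(toy_model, max_new_tokens=10, eos_id=EOS)
full = generate(toy_model, max_new_tokens=10, eos_id=EOS, ignore_eos=True)
print(len(short), len(full))
```

With EOS honored, the output length depends on the model; with EOS ignored, it is fixed by the token budget, which keeps benchmark runs comparable.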

@guyueh1 guyueh1 requested review from a team as code owners October 31, 2025 20:04
@guyueh1 guyueh1 changed the title from "feat: Random dataset with specified input and output sequence length" to "feat: [draft do not merge] Random dataset with specified input and output sequence length" Oct 31, 2025

Development

Successfully merging this pull request may close these issues.

Synthetic rollout length for GRPO performance benchmarking
