Add stop string support to rl. #3263

Merged
copybara-service[bot] merged 1 commit into main from nicogrande/add-stop-strings
Mar 5, 2026

@NicoGrande
Collaborator

Description

Introduces support for stop strings, which tell vLLM to stop generating as soon as any of the strings listed in this field is generated. This uses the new rollout_vllm_sampler_kwargs argument introduced here: google/tunix#1169
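
For reviewers unfamiliar with stop strings, here is a minimal sketch of the behavior described above. It is not the code from this PR and does not call vLLM; `truncate_at_stop` is a hypothetical helper, and the `"stop"` key inside the kwargs dict is an assumption modeled on vLLM's `SamplingParams(stop=...)` option. It only illustrates the semantics: the completion ends at the earliest match of any listed string, and the matched string is left out of the output, which is vLLM's default as far as I know.

```python
from typing import Optional, Sequence, Tuple


def truncate_at_stop(text: str, stop: Sequence[str]) -> Tuple[str, Optional[str]]:
    """Cut `text` at the earliest occurrence of any stop string.

    Returns the kept text (stop string excluded) and the stop string that
    matched, or None if no stop string occurs in `text`.
    """
    best_idx = len(text)
    matched = None
    for s in stop:
        if not s:
            continue  # ignore empty stop strings
        idx = text.find(s)
        if idx != -1 and idx < best_idx:
            best_idx = idx
            matched = s
    return text[:best_idx], matched


# Hypothetical sampler kwargs; the "stop" key is an assumption for illustration.
sampler_kwargs = {"stop": ["</answer>", "\n\nUser:"]}

completion = "<answer>42</answer> and then some extra text"
kept, hit = truncate_at_stop(completion, sampler_kwargs["stop"])
print(repr(kept), repr(hit))
```

In a real rollout the engine stops decoding at that point rather than trimming afterwards, so the practical benefit is fewer wasted generation steps per sample.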

Tests

Command:

NEW_MODEL_DESIGN=1 TPU_BACKEND_TYPE=jax python3 -m src.maxtext.trainers.post_train.rl.train_rl src/maxtext/configs/post_train/rl.yml \
  model_name=gemma3-4b \
  tokenizer_path=google/gemma-3-4b-it \
  run_name=$WORKLOAD \
  base_output_directory=$OUTPUT_PATH \
  hf_access_token=$HF_TOKEN \
  batch_size=4 \
  num_batches=5 \
  scan_layers=False \
  hbm_utilization_vllm=0.4 \
  rollout_data_parallelism=2 \
  rollout_tensor_parallelism=2 \
  allow_split_physical_axes=true \
  load_parameters_path=$CHECKPOINT_PATH \
  vllm_hf_overrides='{architectures: ["MaxTextForCausalLM"]}' \
  vllm_additional_config='{"maxtext_config": {"model_name": "gemma3-4b", "log_config": "false"}}' 2>&1 | tee  grpo_out_gemma3.txt

Logs

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@NicoGrande force-pushed the nicogrande/add-stop-strings branch 2 times, most recently from 38cf343 to 8498485 on February 27, 2026 01:44
@codecov

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@NicoGrande force-pushed the nicogrande/add-stop-strings branch from 1fe03cb to 841ee9f on March 4, 2026 18:54
@copybara-service bot merged commit 669d702 into main on Mar 5, 2026
42 checks passed
@copybara-service bot deleted the nicogrande/add-stop-strings branch on March 5, 2026 19:04

3 participants