Reproduction
The exact same script, run with DeepSpeed ZeRO-2 vs. plain Accelerate, leads to a measurably worse reward. The ZeRO-2 run uses the following Accelerate config:
```yaml
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: 'bf16'
num_machines: 1
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
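The training script itself is not included in the report. As a stand-in, here is a minimal sketch in the spirit of the TRL GRPO quickstart; the model, dataset, reward function, and the `zero2.yaml` filename are placeholder assumptions, not the reporter's actual setup. The comparison is: launch the same script once with the ZeRO-2 config above and once with a plain (non-DeepSpeed) Accelerate launch, then compare the reward curves.

```python
# train_grpo.py -- hypothetical stand-in for the reporter's script.
# Launched once with and once without the ZeRO-2 config above, e.g.:
#   accelerate launch --config_file zero2.yaml train_grpo.py
#   accelerate launch --num_processes 8 train_grpo.py
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Toy dataset and reward function; the real ones are not shown in the report.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_num_unique_chars(completions, **kwargs):
    # Reward each completion by its number of distinct characters.
    return [len(set(c)) for c in completions]

training_args = GRPOConfig(output_dir="Qwen2-0.5B-GRPO", bf16=True)
trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",
    reward_funcs=reward_num_unique_chars,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```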
System Info
- Platform: Linux-6.8.0-85-generic-x86_64-with-glibc2.35
- Python version: 3.12.9
- TRL version: 0.25.1
- PyTorch version: 2.8.0
- accelerator(s): NVIDIA H200, NVIDIA H200, NVIDIA H200, NVIDIA H200, NVIDIA H200, NVIDIA H200, NVIDIA H200, NVIDIA H200
- Transformers version: 4.57.1
- Accelerate version: 1.12.0
- Accelerate config: not found
- Datasets version: 4.4.1
- HF Hub version: 0.36.0
- bitsandbytes version: 0.48.2
- DeepSpeed version: 0.18.2
- Liger-Kernel version: 0.6.4
- LLM-Blender version: not installed
- OpenAI version: 2.8.1
- PEFT version: 0.18.0
- vLLM version: 0.11.0
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks (no screenshots; more on code blocks)
- Any traceback provided is complete