Commit dae06ce

update readme
1 parent ff5cf31 commit dae06ce

2 files changed: 13 additions & 7 deletions

README.md

Lines changed: 7 additions & 4 deletions
```diff
@@ -3,6 +3,8 @@
 [![GitHub Repo stars](https://img.shields.io/github/stars/hiyouga/EasyR1)](https://github.com/hiyouga/EasyR1/stargazers)
 [![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
 
+### Used by [Amazon Web Services](https://aws.amazon.com/cn/blogs/china/building-llm-model-hub-based-on-llamafactory-and-easyr1/)
+
 This project is a clean fork of the original [veRL](https://github.com/volcengine/verl) project to support vision language models, we thank all the authors for providing such a high-performance RL training framework.
 
 EasyR1 is efficient and scalable due to the design of **[HybirdEngine](https://arxiv.org/abs/2409.19256)** and the latest release of **[vLLM](https://github.com/vllm-project/vllm)**'s SPMD mode.
@@ -16,6 +18,7 @@ EasyR1 is efficient and scalable due to the design of **[HybirdEngine](https://a
 
 - Supported algorithms
   - GRPO
+  - DAPO
   - Reinforce++
   - ReMax
   - RLOO
@@ -49,10 +52,10 @@ docker pull hiyouga/verl:ngc-th2.7.0-cu12.6-vllm0.9.1
 
 \* *estimated*
 
-| Method                   | Bits | 1.5B   | 3B     | 7B     | 32B     |
-| ------------------------ | ---- | ------ | ------ | ------ | ------- |
-| GRPO Full Fine-Tuning    | AMP  | 2*24GB | 4*40GB | 8*40GB | 16*80GB |
-| GRPO Full Fine-Tuning    | BF16 | 1*24GB | 1*40GB | 4*40GB | 8*80GB  |
+| Method                   | Bits | 1.5B   | 3B     | 7B     | 32B     | 72B     |
+| ------------------------ | ---- | ------ | ------ | ------ | ------- | ------- |
+| GRPO Full Fine-Tuning    | AMP  | 2*24GB | 4*40GB | 8*40GB | 16*80GB | 32*80GB |
+| GRPO Full Fine-Tuning    | BF16 | 1*24GB | 1*40GB | 4*40GB | 8*80GB  | 16*80GB |
 
 > [!NOTE]
 > Use `worker.actor.fsdp.torch_dtype=bf16` and `worker.actor.optim.strategy=adamw_bf16` to enable bf16 training.
```
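
To make the note concrete: the two dotted overrides resolve into a nested config tree. A minimal sketch, assuming OmegaConf-style dotlist parsing (EasyR1's actual CLI plumbing may differ):

```python
from omegaconf import OmegaConf

# The two overrides from the note above, parsed into a nested config.
overrides = [
    "worker.actor.fsdp.torch_dtype=bf16",
    "worker.actor.optim.strategy=adamw_bf16",
]
config = OmegaConf.from_dotlist(overrides)

assert config.worker.actor.fsdp.torch_dtype == "bf16"
assert config.worker.actor.optim.strategy == "adamw_bf16"
```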

verl/workers/rollout/vllm_rollout_spmd.py

Lines changed: 6 additions & 3 deletions
```diff
@@ -86,9 +86,8 @@ def __init__(
 
         engine_kwargs = {}
         if config.limit_images:
-            engine_kwargs["limit_mm_per_prompt"] = {"image": config.limit_images}
             engine_kwargs["disable_mm_preprocessor_cache"] = True
-
+            engine_kwargs["limit_mm_per_prompt"] = {"image": config.limit_images}
 
         self.inference_engine = LLM(
             model=model_path,
```
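
For orientation, a minimal sketch of what this construction amounts to once the kwargs reach vLLM. The `LLM` call, `disable_mm_preprocessor_cache`, and `limit_mm_per_prompt` follow the diff; the standalone helper function is an assumption for illustration:

```python
from typing import Optional

from vllm import LLM

def build_engine(model_path: str, limit_images: Optional[int] = None) -> LLM:
    """Hypothetical helper mirroring the diff's engine construction."""
    engine_kwargs = {}
    if limit_images:
        # Only constrain multimodal inputs when an image limit is configured:
        # skip vLLM's multimodal preprocessor cache and cap images per prompt.
        engine_kwargs["disable_mm_preprocessor_cache"] = True
        engine_kwargs["limit_mm_per_prompt"] = {"image": limit_images}
    return LLM(model=model_path, **engine_kwargs)
```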
```diff
@@ -217,5 +216,9 @@ def generate_sequences(self, prompts: DataProto) -> DataProto:
             },
             batch_size=batch_size,
         )
-        non_tensor_batch = {"multi_modal_data": batch_multi_modal_data}
+        if batch_multi_modal_data is not None:
+            non_tensor_batch = {"multi_modal_data": batch_multi_modal_data}
+        else:
+            non_tensor_batch = {}
+
         return DataProto(batch=batch, non_tensor_batch=non_tensor_batch, meta_info=prompts.meta_info)
```
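
The guard matters for text-only rollouts, where no multimodal data is collected. A small sketch of the before/after behavior, assuming `batch_multi_modal_data` is `None` for a text-only batch (an assumption based on the diff, not the full veRL API):

```python
# Hypothetical text-only rollout: no images were collected for this batch.
batch_multi_modal_data = None

# Old behavior: the key is always present, so downstream consumers of
# non_tensor_batch receive {"multi_modal_data": None} and must special-case it.
old_non_tensor_batch = {"multi_modal_data": batch_multi_modal_data}

# New behavior: omit the key entirely for text-only batches.
if batch_multi_modal_data is not None:
    new_non_tensor_batch = {"multi_modal_data": batch_multi_modal_data}
else:
    new_non_tensor_batch = {}

assert old_non_tensor_batch == {"multi_modal_data": None}
assert new_non_tensor_batch == {}
```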
