huggingface · titaenstad · Mar 19, 2025
diff --git a/chapters/en/chapter12/4.mdx b/chapters/en/chapter12/4.mdx
@@ -90,7 +90,7 @@ training_args = GRPOConfig(
     # Essential parameters
     output_dir="output",
     num_train_epochs=3,
-    num_generation=4,  # Number of completions to generate for each prompt
+    num_generations=4,  # Number of completions to generate for each prompt
     per_device_train_batch_size=4,  # We want to get all generations in one device batch
     # Optional but useful
     gradient_accumulation_steps=2,
@@ -101,7 +101,7 @@ training_args = GRPOConfig(
 )
 ```
 
-The `num_generation` parameter is particularly important for GRPO as it defines the group size - how many different completions the model will generate for each prompt. This is a key differentiator from other RL methods:
+The `num_generations` parameter is particularly important for GRPO as it defines the group size - how many different completions the model will generate for each prompt. This is a key differentiator from other RL methods:
 
 - Too small (e.g., 2-3): May not provide enough diversity for meaningful comparisons
 - Recommended (4-16): Provides good balance between diversity and computational efficiency
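
Note on why the one-character rename matters: a minimal sketch, assuming `GRPOConfig` behaves like an ordinary Python dataclass and rejects unknown keyword arguments. `SketchConfig` and `try_build` are hypothetical stand-ins written for illustration only; they are not part of TRL.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a dataclass-style config (not the real GRPOConfig)."""

    output_dir: str = "output"
    num_generations: int = 4  # group size: completions sampled per prompt


def try_build(**kwargs):
    """Return (config, None) on success, or (None, message) if a field name is unknown."""
    try:
        return SketchConfig(**kwargs), None
    except TypeError as exc:
        return None, str(exc)


# Correct field name: the config is built as expected.
ok, ok_err = try_build(num_generations=4)

# Misspelled field name (the pre-fix spelling): the dataclass constructor rejects it.
bad, bad_err = try_build(num_generation=4)

print(ok)
print(bad_err)
```

Under that assumption, a reader who copied the old snippet would hit an error at construction time instead of getting a silently ignored setting, which is why the code and the prose need the same spelling.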