Update omr receipe by wedu-nvidia · Pull Request #973 · NVIDIA-NeMo/Skills

wedu-nvidia · 2025-10-18T03:08:44Z

#963

Summary by CodeRabbit

New Features
- Added a --backend CLI option (default: megatron) to run training and evaluation on one or multiple backends.
- Pipeline now runs per-backend training and per-backend initial/final evaluations, producing backend-specific models and result outputs.
Tests
- Test and verification flows updated to iterate backends, pass the backend option through, and validate backend-specific outputs.

Signed-off-by: Wei Du <wedu@nvidia.com>

coderabbitai · 2025-10-18T03:08:54Z

Walkthrough

A new multi-backend parameter was added and propagated through the OpenMath reasoning recipe: CLI --backend (default ["megatron"]) is threaded into prepare, initial_eval, run_sdg, run_training, and final_eval; training and evaluation now iterate per-backend. Test runner and result checker were updated to accept and use backend-specific paths.

Changes

Cohort / File(s)	Summary
Recipe script: backend plumbing & per-backend execution `recipes/openmathreasoning/scripts/simplified_recipe.py`	Added `--backend` CLI (default `["megatron"]`); updated signatures for `prepare`, `initial_eval`, `run_sdg`, `run_training`, and `final_eval` to accept `backend`; threaded backend through the main flow; changed `run_training`, `initial_eval`, and `final_eval` to iterate per-backend and construct backend-specific model/output paths (e.g., `training-{backend}`, `evals/after-training-{backend}`); backend-aware wandb handling and conditional tweaks (e.g., sequence packing adjustments for `fsdp`) applied where relevant.
Test runner: forward backend to recipe `tests/slurm-tests/omr_simple_recipe/run_test.py`	Added `--backend` CLI arg (default `["megatron"]`); propagate `args.backend` into the command that invokes the simplified_recipe script and into the `check_results` invocation; added `reuse_code=False` to `run_cmd` call.
Result checker: backend-aware evaluation `tests/slurm-tests/omr_simple_recipe/check_results.py`	Updated `check_results` signature to accept `backend: str`; CLI now accepts multi `--backend` and iterates each backend and benchmark, loading backend-specific after-training result paths (`after-training-<backend>/eval-results/...`) and calling `check_results` with the backend identifier; assertion messages and path construction updated to include backend.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI
  participant Recipe as simplified_recipe
  participant Prepare
  participant InitialEval
  participant SDG as run_sdg
  participant Train as run_training
  participant FinalEval

  CLI->>Recipe: invoke with --backend [b1,b2,...]
  loop for each backend b
    Recipe->>Prepare: prepare(..., backend=b)
    Recipe->>InitialEval: initial_eval(..., backend=b)
    Recipe->>SDG: run_sdg(..., backend=b)
    Recipe->>Train: run_training(..., backend=b)
    Train-->>Recipe: model_path(b)
    Recipe->>FinalEval: final_eval(..., backend=b, model=model_path(b))
  end
  Note right of Recipe #DDDDFF: per-backend artifacts and outputs created

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through code with a twitch of my nose,

Backends in a basket, in neat little rows,
Megatron or fsdp, each gets its own run,
Prepared, trained, checked — one by one, then we're done! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Title Check	❓ Inconclusive	The title "Update omr receipe" uses a non-descriptive term ("Update") that lacks specificity about the changeset. While the title does refer to the omr recipe component being modified, it fails to convey the main purpose of the changes, which center on adding multi-backend support and backend parameter threading throughout the codebase. A teammate scanning the history would not understand that this PR introduces backend-specific training iterations, evaluation flows, and CLI arguments. Additionally, the title contains a typo ("receipe" instead of "recipe").	Consider revising the title to be more descriptive and specific, such as "Add multi-backend support to omr recipe" or "Introduce backend parameter and per-backend training iterations". This would clearly communicate the main change to reviewers. Also correct the typo to "recipe".

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch wedu/omr-receipe

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

wedu-nvidia · 2025-10-18T03:09:35Z

@Kipok I tested the full loop and the results are still in the range, so no need to update the range.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (9)

recipes/openmathreasoning/scripts/simplified_recipe.py (7)

27-47: Silence ARG001 by prefixing unused params in prepare.

num_gpus, backend, wandb_params aren’t used here. Keep the uniform signature but underscore them.

-def prepare(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def prepare(workspace, cluster, _num_gpus, expname_prefix, _backend, _wandb_params):

32-40: Pin or parameterize download ref to avoid HEAD drift.

Fetching from refs/heads/main is brittle in CI. Allow overriding ref or pin to a commit/tag.

-        f"export DOWNLOAD_PREFIX=https://raw.githubusercontent.com/NVIDIA-NeMo/Skills/refs/heads/main/recipes/openmathreasoning && "
+        f"export DOWNLOAD_REF=${{DOWNLOAD_REF:-refs/heads/main}} && "
+        f"export DOWNLOAD_PREFIX=https://raw.githubusercontent.com/NVIDIA-NeMo/Skills/${{DOWNLOAD_REF}}/recipes/openmathreasoning && "

49-90: Underscore unused backend in run_sdg to appease Ruff.

-def run_sdg(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def run_sdg(workspace, cluster, num_gpus, expname_prefix, _backend, wandb_params):

143-159: Underscore unused backend in final_eval.

-def final_eval(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def final_eval(workspace, cluster, num_gpus, expname_prefix, _backend, wandb_params):

161-176: Underscore unused backend in initial_eval.

-def initial_eval(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def initial_eval(workspace, cluster, num_gpus, expname_prefix, _backend, wandb_params):

211-216: Validate backend via argparse choices.

-    parser.add_argument(
-        "--backend",
-        type=str,
-        default="megatron",
-        help="Can either be megatron or fsdp",
-    )
+    parser.add_argument(
+        "--backend",
+        type=str,
+        choices=("megatron", "fsdp"),
+        default="megatron",
+        help="Training backend",
+    )

63-63: Wrap run_after values in lists to match type hints.

The run_after parameter in generate() and sft_nemo_rl() is typed as List[str], but the current code passes bare strings. While the runtime includes defensive isinstance checks to convert strings to lists, following the type hints improves consistency and type-checking compatibility.

-        run_after=f"{expname_prefix}-download-assets",
+        run_after=[f"{expname_prefix}-download-assets"],
@@
-        run_after=f"{expname_prefix}-problem-extraction",
+        run_after=[f"{expname_prefix}-problem-extraction"],
@@
-        run_after=f"{expname_prefix}-solution-generation",
+        run_after=[f"{expname_prefix}-solution-generation"],
@@
-        run_after=f"{expname_prefix}-prepare-training-data",
+        run_after=[f"{expname_prefix}-prepare-training-data"],
@@
-        run_after=f"{expname_prefix}-training",
+        run_after=[f"{expname_prefix}-training"],
@@
-        run_after=f"{expname_prefix}-download-assets",
+        run_after=[f"{expname_prefix}-download-assets"],

Applies to lines: 63, 81, 108, 138, 155, 173

tests/slurm-tests/omr_simple_recipe/run_test.py (2)

27-27: Validate backend choices in test CLI for early failure.

-    ap.add_argument("--backend", type=str, default="megatron", help="Can either be megatron or fsdp")
+    ap.add_argument("--backend", type=str, choices=("megatron", "fsdp"), default="megatron", help="Training backend")

30-44: Avoid shell=True; pass argv list to subprocess.run.

Safer and avoids quoting bugs.

-    cmd = (
-        f"python -m recipes.openmathreasoning.scripts.simplified_recipe "
-        f" --cluster {args.cluster} "
-        f" --workspace {args.workspace} "
-        f" --expname_prefix {args.expname_prefix} "
-        f" --backend {args.backend} "
-    )
-
-    if args.disable_wandb:
-        cmd += " --disable_wandb "
-    elif args.wandb_project:
-        cmd += f" --wandb_project {args.wandb_project} "
-
-    subprocess.run(cmd, shell=True, check=True)
+    cmd = [
+        "python", "-m", "recipes.openmathreasoning.scripts.simplified_recipe",
+        "--cluster", args.cluster,
+        "--workspace", args.workspace,
+        "--expname_prefix", args.expname_prefix,
+        "--backend", args.backend,
+    ]
+    if args.disable_wandb:
+        cmd.append("--disable_wandb")
+    elif args.wandb_project:
+        cmd += ["--wandb_project", args.wandb_project]
+
+    subprocess.run(cmd, check=True)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b18f39 and d694d41.

📒 Files selected for processing (2)

recipes/openmathreasoning/scripts/simplified_recipe.py (8 hunks)
tests/slurm-tests/omr_simple_recipe/run_test.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

recipes/openmathreasoning/scripts/simplified_recipe.py (2)

nemo_skills/pipeline/nemo_rl/sft.py (1)

sft_nemo_rl (214-498)

nemo_skills/pipeline/cli.py (1)

wrap_arguments (42-51)

🪛 Ruff (0.14.0)

recipes/openmathreasoning/scripts/simplified_recipe.py

27-27: Unused function argument: num_gpus

(ARG001)

27-27: Unused function argument: backend

(ARG001)

27-27: Unused function argument: wandb_params

(ARG001)

49-49: Unused function argument: backend

(ARG001)

143-143: Unused function argument: backend

(ARG001)

161-161: Unused function argument: backend

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: pre-commit
GitHub Check: unit-tests

🔇 Additional comments (3)

recipes/openmathreasoning/scripts/simplified_recipe.py (3)

126-140: Good: backend plumbed into sft_nemo_rl and ctx assembled cleanly.

114-125: This review comment is based on a misunderstanding and should be disregarded.

NeMo RL explicitly supports "PyTorch FSDP2, TP, CP, and SP for efficient training", and supports "performant parallelisms with Megatron Core (TP/PP/CP/SP/EP/FSDP)". The FSDP2Manager flattens "data and context dimensions if context parallelism is enabled", confirming that FSDP and CP > 1 are compatible.

The actual incompatibility is between context parallelism and sequence packing (due to conflicting attention implementations), not between context parallelism and FSDP. Since the code already disables sequence_packing for FSDP (line 124), keeping context_parallel_size=2 for all backends is correct and does not cause issues.

No changes are needed.

126-140: Verify that cluster config includes HF_HOME in env_vars before running this recipe.

The review comment is valid. The sft_nemo_rl function validates HF_HOME when backend is "megatron", and this recipe defaults to the megatron backend (line 213: default="megatron"). If HF_HOME is missing from env_variables, sft_nemo_rl raises a RuntimeError directing users to add HF_HOME=/your/path/to/hf_home to the cluster config's env_vars. The validation also checks that the HF_HOME path is mounted.

Developers must ensure their cluster config (passed to run_training() via the cluster parameter) includes HF_HOME properly configured to avoid runtime failure when sft_nemo_rl() is called. This is a configuration requirement, not a code defect.

Kipok

so the current version repeats the whole script for different backend. This is how it was before, but it's inefficient because we don't need to repeat baseline evals / generation two times as they are identical. Could you please change it such that instead it uses backend as a list, so by default it's running only megatron, but we can configure it to run both megatron and fsdp inside the same job in parallel and then do two final evals for both of them. But the initial steps will be shared

wedu-nvidia · 2025-10-20T00:55:00Z

OK, that's a better idea and I will make changes on this.

Signed-off-by: Wei Du <wedu@nvidia.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

tests/slurm-tests/omr_simple_recipe/check_results.py (1)

56-75: Avoid reloading baseline per-backend; hoist and use f-strings.

Load baseline_results once per benchmark and switch to clearer f-strings.

-    for training_backend in args.backend:
-        for benchmark in ("aime24", "aime25"):
-            common_path = Path(args.workspace) / "evals"
-            baseline_results = load_json(common_path / "baseline" / "eval-results" / benchmark / "metrics.json")
-            after_training_results = load_json(
-                common_path
-                / "after-training-{}".format(training_backend)
-                / "eval-results"
-                / benchmark
-                / "metrics.json"
-            )
-            check_results(benchmark, baseline_results, after_training_results, training_backend)
+    common_path = Path(args.workspace) / "evals"
+    for benchmark in ("aime24", "aime25"):
+        baseline_results = load_json(common_path / "baseline" / "eval-results" / benchmark / "metrics.json")
+        for training_backend in args.backend:
+            after_training_results = load_json(
+                common_path
+                / f"after-training-{training_backend}"
+                / "eval-results"
+                / benchmark
+                / "metrics.json"
+            )
+            check_results(benchmark, baseline_results, after_training_results, training_backend)

recipes/openmathreasoning/scripts/simplified_recipe.py (2)

27-31: Silence Ruff ARG001 for intentionally unused params.

These functions accept backend/num_gpus/wandb_params to keep a uniform call signature but don’t use them. Add # noqa: ARG001 on the def lines to keep the API without linter noise.
-def prepare(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def prepare(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):  # noqa: ARG001
@@
-def run_sdg(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def run_sdg(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):  # noqa: ARG001
@@
-def initial_eval(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):
+def initial_eval(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):  # noqa: ARG001
Also applies to: 49-55, 162-177

27-41: Stale comment: this step is wrapped in run_cmd.

Line 28 says “not wrapping with run_cmd”, but below you do call run_cmd. Update the comment to avoid confusion.
-    # data preparation needs to run locally without container, so not wrapping with run_cmd
+    # Data preparation assets are fetched via run_cmd (non-containerized script execution).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d694d41 and c540577.

📒 Files selected for processing (3)

recipes/openmathreasoning/scripts/simplified_recipe.py (6 hunks)
tests/slurm-tests/omr_simple_recipe/check_results.py (2 hunks)
tests/slurm-tests/omr_simple_recipe/run_test.py (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/slurm-tests/omr_simple_recipe/run_test.py

🧰 Additional context used

🧬 Code graph analysis (2)

tests/slurm-tests/omr_simple_recipe/check_results.py (2)

tests/slurm-tests/omr_simple_recipe/run_test.py (1)

main (20-63)

tests/slurm-tests/utils.py (1)

load_json (18-21)

recipes/openmathreasoning/scripts/simplified_recipe.py (2)

nemo_skills/pipeline/nemo_rl/sft.py (1)

sft_nemo_rl (214-498)

nemo_skills/pipeline/cli.py (1)

wrap_arguments (42-51)

🪛 Ruff (0.14.1)

recipes/openmathreasoning/scripts/simplified_recipe.py

27-27: Unused function argument: num_gpus

(ARG001)

27-27: Unused function argument: backend

(ARG001)

27-27: Unused function argument: wandb_params

(ARG001)

49-49: Unused function argument: backend

(ARG001)

162-162: Unused function argument: backend

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: unit-tests

🔇 Additional comments (1)

tests/slurm-tests/omr_simple_recipe/check_results.py (1)

35-51: Backend-aware assertion context looks good.

Including the backend in the after_training assertion message improves debuggability. No functional concerns.

coderabbitai · 2025-10-20T20:37:11Z

+        eval(
+            ctx=wrap_arguments("++inference.tokens_to_generate=16384 "),
+            cluster=cluster,
+            model=f"{workspace}/training/qwen2.5-14b-improved-hf-{training_backend}",
+            server_type="vllm",
+            server_gpus=num_gpus,
+            benchmarks="aime24:8,aime25:8",
+            output_dir=f"{workspace}/evals/after-training-{training_backend}",
+            num_jobs=1,
+            expname=f"{expname_prefix}-final-eval",
+            run_after=f"{expname_prefix}-training-{training_backend}",
+            wandb_name=f"{expname_prefix}-final-eval" if not wandb_params["disable_wandb"] else None,
+            wandb_project=wandb_params["wandb_project"],
+        )


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Potential duplicate expname across backends in final_eval. Please verify.

Multiple eval(...) calls use the same expname=f"{expname_prefix}-final-eval". If the scheduler requires unique experiment names, this can collide or break run_after dependencies.

Option if uniqueness is required:

- expname=f"{expname_prefix}-final-eval", + expname=f"{expname_prefix}-final-eval-{training_backend}",

Run this check to see how eval constructs task names and whether expname must be unique:

🏁 Script executed:

#!/bin/bash # Inspect eval entrypoint/task naming to assess expname uniqueness requirements rg -nP -C3 '\bdef\s+eval\b|\badd_task\(|task_name=|expname=' nemo_skills/pipeline -g '!**/tests/**'

Length of output: 22975

🏁 Script executed:

#!/bin/bash # Inspect get_exp function to understand expname uniqueness requirements rg -nP -A20 'def get_exp' nemo_skills/pipeline/utils/exp.py

Length of output: 1989

🏁 Script executed:

#!/bin/bash # Check how _reuse_exp parameter is used in the pipeline rg -nP '_reuse_exp|reuse_code_exp' nemo_skills/pipeline -g '!**/tests/**' | head -50

Length of output: 5232

🏁 Script executed:

#!/bin/bash # Search for documentation or comments about experiment naming uniqueness rg -nP -B2 -A5 'experiment.*name|expname.*unique' nemo_skills/pipeline -g '!**/tests/**'

Length of output: 14431

🏁 Script executed:

#!/bin/bash # Inspect the simplified_recipe.py context to see how eval() is called in a loop and what variables control it cat recipes/openmathreasoning/scripts/simplified_recipe.py | sed -n '100,180p'

Length of output: 3228

Include {training_backend} suffix in final_eval expname to avoid collision across backend iterations.

The final_eval() function loops over each training_backend but uses the same expname=f"{expname_prefix}-final-eval" for all backends. Since run_after correctly targets different training experiments per backend (...training-{training_backend}), each eval invocation should have a distinct experiment name:

- expname=f"{expname_prefix}-final-eval", + expname=f"{expname_prefix}-final-eval-{training_backend}",

This matches the pattern used in sft_nemo_rl() where each training backend gets its own experiment name. Without this, multiple eval() calls with identical names will conflict within the scheduler.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

eval(

ctx=wrap_arguments("++inference.tokens_to_generate=16384 "),

cluster=cluster,

model=f"{workspace}/training/qwen2.5-14b-improved-hf-{training_backend}",

server_type="vllm",

server_gpus=num_gpus,

benchmarks="aime24:8,aime25:8",

output_dir=f"{workspace}/evals/after-training-{training_backend}",

num_jobs=1,

expname=f"{expname_prefix}-final-eval",

run_after=f"{expname_prefix}-training-{training_backend}",

wandb_name=f"{expname_prefix}-final-eval" if not wandb_params["disable_wandb"] else None,

wandb_project=wandb_params["wandb_project"],

)

eval(

ctx=wrap_arguments("++inference.tokens_to_generate=16384 "),

cluster=cluster,

model=f"{workspace}/training/qwen2.5-14b-improved-hf-{training_backend}",

server_type="vllm",

server_gpus=num_gpus,

benchmarks="aime24:8,aime25:8",

output_dir=f"{workspace}/evals/after-training-{training_backend}",

num_jobs=1,

expname=f"{expname_prefix}-final-eval-{training_backend}",

run_after=f"{expname_prefix}-training-{training_backend}",

wandb_name=f"{expname_prefix}-final-eval" if not wandb_params["disable_wandb"] else None,

wandb_project=wandb_params["wandb_project"],

)

🤖 Prompt for AI Agents

In recipes/openmathreasoning/scripts/simplified_recipe.py around lines 146 to 159, the eval() calls use a fixed expname f"{expname_prefix}-final-eval" (and the same wandb_name) for every training_backend, causing name collisions across backend iterations; modify the expname to include the training_backend suffix (e.g., f"{expname_prefix}-final-eval-{training_backend}") and do the same for wandb_name when wandb is enabled so each backend’s final eval has a unique experiment name.

@wedu-nvidia let's do this

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Signed-off-by: Wei Du <wedu@nvidia.com>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

recipes/openmathreasoning/scripts/simplified_recipe.py (1)

146-160: CRITICAL: Duplicate expname collision across backends remains unresolved.

Lines 156 and 158 still use the same expname and wandb_name for all backend iterations, causing scheduler name collisions. This issue was previously flagged but not addressed.

Apply this diff to fix the collision:

         eval(
             ctx=wrap_arguments("++inference.tokens_to_generate=16384 "),
             cluster=cluster,
             model=f"{workspace}/training-{training_backend}/qwen2.5-14b-improved-hf",
             server_type="vllm",
             server_gpus=num_gpus,
             benchmarks="aime24:8,aime25:8",
             output_dir=f"{workspace}/evals/after-training-{training_backend}",
             num_jobs=1,
-            expname=f"{expname_prefix}-final-eval",
+            expname=f"{expname_prefix}-final-eval-{training_backend}",
             run_after=f"{expname_prefix}-training-{training_backend}",
-            wandb_name=f"{expname_prefix}-final-eval" if not wandb_params["disable_wandb"] else None,
+            wandb_name=f"{expname_prefix}-final-eval-{training_backend}" if not wandb_params["disable_wandb"] else None,
             wandb_project=wandb_params["wandb_project"],
         )

🧹 Nitpick comments (2)

recipes/openmathreasoning/scripts/simplified_recipe.py (2)

27-27: Unused parameters in function signature.

The backend, num_gpus, and wandb_params parameters are not used in this function. If they are included for signature consistency across recipe functions, consider documenting this intent or removing them if unnecessary.

49-49: Unused backend parameter.

The backend parameter is not used in this function. If included for signature consistency, consider documenting this or removing if unnecessary.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c540577 and 72dfacf.

📒 Files selected for processing (1)

recipes/openmathreasoning/scripts/simplified_recipe.py (6 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

recipes/openmathreasoning/scripts/simplified_recipe.py (3)

nemo_skills/pipeline/nemo_rl/sft.py (1)

sft_nemo_rl (214-498)

nemo_skills/pipeline/cli.py (1)

wrap_arguments (42-51)

nemo_skills/pipeline/eval.py (1)

eval (42-545)

🪛 Ruff (0.14.1)

recipes/openmathreasoning/scripts/simplified_recipe.py

27-27: Unused function argument: num_gpus

(ARG001)

27-27: Unused function argument: backend

(ARG001)

27-27: Unused function argument: wandb_params

(ARG001)

49-49: Unused function argument: backend

(ARG001)

163-163: Unused function argument: backend

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: unit-tests
GitHub Check: pre-commit

🔇 Additional comments (4)

recipes/openmathreasoning/scripts/simplified_recipe.py (4)

113-141: LGTM! Previous issues resolved.

The training loop correctly addresses the previously flagged concerns:

base_args is copied for each backend iteration (line 123), preventing FSDP-specific flags from affecting subsequent backends

Backend-specific output_dir (line 130) prevents checkpoint/log collisions

Backend-specific expname (line 138) ensures unique experiment names

163-163: Unused backend parameter is acceptable here.

The backend parameter is unused in initial_eval, which is appropriate since this function evaluates the baseline model before any backend-specific training occurs. The parameter is likely included for signature consistency with other recipe functions.

213-219: LGTM! CLI backend argument properly configured.

The --backend argument correctly supports multiple backends with appropriate validation and defaults.

227-239: LGTM! Backend parameter correctly propagated.

The backend argument is properly passed through all recipe function calls via the args tuple.

Kipok · 2025-10-23T20:34:11Z

+        eval(
+            ctx=wrap_arguments("++inference.tokens_to_generate=16384 "),
+            cluster=cluster,
+            model=f"{workspace}/training/qwen2.5-14b-improved-hf-{training_backend}",
+            server_type="vllm",
+            server_gpus=num_gpus,
+            benchmarks="aime24:8,aime25:8",
+            output_dir=f"{workspace}/evals/after-training-{training_backend}",
+            num_jobs=1,
+            expname=f"{expname_prefix}-final-eval",
+            run_after=f"{expname_prefix}-training-{training_backend}",
+            wandb_name=f"{expname_prefix}-final-eval" if not wandb_params["disable_wandb"] else None,
+            wandb_project=wandb_params["wandb_project"],
+        )


@wedu-nvidia let's do this

Kipok · 2025-10-23T20:34:49Z

            f"{args.expname_prefix}-final-eval",
            f"{args.expname_prefix}-baseline-eval",
        ],
+        reuse_code=False,


why this change?

It should not change, I am not sure it got changed, I will revise back.

Kipok · 2025-10-23T20:35:16Z



-def prepare(workspace, cluster, num_gpus, expname_prefix, wandb_params):
+def prepare(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):


why do we need backend in this and run_sdg and initial_eval?

I thought original used it for convenience so that it can pass *args. Now I reivsed it.

Signed-off-by: Wei Du <wedu@nvidia.com>

Signed-off-by: Wei Du <wedu@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>

wedu-nvidia added 3 commits October 17, 2025 17:11

update omr

a5d8b7d

Signed-off-by: Wei Du <wedu@nvidia.com>

fix bug

9d664aa

Signed-off-by: Wei Du <wedu@nvidia.com>

update

2c91d86

Signed-off-by: Wei Du <wedu@nvidia.com>

Merge branch 'main' into wedu/omr-receipe

d694d41

wedu-nvidia requested a review from Kipok October 18, 2025 03:11

coderabbitai Bot reviewed Oct 18, 2025

View reviewed changes

Kipok requested changes Oct 19, 2025

View reviewed changes

wedu-nvidia added 2 commits October 20, 2025 13:30

add backend as list

818981a

Signed-off-by: Wei Du <wedu@nvidia.com>

add backend as list

c540577

Signed-off-by: Wei Du <wedu@nvidia.com>

coderabbitai Bot reviewed Oct 20, 2025

View reviewed changes

fix path

72dfacf

Signed-off-by: Wei Du <wedu@nvidia.com>

coderabbitai Bot reviewed Oct 20, 2025

View reviewed changes

Kipok requested changes Oct 23, 2025

View reviewed changes

wedu-nvidia added 2 commits October 24, 2025 08:43

cleaned code

f2c7fd4

Signed-off-by: Wei Du <wedu@nvidia.com>

Merge branch 'main' into wedu/omr-receipe

4aaee4e

Signed-off-by: Wei Du <wedu@nvidia.com>

Kipok approved these changes Oct 25, 2025

View reviewed changes

Kipok merged commit 30b4527 into main Oct 25, 2025
5 checks passed

Kipok deleted the wedu/omr-receipe branch October 25, 2025 16:46

dgtm777 pushed a commit that referenced this pull request Oct 29, 2025

Update omr receipe (#973)

3fd84cc

Signed-off-by: Wei Du <wedu@nvidia.com>

dgtm777 pushed a commit that referenced this pull request Mar 18, 2026

Update omr receipe (#973)

f405e3e

Signed-off-by: Wei Du <wedu@nvidia.com> Signed-off-by: dgitman <dgitman@nvidia.com>



		def prepare(workspace, cluster, num_gpus, expname_prefix, wandb_params):
		def prepare(workspace, cluster, num_gpus, expname_prefix, backend, wandb_params):

Conversation

wedu-nvidia commented Oct 18, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

wedu-nvidia commented Oct 18, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Kipok left a comment

Choose a reason for hiding this comment

Uh oh!

wedu-nvidia commented Oct 20, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Kipok Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Kipok Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

Kipok Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

wedu-nvidia Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

Kipok Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

wedu-nvidia Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wedu-nvidia commented Oct 18, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 18, 2025 •

edited

Loading

coderabbitai Bot Oct 20, 2025 •

edited

Loading