
[BIONEMO-2473] Added tests for Evo2 LoRA fine-tuning#1060

Merged
balvisio merged 4 commits into main from
dev/ba/BIONEMO-2473-add-evo2-lora-tests
Sep 25, 2025

Conversation

@balvisio
Collaborator

@balvisio balvisio commented Aug 21, 2025

Description

Fixes and added tests for Evo2 LoRA fine-tuning.

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels:

Note

By default, the notebooks validation tests are skipped unless explicitly enabled.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Usage

# TODO: Add code snippet
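The snippet was left as a TODO; a hypothetical invocation assembled only from flags discussed elsewhere in this PR (paths, sizes, and step counts are placeholders) might look like:

```
# Hypothetical sketch — flags are taken from this PR's discussion;
# paths and step counts are illustrative, not a tested recipe.
train_evo2 --mock-data --result-dir /tmp/evo2_lora \
    --model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* \
    --max-steps 5 --warmup-steps 1 --val-check-interval 2 --limit-val-batches 1 \
    --seq-length 16 \
    --lora-finetune   # optionally add --lora-checkpoint-path <dir> to resume from a LoRA checkpoint
```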

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

Summary by CodeRabbit

  • New Features

    • Expose controls for mock dataset sizes (train/val/test) for training runs.
    • LoRA finetuning flow simplified; LoRA integration now passes a preconstructed transform and checkpoint paths accept plain strings.
  • Tests

    • Added end-to-end integration tests for pretraining, finetuning, and LoRA finetuning with artifact and loss validations.
    • Introduced shared test helpers for constructing small training/finetune commands and consolidated imports.
  • Chores

    • Updated/cleaned license header boilerplate in tests.

@copy-pr-bot

copy-pr-bot bot commented Aug 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch 5 times, most recently from 01125ba to 9750267 Compare August 21, 2025 18:38
@codecov-commenter

codecov-commenter commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.92%. Comparing base (832b244) to head (540edf4).
⚠️ Report is 8 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...ackages/bionemo-evo2/src/bionemo/evo2/run/train.py 33.33% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1060      +/-   ##
==========================================
- Coverage   79.93%   79.92%   -0.02%     
==========================================
  Files         160      160              
  Lines       11859    11858       -1     
==========================================
- Hits         9480     9478       -2     
- Misses       2379     2380       +1     
Files with missing lines Coverage Δ
...kages/bionemo-evo2/src/bionemo/evo2/models/peft.py 17.64% <ø> (ø)
...ackages/bionemo-evo2/src/bionemo/evo2/run/train.py 12.50% <33.33%> (-0.31%) ⬇️

... and 1 file with indirect coverage changes

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 9750267 to f5cbdd6 Compare August 25, 2025 16:25
@balvisio
Collaborator Author

/ok to test f5cbdd6

@coderabbitai
Contributor

coderabbitai bot commented Sep 4, 2025

Walkthrough

Updated the training entrypoint to pass synthetic dataset sizes to MockDataModule, changed the Evo2LoRA import path and the CLI argument type for LoRA checkpoints, integrated a constructed Evo2LoRA as HyenaModel's model_transform and as a callback during LoRA finetuning, added shared test helpers, refactored tests to use them, and added an end-to-end finetune integration test.

Changes

Cohort / File(s) Summary
Training entrypoint updates
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py
Changed public import of Evo2LoRA to bionemo.evo2.models.peft. CLI arg --lora-checkpoint-path type changed from Path to str. MockDataModule now initialized with num_train_samples, num_val_samples, num_test_samples. When --lora-finetune is enabled, construct lora_transform = Evo2LoRA(peft_ckpt_path=...), pass it as model_transform to HyenaModel, and append it to callbacks (replacing previous ModelTransform callback usage).
Test helpers & package init (new)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py, sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
Added license-only __init__.py. Added small_training_cmd and small_training_finetune_cmd helpers that build CLI invocation strings for mock training/finetuning (params: path, max_steps, val_check, optional global batch size, devices, create_tflops_callback, additional args).
Test refactor (reuse helpers)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
Removed local command-builder functions and now import small_training_cmd and small_training_finetune_cmd from .common. Minor docstring formatting change.
Finetuning integration test (new)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
New pytest integration test test_train_evo2_finetune_runs and helper extract_val_losses. Runs mock pretraining and finetuning flows (including optional LoRA/PEFT paths and resume-from-LoRA), asserts logs, checkpoints, TensorBoard events, and monotonic validation-loss behavior across runs.
PEFT module header tidy
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
Removed a large license header block; no functional or API changes.
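The wiring described for the training entrypoint — a single Evo2LoRA object passed both as HyenaModel's model_transform and appended to the trainer callbacks — can be illustrated with a stripped-down stand-in. All class names below are toy placeholders, not the real bionemo/NeMo APIs:

```python
class ToyLoRATransform:
    """Stand-in for Evo2LoRA: a callable transform that also acts as a callback."""

    def __init__(self, peft_ckpt_path=None):
        self.peft_ckpt_path = peft_ckpt_path
        self.applied = False

    def __call__(self, model):
        # A real implementation would inject LoRA adapters into the model here.
        self.applied = True
        return model


class ToyModel:
    """Stand-in for HyenaModel, accepting an optional model_transform."""

    def __init__(self, model_transform=None):
        self.model_transform = model_transform


lora_transform = ToyLoRATransform(peft_ckpt_path=None)
model = ToyModel(model_transform=lora_transform)
callbacks = [lora_transform]  # the very same object is registered as a callback

# Applying the transform mutates shared state visible to both roles.
model.model_transform(model)
```

The design point this mirrors: because the transform and the callback are one object, state set while transforming the model is visible during callback hooks.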

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI (train_evo2)
  participant Train as train.py
  participant Data as MockDataModule
  participant Model as HyenaModel
  participant PEFT as Evo2LoRA
  participant CB as Callbacks

  CLI->>Train: Parse args (incl. sample counts, --lora-finetune, --lora-checkpoint-path)
  Train->>Data: Init(num_train_samples, num_val_samples, num_test_samples)
  alt lora_finetune enabled
    Train->>PEFT: Create Evo2LoRA(peft_ckpt_path=args.lora_checkpoint_path)
    Train->>Model: Init(model_transform=lora_transform)
    Train->>CB: Append lora_transform to callbacks
  else
    Train->>Model: Init(model_transform=None)
  end
  Train->>Model: Fit(data=Data, callbacks=CB)
  Model-->>CLI: Emit logs, checkpoints, TFEvents

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

I twitch my ears at flags anew,
Mock seeds counted through and through.
LoRA threads a tiny tune,
Callbacks clap beneath the moon.
Tests hop in — checkpoints in sight. 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description contains a short summary, type-of-change checkboxes, CI notes, and a pre-submit checklist, but it is missing the detailed "Description" required by the repository template and the "Usage" code snippet. It does not enumerate the specific file/behavior changes, test details, or the current test status (the "All existing tests pass successfully" item is unchecked), so the description is incomplete for reviewers. Resolution: expand the Description to list the key code and test changes and the motivation for the bug fix, add the Usage code snippet showing how to run the new tests or CLI flags, update the pre-submit checklist to state test results or remaining work, and note any CI labels required to run slow/notebook tests so reviewers can reproduce the CI runs.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title "[BIONEMO-2473] Added tests for Evo2 LoRA fine-tuning" is concise, directly related to the primary change (adding Evo2 LoRA fine-tuning tests), and includes the issue key; it accurately reflects the changes shown in the diff (new/updated tests and related train code adjustments) and is clear for teammates scanning history.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.


@jwilber
Collaborator

jwilber commented Sep 4, 2025

/ok to test e53f754

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)

24-27: Remove duplicated --limit-val-batches flag

Each helper appends --limit-val-batches twice. Keep a single occurrence to avoid confusion about which value “wins.”

-        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* --limit-val-batches 1 "
+        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* "
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
-        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* --limit-val-batches 1 "
+        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* "
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "

Also applies to: 44-47


20-29: Quote shell arguments to be path-safe

Paths (result dirs, ckpt dirs) should be shell-quoted to survive spaces/special chars.

+import shlex
@@
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args}"
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args}"
@@
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {prev_ckpt} "
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {shlex.quote(str(prev_ckpt))} "

Also applies to: 41-50
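As a quick, self-contained check of the quoting concern (the path below is made up):

```python
import shlex

result_dir = "/tmp/my results/run 1"  # a path containing spaces

unquoted = f"train_evo2 --result-dir {result_dir}"
quoted = f"train_evo2 --result-dir {shlex.quote(result_dir)}"

# Unquoted, the shell splits the path into three tokens after the flag;
# quoted, the path survives as a single argument.
print(shlex.split(unquoted))  # five tokens total
print(shlex.split(quoted))    # three tokens total
```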

sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

494-496: CLI type change to str is fine; add early path validation

If --lora-finetune is set and a checkpoint path is provided, fail fast when the path doesn’t exist.

     parser.add_argument("--lora-checkpoint-path", type=str, default=None, help="LoRA checkpoint path")

Add right after args are parsed (in train or just after parse_args returns):

@@ def train(args: argparse.Namespace) -> nl.Trainer:
+    if args.lora_finetune and args.lora_checkpoint_path:
+        if not Path(args.lora_checkpoint_path).exists():
+            raise FileNotFoundError(f"LoRA checkpoint path not found: {args.lora_checkpoint_path}")

657-659: Guard callback type and None

Appending lora_transform is correct when set; consider asserting it’s not None to catch misconfigurations early.

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if args.lora_finetune:
+        assert lora_transform is not None, "Evo2LoRA should be initialized when --lora-finetune is set."
+        callbacks.append(lora_transform)

665-672: Pass actual model type to FLOPs callback

Hardcoding "hyena" could skew FLOPs reporting for Mamba.

-        flop_meas_callback = FLOPsMeasurementCallback(
-            model_config,
-            data_module,
-            "hyena",
-        )
+        flop_meas_callback = FLOPsMeasurementCallback(model_config, data_module, model_type)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

52-61: Deduplicate --limit-val-batches and quote paths in mamba helpers

Mirror the fixes from common.py to avoid flag duplication and path issues.

-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
-        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- --limit-val-batches 1 "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
+        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- "
@@
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
-        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- --limit-val-batches 1 "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
+        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {prev_ckpt}"
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {shlex.quote(str(prev_ckpt))}"

Don’t forget to import shlex at top if applying.

Also applies to: 63-73

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c59d333 and e53f754.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/peft.py
🧰 Additional context used
🧬 Code graph analysis (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-29)
  • small_training_finetune_cmd (32-50)
sub-packages/bionemo-testing/src/bionemo/testing/subprocess_utils.py (1)
  • run_command_in_subprocess (108-129)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-29)
  • small_training_finetune_cmd (32-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (4)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1)

1-14: Package marker OK

Licensing header + package init only. No issues.

sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

38-50: Capture helpers LGTM

stdout/stderr capture with distributed state context is clean.

sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (2)

109-121: Verify LoRA resume path semantics

You pass a trainer checkpoint dir as --lora-checkpoint-path. If Evo2LoRA expects a LoRA-specific checkpoint (not a full trainer ckpt), this may silently no-op or misload. Please confirm the expected format/path and adjust to the LoRA artifact location if different.

Would you like me to scan the repo for where Evo2LoRA writes its PEFT artifacts and update this test accordingly?


24-33: Test structure and assertions look solid

End-to-end pretrain → finetune flow and basic artifact checks are clear.

Also applies to: 71-79, 101-107

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch 2 times, most recently from f5cbdd6 to 852987b Compare September 21, 2025 15:26
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

788-804: Initialize lora_transform before model branching to avoid undefined variable.

lora_transform is only set in the Hyena branch; using --lora-finetune with non‑Hyena models leads to UnboundLocalError later.

Apply this diff:

     # Create model based on selected model type
+    lora_transform = None
     if model_type == "hyena":
         if args.model_size not in HYENA_MODEL_OPTIONS:
             raise ValueError(f"Invalid model size for Hyena: {args.model_size}")
         model_config = HYENA_MODEL_OPTIONS[args.model_size](**config_modifiers_init)
         if args.no_weight_decay_embeddings:
             # Override the default weight decay condition for Hyena with our bionemo version that also excludes
             #  embeddings
             model_config.hyena_no_weight_decay_cond_fn = hyena_no_weight_decay_cond_with_embeddings
-        # Lora adaptors configuration
-        lora_transform = None
+        # LoRA adapters configuration
         if args.lora_finetune:
             lora_transform = Evo2LoRA(peft_ckpt_path=args.lora_checkpoint_path)
 
         model = llm.HyenaModel(model_config, tokenizer=data_module.tokenizer, model_transform=lora_transform)
🧹 Nitpick comments (6)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

666-674: Use effective steps when early_stop_on_step is set (mock data sizing).

If --early-stop-on-step is used, training samples should reflect that to avoid oversizing mock data.

Apply this diff:

     if args.mock_data:
-        data_module = MockDataModule(
+        effective_max_steps = args.early_stop_on_step or args.max_steps
+        data_module = MockDataModule(
             seq_length=args.seq_length,
             micro_batch_size=args.micro_batch_size,
             global_batch_size=global_batch_size,
-            num_train_samples=args.max_steps * global_batch_size,
+            num_train_samples=effective_max_steps * global_batch_size,
             num_val_samples=args.limit_val_batches * global_batch_size,
             num_test_samples=1,
             num_workers=args.workers,
             tokenizer=tokenizer,
         )
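The sizing rule in this suggestion is easy to verify with standalone arithmetic (function name and values below are illustrative, not the repository's API):

```python
from typing import Optional


def mock_train_samples(
    max_steps: int,
    global_batch_size: int,
    early_stop_on_step: Optional[int] = None,
) -> int:
    """Size mock training data; prefer the early-stop step when set."""
    effective_max_steps = early_stop_on_step or max_steps
    return effective_max_steps * global_batch_size
```

With `max_steps=500` and `global_batch_size=8`, the mock dataset shrinks from 4000 to 24 samples when `early_stop_on_step=3` is supplied.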
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (5)

100-106: Stabilize assertion: check overall improvement instead of per‑step monotonicity.

Per‑step non‑increase is flaky on short, synthetic runs. Assert net improvement and parse all entries.

Apply this diff:

-    val_losses = extract_val_losses(stdout_pretrain, val_steps)
-
-    for i in range(1, len(val_losses)):
-        assert val_losses[i][1] <= val_losses[i - 1][1], (
-            f"Validation loss increased at step {val_losses[i][0]}: {val_losses[i][1]} > {val_losses[i - 1][1]}"
-        )
+    val_losses = extract_val_losses(stdout_pretrain, 1)
+    assert val_losses, "No validation-loss entries found in logs."
+    assert val_losses[-1][1] <= val_losses[0][1], (
+        f"Validation loss did not improve overall: first={val_losses[0][1]} last={val_losses[-1][1]}"
+    )
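extract_val_losses itself is not shown in this thread; a hypothetical regex-based sketch of such a parser, combined with the net-improvement assertion recommended in this comment, might look like (log format is assumed, not taken from the repository):

```python
import re


def extract_val_losses(stdout: str, min_entries: int = 1) -> list:
    """Parse (step, val_loss) pairs from training logs (hypothetical log format)."""
    pattern = re.compile(r"step[:=]?\s*(\d+).*?val_loss[:=]?\s*([0-9.]+)")
    losses = [(int(step), float(loss)) for step, loss in pattern.findall(stdout)]
    assert len(losses) >= min_entries, "No validation-loss entries found in logs."
    return losses


log = (
    "step 10 ... val_loss: 2.31\n"
    "step 20 ... val_loss: 2.05\n"
    "step 30 ... val_loss: 2.10\n"  # small per-step bump: fine for the net check
)
val_losses = extract_val_losses(log)

# Net-improvement check: robust to noisy per-step fluctuations on short runs.
assert val_losses[-1][1] <= val_losses[0][1]
```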

157-164: Apply the same stable assertion for finetune phase.

Apply this diff:

-    val_losses_ft = extract_val_losses(stdout_finetune, val_steps)
-
-    # Check that each validation loss is less than or equal to the previous one
-    for i in range(1, len(val_losses_ft)):
-        assert val_losses_ft[i][1] <= val_losses_ft[i - 1][1], (
-            f"Validation loss increased at step {val_losses_ft[i][0]}: {val_losses_ft[i][1]} > {val_losses_ft[i - 1][1]}"
-        )
+    val_losses_ft = extract_val_losses(stdout_finetune, 1)
+    assert val_losses_ft, "No validation-loss entries found in logs (finetune)."
+    assert val_losses_ft[-1][1] <= val_losses_ft[0][1], (
+        f"Finetune loss did not improve overall: first={val_losses_ft[0][1]} last={val_losses_ft[-1][1]}"
+    )

171-182: Avoid variable shadowing; keep resume stdout separate.

Reuse of stdout_finetune is confusing; use stdout_resume.

Apply this diff:

-        stdout_finetune: str = run_command_in_subprocess(command=command_resume_finetune, path=str(tmp_path))
+        stdout_resume: str = run_command_in_subprocess(command=command_resume_finetune, path=str(tmp_path))

196-202: Resume phase: use the renamed variable and stable assertion.

Apply this diff:

-        val_losses_ft = extract_val_losses(stdout_finetune, val_steps)
-
-        # Check that each validation loss is less than or equal to the previous one
-        for i in range(1, len(val_losses_ft)):
-            assert val_losses_ft[i][1] <= val_losses_ft[i - 1][1], (
-                f"Validation loss increased at step {val_losses_ft[i][0]}: {val_losses_ft[i][1]} > {val_losses_ft[i - 1][1]}"
-            )
+        val_losses_ft = extract_val_losses(stdout_resume, 1)
+        assert val_losses_ft, "No validation-loss entries found in logs (resume finetune)."
+        assert val_losses_ft[-1][1] <= val_losses_ft[0][1], (
+            f"Resume finetune loss did not improve overall: first={val_losses_ft[0][1]} last={val_losses_ft[-1][1]}"
+        )

107-112: Remove duplicate event-file existence check (already done above).

Apply this diff:

-    # Check if directory with tensorboard logs exists
-    assert tensorboard_dir.exists(), "TensorBoard logs folder does not exist."
-    # Recursively search for files with tensorboard logger
-    event_files = list(tensorboard_dir.rglob("events.out.tfevents*"))
-    assert event_files, f"No TensorBoard event files found under {tensorboard_dir}"
-    assert len(matching_subfolders) == 1, "Only one checkpoint subfolder should be found."
+    assert len(matching_subfolders) == 1, "Only one checkpoint subfolder should be found."
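The event-file discovery being deduplicated here can be exercised standalone; the file name below follows TensorBoard's `events.out.tfevents` prefix convention, but the directory layout is made up:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tensorboard_dir = Path(tmp) / "lightning_logs" / "version_0"
    tensorboard_dir.mkdir(parents=True)
    (tensorboard_dir / "events.out.tfevents.1700000000.host").touch()

    # Recursive glob mirrors the test's TensorBoard-artifact assertion.
    event_files = list(Path(tmp).rglob("events.out.tfevents*"))
    assert event_files, f"No TensorBoard event files found under {tmp}"
    found = len(event_files)
```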
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 437004d and b0ac603.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
✅ Files skipped from review due to trivial changes (1)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
🧰 Additional context used
🧬 Code graph analysis (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (3)
sub-packages/bionemo-testing/src/bionemo/testing/subprocess_utils.py (1)
  • run_command_in_subprocess (108-129)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-37)
  • small_training_finetune_cmd (40-60)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)
  • test_train_evo2_finetune_runs (103-171)
🔇 Additional comments (3)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

51-51: Correct import path for Evo2LoRA.

Importing Evo2LoRA from models.peft matches the new location.


612-614: CLI type change to str aligns with Evo2LoRA signature.

--lora-checkpoint-path as str matches the peft constructor.


827-829: Guard callback append to avoid UnboundLocalError for non‑Hyena.

Append only when the transform was constructed.

Apply this diff:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if lora_transform is not None:
+        callbacks.append(lora_transform)

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from b0ac603 to 0f54e08 Compare September 23, 2025 19:58
@balvisio
Collaborator Author

/ok to test 0f54e08

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (4)

54-60: Remove duplicate --limit-val-batches 1 flag in Mamba cmd.

Flag is set twice; keep one.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

67-73: Remove duplicate --limit-val-batches 1 flag in Mamba finetune cmd.

Same duplicate as above.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

77-85: Remove duplicate --limit-val-batches 1 flag in Llama cmd.

Simplify to a single occurrence.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

91-98: Remove duplicate --limit-val-batches 1 flag in Llama finetune cmd.

Drop the repeated flag.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "
🧹 Nitpick comments (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

53-99: Consider moving Mamba/Llama helpers to common.py for reuse.

Unify helper builders (Mamba/Llama) in tests/bionemo/evo2/run/common.py to avoid drift with small_training_cmd variants and reduce duplication.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0ac603 and 0f54e08.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-37)
  • small_training_finetune_cmd (40-60)
sub-packages/bionemo-testing/src/bionemo/testing/torch.py (1)
  • check_fp8_support (21-33)
🔇 Additional comments (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2)

33-34: Relative import LGTM; fixes test-environment packaging issues.

Good switch to package-relative import; aligns with prior feedback and avoids ModuleNotFoundError in CI.


39-41: Docstring tweak LGTM.

No functional impact; concise and clear.

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 0f54e08 to 79361ed Compare September 23, 2025 21:22
@balvisio
Collaborator Author

/ok to test 79361ed

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

613-613: Validate LoRA checkpoint path before use.

Type change to str is fine, but add a lightweight existence check to fail fast if a bad path is provided.

Add before constructing Evo2LoRA (Hyena branch):

from pathlib import Path

if args.lora_finetune and args.lora_checkpoint_path:
    ckpt_p = Path(args.lora_checkpoint_path)
    if not ckpt_p.exists():
        raise FileNotFoundError(f"LoRA checkpoint not found: {ckpt_p}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f54e08 and 79361ed.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: changed-files
  • GitHub Check: pre-commit
  • GitHub Check: changed-dirs
  • GitHub Check: Analyze (rust)
🔇 Additional comments (3)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

827-829: Fix UnboundLocalError: lora_transform is undefined for non‑Hyena models.

When --lora-finetune is used with mamba/llama, lora_transform isn’t defined, causing a crash.

Apply this minimal fix to avoid the exception:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if 'lora_transform' in locals() and lora_transform is not None:
+        callbacks.append(lora_transform)

Additionally, initialize lora_transform and fail fast when LoRA is requested for unsupported models (place just after determining model_type and before branching):

# before: "Create model based on selected model type"
lora_transform = None
if args.lora_finetune and model_type != "hyena":
    raise ValueError("--lora-finetune is currently supported only for Hyena models.")

51-51: Import path update verified; no stale imports remain.


671-674: Use eval batch size (no grad accumulation) for val/test sample counts.

Validation and test run without gradient accumulation, so divide args.grad_acc_batches out of the global batch size when deriving their sample counts.

-            num_train_samples=args.max_steps * global_batch_size,
-            num_val_samples=args.limit_val_batches * global_batch_size,
-            num_test_samples=1,
+            num_train_samples=args.max_steps * global_batch_size,
+            num_val_samples=args.limit_val_batches * max(1, global_batch_size // max(1, args.grad_acc_batches)),
+            num_test_samples=1,
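To make the arithmetic concrete, here is a minimal sketch of the suggested computation (the function name and the floor-at-1 guards are illustrative, not part of the PR):

```python
def eval_num_samples(limit_val_batches: int, global_batch_size: int, grad_acc_batches: int) -> int:
    """Eval runs without gradient accumulation, so divide grad_acc_batches
    out of the training global batch size (floored at 1) before scaling by
    the number of validation batches."""
    eval_batch_size = max(1, global_batch_size // max(1, grad_acc_batches))
    return limit_val_batches * eval_batch_size

# e.g. 4 val batches with a global batch of 16 built from 2 accumulation
# steps gives an effective eval batch of 8, hence 32 validation samples.
```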

@balvisio balvisio enabled auto-merge September 24, 2025 04:30
@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 46c8423 to 8027918 Compare September 24, 2025 14:04
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2)

613-613: Arg type change to str is fine; add lightweight CLI validation.

Ensure the flag is only used with --lora-finetune (and optionally check local file existence).

Add post-parse validation before returning from parse_args:

args_ns = parser.parse_args(args=args)
if args_ns.lora_checkpoint_path and not args_ns.lora_finetune:
    parser.error("--lora-checkpoint-path requires --lora-finetune.")
# Optional: check existence if intended to be a local path
# from pathlib import Path
# if args_ns.lora_checkpoint_path and not str(args_ns.lora_checkpoint_path).startswith(("s3://","gs://")) and not Path(args_ns.lora_checkpoint_path).exists():
#     parser.error(f"--lora-checkpoint-path not found: {args_ns.lora_checkpoint_path}")
return args_ns

671-673: Sanity-check MockDataModule sample counts; ensure test has at least one full batch.

Suggest using at least one global batch for test to avoid empty/partial-batch edge cases.

Apply this diff:

-            num_test_samples=1,
+            num_test_samples=global_batch_size,

Also, please confirm your NeMo MockDataModule supports the num_train_samples/num_val_samples/num_test_samples kwargs in your pinned version to avoid a TypeError at construction.
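One way to perform that confirmation without pinning to a specific NeMo version is a signature probe; the `supports_kwargs` helper below is a generic illustration, not a NeMo API:

```python
import inspect


def supports_kwargs(cls, *names: str) -> bool:
    """Return True if cls.__init__ explicitly accepts every keyword in
    `names`, or swallows arbitrary keywords via **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    return all(name in params for name in names)


# Probe before constructing, e.g.:
# if not supports_kwargs(MockDataModule, "num_train_samples",
#                        "num_val_samples", "num_test_samples"):
#     raise TypeError("Pinned MockDataModule lacks the sample-count kwargs")
```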

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46c8423 and 8027918.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2)

51-51: Evo2LoRA import path update looks correct.

Import aligns with the class location and downstream usage.


827-829: Fix UnboundLocalError: lora_transform may be undefined for non‑Hyena models.

If --lora-finetune is passed with mamba/llama, lora_transform is not set, causing an error. Append only when it exists; initialize before branching.

Apply this diff:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if 'lora_transform' in locals() and lora_transform is not None:
+        callbacks.append(lora_transform)

And initialize before the model-type branching:

# before: if model_type == "hyena":
lora_transform = None

@balvisio balvisio disabled auto-merge September 24, 2025 15:26
@balvisio balvisio enabled auto-merge September 24, 2025 15:26
@balvisio
Collaborator Author

/ok to test 8027918

@balvisio
Collaborator Author

/ok to test 5126bea

@balvisio
Collaborator Author

/ok to test 2ba44da

@dorotat-nv dorotat-nv added ciflow:slow Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2 and removed SKIP_CI labels Sep 25, 2025
@balvisio
Collaborator Author

/ok to test 540edf4

@balvisio balvisio added this pull request to the merge queue Sep 25, 2025
Merged via the queue into main with commit dd4f626 Sep 25, 2025
19 checks passed
@balvisio balvisio deleted the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch September 25, 2025 18:24

Labels

ciflow:slow Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2

6 participants