
[BIONEMO-2473] Added tests for Evo2 LoRA fine-tuning#1060

Merged
balvisio merged 4 commits into main from
dev/ba/BIONEMO-2473-add-evo2-lora-tests
Sep 25, 2025

Conversation

@balvisio
Collaborator

@balvisio balvisio commented Aug 21, 2025

Description

Fixes and added tests for Evo2 LoRA fine-tuning.

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

CI Pipeline Configuration

Configure CI behavior by applying the relevant labels:

Note

By default, the notebooks validation tests are skipped unless explicitly enabled.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Usage

# TODO: Add code snippet
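The snippet was left as a TODO; a hypothetical invocation assembled only from flags discussed elsewhere in this PR (paths, sizes, and step counts are placeholders) might look like:

```
# Hypothetical sketch — flags are taken from this PR's discussion;
# paths and step counts are illustrative, not a tested recipe.
train_evo2 --mock-data --result-dir /tmp/evo2_lora \
    --model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* \
    --max-steps 5 --warmup-steps 1 --val-check-interval 2 --limit-val-batches 1 \
    --seq-length 16 \
    --lora-finetune   # optionally add --lora-checkpoint-path <dir> to resume from a LoRA checkpoint
```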

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

Summary by CodeRabbit

  • New Features

    • Expose controls for mock dataset sizes (train/val/test) for training runs.
    • LoRA finetuning flow simplified; LoRA integration now passes a preconstructed transform and checkpoint paths accept plain strings.
  • Tests

    • Added end-to-end integration tests for pretraining, finetuning, and LoRA finetuning with artifact and loss validations.
    • Introduced shared test helpers for constructing small training/finetune commands and consolidated imports.
  • Chores

    • Updated/cleaned license header boilerplate in tests.

@copy-pr-bot

copy-pr-bot bot commented Aug 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch 5 times, most recently from 01125ba to 9750267 Compare August 21, 2025 18:38
@codecov-commenter

codecov-commenter commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.92%. Comparing base (832b244) to head (540edf4).
⚠️ Report is 8 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...ackages/bionemo-evo2/src/bionemo/evo2/run/train.py 33.33% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1060      +/-   ##
==========================================
- Coverage   79.93%   79.92%   -0.02%     
==========================================
  Files         160      160              
  Lines       11859    11858       -1     
==========================================
- Hits         9480     9478       -2     
- Misses       2379     2380       +1     
Files with missing lines Coverage Δ
...kages/bionemo-evo2/src/bionemo/evo2/models/peft.py 17.64% <ø> (ø)
...ackages/bionemo-evo2/src/bionemo/evo2/run/train.py 12.50% <33.33%> (-0.31%) ⬇️

... and 1 file with indirect coverage changes

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 9750267 to f5cbdd6 Compare August 25, 2025 16:25
@balvisio
Collaborator Author

/ok to test f5cbdd6

@coderabbitai
Contributor

coderabbitai bot commented Sep 4, 2025

Walkthrough

Updated the training entrypoint to pass synthetic dataset sizes to MockDataModule, changed the Evo2LoRA import path and the CLI argument type for LoRA checkpoints, integrated a constructed Evo2LoRA as HyenaModel's model_transform and as a callback during LoRA finetuning, added shared test helpers, refactored tests to use them, and added an end-to-end finetune integration test.

Changes

Cohort / File(s) Summary
Training entrypoint updates
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py
Changed public import of Evo2LoRA to bionemo.evo2.models.peft. CLI arg --lora-checkpoint-path type changed from Path to str. MockDataModule now initialized with num_train_samples, num_val_samples, num_test_samples. When --lora-finetune is enabled, construct lora_transform = Evo2LoRA(peft_ckpt_path=...), pass it as model_transform to HyenaModel, and append it to callbacks (replacing previous ModelTransform callback usage).
Test helpers & package init (new)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py, sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
Added license-only __init__.py. Added small_training_cmd and small_training_finetune_cmd helpers that build CLI invocation strings for mock training/finetuning (params: path, max_steps, val_check, optional global batch size, devices, create_tflops_callback, additional args).
Test refactor (reuse helpers)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
Removed local command-builder functions and now import small_training_cmd and small_training_finetune_cmd from .common. Minor docstring formatting change.
Finetuning integration test (new)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
New pytest integration test test_train_evo2_finetune_runs and helper extract_val_losses. Runs mock pretraining and finetuning flows (including optional LoRA/PEFT paths and resume-from-LoRA), asserts logs, checkpoints, TensorBoard events, and monotonic validation-loss behavior across runs.
PEFT module header tidy
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
Removed a large license header block; no functional or API changes.
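The wiring described for the training entrypoint — a single Evo2LoRA object passed both as HyenaModel's model_transform and appended to the trainer callbacks — can be illustrated with a stripped-down stand-in. All class names below are toy placeholders, not the real bionemo/NeMo APIs:

```python
class ToyLoRATransform:
    """Stand-in for Evo2LoRA: a callable transform that also acts as a callback."""

    def __init__(self, peft_ckpt_path=None):
        self.peft_ckpt_path = peft_ckpt_path
        self.applied = False

    def __call__(self, model):
        # A real implementation would inject LoRA adapters into the model here.
        self.applied = True
        return model


class ToyModel:
    """Stand-in for HyenaModel, accepting an optional model_transform."""

    def __init__(self, model_transform=None):
        self.model_transform = model_transform


lora_transform = ToyLoRATransform(peft_ckpt_path=None)
model = ToyModel(model_transform=lora_transform)
callbacks = [lora_transform]  # the very same object is registered as a callback

# Applying the transform mutates shared state visible to both roles.
model.model_transform(model)
```

The design point this mirrors: because the transform and the callback are one object, state set while transforming the model is visible during callback hooks.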

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI (train_evo2)
  participant Train as train.py
  participant Data as MockDataModule
  participant Model as HyenaModel
  participant PEFT as Evo2LoRA
  participant CB as Callbacks

  CLI->>Train: Parse args (incl. sample counts, --lora-finetune, --lora-checkpoint-path)
  Train->>Data: Init(num_train_samples, num_val_samples, num_test_samples)
  alt lora_finetune enabled
    Train->>PEFT: Create Evo2LoRA(peft_ckpt_path=args.lora_checkpoint_path)
    Train->>Model: Init(model_transform=lora_transform)
    Train->>CB: Append lora_transform to callbacks
  else
    Train->>Model: Init(model_transform=None)
  end
  Train->>Model: Fit(data=Data, callbacks=CB)
  Model-->>CLI: Emit logs, checkpoints, TFEvents

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

I twitch my ears at flags anew,
Mock seeds counted through and through.
LoRA threads a tiny tune,
Callbacks clap beneath the moon.
Tests hop in — checkpoints in sight. 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description contains a short summary, type-of-change checkboxes, CI notes, and a pre-submit checklist, but it is missing the detailed "Description" required by the repository template and the "Usage" code snippet. It does not enumerate the specific file/behavior changes, test details, or the current test status (the "All existing tests pass successfully" item is unchecked), so the description is incomplete for reviewers. Resolution: expand the Description to list the key code and test changes and the motivation for the bug fix, add the Usage code snippet showing how to run the new tests or CLI flags, update the pre-submit checklist to state test results or remaining work, and note any CI labels required to run slow/notebook tests so reviewers can reproduce the CI runs.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title "[BIONEMO-2473] Added tests for Evo2 LoRA fine-tuning" is concise, directly related to the primary change (adding Evo2 LoRA fine-tuning tests), and includes the issue key; it accurately reflects the changes shown in the diff (new/updated tests and related train code adjustments) and is clear for teammates scanning history.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.


@jwilber
Collaborator

jwilber commented Sep 4, 2025

/ok to test e53f754

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)

24-27: Remove duplicated --limit-val-batches flag

Each helper appends --limit-val-batches twice. Keep a single occurrence to avoid confusion about which value “wins.”

-        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* --limit-val-batches 1 "
+        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* "
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
-        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* --limit-val-batches 1 "
+        "--model-size 1b_nv --num-layers 4 --hybrid-override-pattern SDH* "
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "

Also applies to: 44-47


20-29: Quote shell arguments to be path-safe

Paths (result dirs, ckpt dirs) should be shell-quoted to survive spaces/special chars.

+import shlex
@@
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args}"
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args}"
@@
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {prev_ckpt} "
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {shlex.quote(str(prev_ckpt))} "

Also applies to: 41-50
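As a quick, self-contained check of the quoting concern (the path below is made up):

```python
import shlex

result_dir = "/tmp/my results/run 1"  # a path containing spaces

unquoted = f"train_evo2 --result-dir {result_dir}"
quoted = f"train_evo2 --result-dir {shlex.quote(result_dir)}"

# Unquoted, the shell splits the path into three tokens after the flag;
# quoted, the path survives as a single argument.
print(shlex.split(unquoted))  # five tokens total
print(shlex.split(quoted))    # three tokens total
```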

sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

494-496: CLI type change to str is fine; add early path validation

If --lora-finetune is set and a checkpoint path is provided, fail fast when the path doesn’t exist.

     parser.add_argument("--lora-checkpoint-path", type=str, default=None, help="LoRA checkpoint path")

Add right after args are parsed (in train or just after parse_args returns):

@@ def train(args: argparse.Namespace) -> nl.Trainer:
+    if args.lora_finetune and args.lora_checkpoint_path:
+        if not Path(args.lora_checkpoint_path).exists():
+            raise FileNotFoundError(f"LoRA checkpoint path not found: {args.lora_checkpoint_path}")

657-659: Guard callback type and None

Appending lora_transform is correct when set; consider asserting it’s not None to catch misconfigurations early.

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if args.lora_finetune:
+        assert lora_transform is not None, "Evo2LoRA should be initialized when --lora-finetune is set."
+        callbacks.append(lora_transform)

665-672: Pass actual model type to FLOPs callback

Hardcoding "hyena" could skew FLOPs reporting for Mamba.

-        flop_meas_callback = FLOPsMeasurementCallback(
-            model_config,
-            data_module,
-            "hyena",
-        )
+        flop_meas_callback = FLOPsMeasurementCallback(model_config, data_module, model_type)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

52-61: Deduplicate --limit-val-batches and quote paths in mamba helpers

Mirror the fixes from common.py to avoid flag duplication and path issues.

-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
-        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- --limit-val-batches 1 "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
+        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- "
@@
-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
-        f"train_evo2 --mock-data --result-dir {path} --devices {devices} "
-        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- --limit-val-batches 1 "
+        f"train_evo2 --mock-data --result-dir {shlex.quote(str(path))} --devices {devices} "
+        "--model-size hybrid_mamba_8b --num-layers 2 --hybrid-override-pattern M- "
@@
-        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {prev_ckpt}"
+        f"--seq-length 16 --hidden-dropout 0.1 --attention-dropout 0.1 {additional_args} --ckpt-dir {shlex.quote(str(prev_ckpt))}"

Don’t forget to import shlex at top if applying.

Also applies to: 63-73

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c59d333 and e53f754.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/peft.py
🧰 Additional context used
🧬 Code graph analysis (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-29)
  • small_training_finetune_cmd (32-50)
sub-packages/bionemo-testing/src/bionemo/testing/subprocess_utils.py (1)
  • run_command_in_subprocess (108-129)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-29)
  • small_training_finetune_cmd (32-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (4)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1)

1-14: Package marker OK

Licensing header + package init only. No issues.

sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

38-50: Capture helpers LGTM

stdout/stderr capture with distributed state context is clean.

sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_lora.py (2)

109-121: Verify LoRA resume path semantics

You pass a trainer checkpoint dir as --lora-checkpoint-path. If Evo2LoRA expects a LoRA-specific checkpoint (not a full trainer ckpt), this may silently no-op or misload. Please confirm the expected format/path and adjust to the LoRA artifact location if different.

Would you like me to scan the repo for where Evo2LoRA writes its PEFT artifacts and update this test accordingly?


24-33: Test structure and assertions look solid

End-to-end pretrain → finetune flow and basic artifact checks are clear.

Also applies to: 71-79, 101-107

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch 2 times, most recently from f5cbdd6 to 852987b Compare September 21, 2025 15:26
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

788-804: Initialize lora_transform before model branching to avoid undefined variable.

lora_transform is only set in the Hyena branch; using --lora-finetune with non‑Hyena models leads to UnboundLocalError later.

Apply this diff:

     # Create model based on selected model type
+    lora_transform = None
     if model_type == "hyena":
         if args.model_size not in HYENA_MODEL_OPTIONS:
             raise ValueError(f"Invalid model size for Hyena: {args.model_size}")
         model_config = HYENA_MODEL_OPTIONS[args.model_size](**config_modifiers_init)
         if args.no_weight_decay_embeddings:
             # Override the default weight decay condition for Hyena with our bionemo version that also excludes
             #  embeddings
             model_config.hyena_no_weight_decay_cond_fn = hyena_no_weight_decay_cond_with_embeddings
-        # Lora adaptors configuration
-        lora_transform = None
+        # LoRA adapters configuration
         if args.lora_finetune:
             lora_transform = Evo2LoRA(peft_ckpt_path=args.lora_checkpoint_path)
 
         model = llm.HyenaModel(model_config, tokenizer=data_module.tokenizer, model_transform=lora_transform)
🧹 Nitpick comments (6)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

666-674: Use effective steps when early_stop_on_step is set (mock data sizing).

If --early-stop-on-step is used, training samples should reflect that to avoid oversizing mock data.

Apply this diff:

     if args.mock_data:
-        data_module = MockDataModule(
+        effective_max_steps = args.early_stop_on_step or args.max_steps
+        data_module = MockDataModule(
             seq_length=args.seq_length,
             micro_batch_size=args.micro_batch_size,
             global_batch_size=global_batch_size,
-            num_train_samples=args.max_steps * global_batch_size,
+            num_train_samples=effective_max_steps * global_batch_size,
             num_val_samples=args.limit_val_batches * global_batch_size,
             num_test_samples=1,
             num_workers=args.workers,
             tokenizer=tokenizer,
         )
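The sizing rule in this suggestion is easy to verify with standalone arithmetic (function name and values below are illustrative, not the repository's API):

```python
from typing import Optional


def mock_train_samples(
    max_steps: int,
    global_batch_size: int,
    early_stop_on_step: Optional[int] = None,
) -> int:
    """Size mock training data; prefer the early-stop step when set."""
    effective_max_steps = early_stop_on_step or max_steps
    return effective_max_steps * global_batch_size
```

With `max_steps=500` and `global_batch_size=8`, the mock dataset shrinks from 4000 to 24 samples when `early_stop_on_step=3` is supplied.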
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (5)

100-106: Stabilize assertion: check overall improvement instead of per‑step monotonicity.

Per‑step non‑increase is flaky on short, synthetic runs. Assert net improvement and parse all entries.

Apply this diff:

-    val_losses = extract_val_losses(stdout_pretrain, val_steps)
-
-    for i in range(1, len(val_losses)):
-        assert val_losses[i][1] <= val_losses[i - 1][1], (
-            f"Validation loss increased at step {val_losses[i][0]}: {val_losses[i][1]} > {val_losses[i - 1][1]}"
-        )
+    val_losses = extract_val_losses(stdout_pretrain, 1)
+    assert val_losses, "No validation-loss entries found in logs."
+    assert val_losses[-1][1] <= val_losses[0][1], (
+        f"Validation loss did not improve overall: first={val_losses[0][1]} last={val_losses[-1][1]}"
+    )
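extract_val_losses itself is not shown in this thread; a hypothetical regex-based sketch of such a parser, combined with the net-improvement assertion recommended in this comment, might look like (log format is assumed, not taken from the repository):

```python
import re


def extract_val_losses(stdout: str, min_entries: int = 1) -> list:
    """Parse (step, val_loss) pairs from training logs (hypothetical log format)."""
    pattern = re.compile(r"step[:=]?\s*(\d+).*?val_loss[:=]?\s*([0-9.]+)")
    losses = [(int(step), float(loss)) for step, loss in pattern.findall(stdout)]
    assert len(losses) >= min_entries, "No validation-loss entries found in logs."
    return losses


log = (
    "step 10 ... val_loss: 2.31\n"
    "step 20 ... val_loss: 2.05\n"
    "step 30 ... val_loss: 2.10\n"  # small per-step bump: fine for the net check
)
val_losses = extract_val_losses(log)

# Net-improvement check: robust to noisy per-step fluctuations on short runs.
assert val_losses[-1][1] <= val_losses[0][1]
```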

157-164: Apply the same stable assertion for finetune phase.

Apply this diff:

-    val_losses_ft = extract_val_losses(stdout_finetune, val_steps)
-
-    # Check that each validation loss is less than or equal to the previous one
-    for i in range(1, len(val_losses_ft)):
-        assert val_losses_ft[i][1] <= val_losses_ft[i - 1][1], (
-            f"Validation loss increased at step {val_losses_ft[i][0]}: {val_losses_ft[i][1]} > {val_losses_ft[i - 1][1]}"
-        )
+    val_losses_ft = extract_val_losses(stdout_finetune, 1)
+    assert val_losses_ft, "No validation-loss entries found in logs (finetune)."
+    assert val_losses_ft[-1][1] <= val_losses_ft[0][1], (
+        f"Finetune loss did not improve overall: first={val_losses_ft[0][1]} last={val_losses_ft[-1][1]}"
+    )

171-182: Avoid variable shadowing; keep resume stdout separate.

Reuse of stdout_finetune is confusing; use stdout_resume.

Apply this diff:

-        stdout_finetune: str = run_command_in_subprocess(command=command_resume_finetune, path=str(tmp_path))
+        stdout_resume: str = run_command_in_subprocess(command=command_resume_finetune, path=str(tmp_path))

196-202: Resume phase: use the renamed variable and stable assertion.

Apply this diff:

-        val_losses_ft = extract_val_losses(stdout_finetune, val_steps)
-
-        # Check that each validation loss is less than or equal to the previous one
-        for i in range(1, len(val_losses_ft)):
-            assert val_losses_ft[i][1] <= val_losses_ft[i - 1][1], (
-                f"Validation loss increased at step {val_losses_ft[i][0]}: {val_losses_ft[i][1]} > {val_losses_ft[i - 1][1]}"
-            )
+        val_losses_ft = extract_val_losses(stdout_resume, 1)
+        assert val_losses_ft, "No validation-loss entries found in logs (resume finetune)."
+        assert val_losses_ft[-1][1] <= val_losses_ft[0][1], (
+            f"Resume finetune loss did not improve overall: first={val_losses_ft[0][1]} last={val_losses_ft[-1][1]}"
+        )

107-112: Remove duplicate event-file existence check (already done above).

Apply this diff:

-    # Check if directory with tensorboard logs exists
-    assert tensorboard_dir.exists(), "TensorBoard logs folder does not exist."
-    # Recursively search for files with tensorboard logger
-    event_files = list(tensorboard_dir.rglob("events.out.tfevents*"))
-    assert event_files, f"No TensorBoard event files found under {tensorboard_dir}"
-    assert len(matching_subfolders) == 1, "Only one checkpoint subfolder should be found."
+    assert len(matching_subfolders) == 1, "Only one checkpoint subfolder should be found."
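The event-file discovery being deduplicated here can be exercised standalone; the file name below follows TensorBoard's `events.out.tfevents` prefix convention, but the directory layout is made up:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tensorboard_dir = Path(tmp) / "lightning_logs" / "version_0"
    tensorboard_dir.mkdir(parents=True)
    (tensorboard_dir / "events.out.tfevents.1700000000.host").touch()

    # Recursive glob mirrors the test's TensorBoard-artifact assertion.
    event_files = list(Path(tmp).rglob("events.out.tfevents*"))
    assert event_files, f"No TensorBoard event files found under {tmp}"
    found = len(event_files)
```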
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 437004d and b0ac603.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
✅ Files skipped from review due to trivial changes (1)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
🧰 Additional context used
🧬 Code graph analysis (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (3)
sub-packages/bionemo-testing/src/bionemo/testing/subprocess_utils.py (1)
  • run_command_in_subprocess (108-129)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-37)
  • small_training_finetune_cmd (40-60)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)
  • test_train_evo2_finetune_runs (103-171)
🔇 Additional comments (3)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

51-51: Correct import path for Evo2LoRA.

Importing Evo2LoRA from models.peft matches the new location.


612-614: CLI type change to str aligns with Evo2LoRA signature.

--lora-checkpoint-path as str matches the peft constructor.


827-829: Guard callback append to avoid UnboundLocalError for non‑Hyena.

Append only when the transform was constructed.

Apply this diff:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if lora_transform is not None:
+        callbacks.append(lora_transform)

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from b0ac603 to 0f54e08 Compare September 23, 2025 19:58
@balvisio
Collaborator Author

/ok to test 0f54e08

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (4)

54-60: Remove duplicate --limit-val-batches 1 flag in Mamba cmd.

Flag is set twice; keep one.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

67-73: Remove duplicate --limit-val-batches 1 flag in Mamba finetune cmd.

Same duplicate as above.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

77-85: Remove duplicate --limit-val-batches 1 flag in Llama cmd.

Simplify to a single occurrence.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "

91-98: Remove duplicate --limit-val-batches 1 flag in Llama finetune cmd.

Drop the repeated flag.

-        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} --limit-val-batches 1 "
+        f"--max-steps {max_steps} --warmup-steps 1 --val-check-interval {val_check} "
🧹 Nitpick comments (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1)

53-99: Consider moving Mamba/Llama helpers to common.py for reuse.

Unify helper builders (Mamba/Llama) in tests/bionemo/evo2/run/common.py to avoid drift with small_training_cmd variants and reduce duplication.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0ac603 and 0f54e08.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (2)
  • small_training_cmd (20-37)
  • small_training_finetune_cmd (40-60)
sub-packages/bionemo-testing/src/bionemo/testing/torch.py (1)
  • check_fp8_support (21-33)
🔇 Additional comments (2)
sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (2)

33-34: Relative import LGTM; fixes test-environment packaging issues.

Good switch to package-relative import; aligns with prior feedback and avoids ModuleNotFoundError in CI.


39-41: Docstring tweak LGTM.

No functional impact; concise and clear.

@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 0f54e08 to 79361ed Compare September 23, 2025 21:22
@balvisio
Collaborator Author

/ok to test 79361ed

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)

613-613: Validate LoRA checkpoint path before use.

Type change to str is fine, but add a lightweight existence check to fail fast if a bad path is provided.

Add before constructing Evo2LoRA (Hyena branch):

from pathlib import Path

if args.lora_finetune and args.lora_checkpoint_path:
    ckpt_p = Path(args.lora_checkpoint_path)
    if not ckpt_p.exists():
        raise FileNotFoundError(f"LoRA checkpoint not found: {ckpt_p}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f54e08 and 79361ed.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: changed-files
  • GitHub Check: pre-commit
  • GitHub Check: changed-dirs
  • GitHub Check: Analyze (rust)
🔇 Additional comments (3)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (3)

827-829: Fix UnboundLocalError: lora_transform is undefined for non‑Hyena models.

When --lora-finetune is used with mamba/llama, lora_transform isn’t defined, causing a crash.

Apply this minimal fix to avoid the exception:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if 'lora_transform' in locals() and lora_transform is not None:
+        callbacks.append(lora_transform)

Additionally, initialize lora_transform and fail fast when LoRA is requested for unsupported models (place just after determining model_type and before branching):

# before: "Create model based on selected model type"
lora_transform = None
if args.lora_finetune and model_type != "hyena":
    raise ValueError("--lora-finetune is currently supported only for Hyena models.")

51-51: Import path update verified; no stale imports remain.


671-674: Use eval batch size (no grad accumulation) for val/test sample counts.

Validation and test run without gradient accumulation, so divide args.grad_acc_batches out of the global batch size when deriving their sample counts.

-            num_train_samples=args.max_steps * global_batch_size,
-            num_val_samples=args.limit_val_batches * global_batch_size,
-            num_test_samples=1,
+            num_train_samples=args.max_steps * global_batch_size,
+            num_val_samples=args.limit_val_batches * max(1, global_batch_size // max(1, args.grad_acc_batches)),
+            num_test_samples=1,
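To make the arithmetic concrete, here is a minimal sketch of the suggested computation (the function name and the floor-at-1 guards are illustrative, not part of the PR):

```python
def eval_num_samples(limit_val_batches: int, global_batch_size: int, grad_acc_batches: int) -> int:
    """Eval runs without gradient accumulation, so divide grad_acc_batches
    out of the training global batch size (floored at 1) before scaling by
    the number of validation batches."""
    eval_batch_size = max(1, global_batch_size // max(1, grad_acc_batches))
    return limit_val_batches * eval_batch_size

# e.g. 4 val batches with a global batch of 16 built from 2 accumulation
# steps gives an effective eval batch of 8, hence 32 validation samples.
```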

@balvisio balvisio enabled auto-merge September 24, 2025 04:30
@balvisio balvisio force-pushed the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch from 46c8423 to 8027918 Compare September 24, 2025 14:04
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2)

613-613: Arg type change to str is fine; add lightweight CLI validation.

Ensure the flag is only used with --lora-finetune (and optionally check local file existence).

Add post-parse validation before returning from parse_args:

args_ns = parser.parse_args(args=args)
if args_ns.lora_checkpoint_path and not args_ns.lora_finetune:
    parser.error("--lora-checkpoint-path requires --lora-finetune.")
# Optional: check existence if intended to be a local path
# from pathlib import Path
# if args_ns.lora_checkpoint_path and not str(args_ns.lora_checkpoint_path).startswith(("s3://","gs://")) and not Path(args_ns.lora_checkpoint_path).exists():
#     parser.error(f"--lora-checkpoint-path not found: {args_ns.lora_checkpoint_path}")
return args_ns

671-673: Sanity-check MockDataModule sample counts; ensure test has at least one full batch.

Suggest using at least one global batch for test to avoid empty/partial-batch edge cases.

Apply this diff:

-            num_test_samples=1,
+            num_test_samples=global_batch_size,

Also, please confirm your NeMo MockDataModule supports the num_train_samples/num_val_samples/num_test_samples kwargs in your pinned version to avoid a TypeError at construction.
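One way to perform that confirmation without pinning to a specific NeMo version is a signature probe; the `supports_kwargs` helper below is a generic illustration, not a NeMo API:

```python
import inspect


def supports_kwargs(cls, *names: str) -> bool:
    """Return True if cls.__init__ explicitly accepts every keyword in
    `names`, or swallows arbitrary keywords via **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    return all(name in params for name in names)


# Probe before constructing, e.g.:
# if not supports_kwargs(MockDataModule, "num_train_samples",
#                        "num_val_samples", "num_test_samples"):
#     raise TypeError("Pinned MockDataModule lacks the sample-count kwargs")
```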

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46c8423 and 8027918.

📒 Files selected for processing (6)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (0 hunks)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (4 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py (1 hunks)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py (1 hunks)
💤 Files with no reviewable changes (1)
  • sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/common.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_train.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/test_finetune.py
  • sub-packages/bionemo-evo2/tests/bionemo/evo2/run/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (1)
sub-packages/bionemo-evo2/src/bionemo/evo2/models/peft.py (1)
  • Evo2LoRA (26-279)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (2)
sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py (2)

51-51: Evo2LoRA import path update looks correct.

Import aligns with the class location and downstream usage.


827-829: Fix UnboundLocalError: lora_transform may be undefined for non‑Hyena models.

If --lora-finetune is passed with mamba/llama, lora_transform is not set, causing an error. Append only when it exists; initialize before branching.

Apply this diff:

-    if args.lora_finetune:
-        callbacks.append(lora_transform)
+    if 'lora_transform' in locals() and lora_transform is not None:
+        callbacks.append(lora_transform)

And initialize before the model-type branching:

# before: if model_type == "hyena":
lora_transform = None

@balvisio balvisio disabled auto-merge September 24, 2025 15:26
@balvisio balvisio enabled auto-merge September 24, 2025 15:26
@balvisio
Collaborator Author

/ok to test 8027918

@balvisio
Collaborator Author

/ok to test 5126bea

@balvisio
Collaborator Author

/ok to test 2ba44da

@dorotat-nv dorotat-nv added ciflow:slow Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2 and removed SKIP_CI labels Sep 25, 2025
@balvisio
Collaborator Author

/ok to test 540edf4

@balvisio balvisio added this pull request to the merge queue Sep 25, 2025
Merged via the queue into main with commit dd4f626 Sep 25, 2025
19 checks passed
@balvisio balvisio deleted the dev/ba/BIONEMO-2473-add-evo2-lora-tests branch September 25, 2025 18:24

Labels

ciflow:slow Run slow single GPU integration tests marked as @pytest.mark.slow for bionemo2

6 participants