feat: Add Bernini-R model support (Wan video) (CORE-279)#14216

Merged
alexisrolland merged 18 commits into Comfy-Org:master from kijai:bernini
Jun 9, 2026

@kijai kijai commented Jun 1, 2026

Collaborator

Adds support for the in-context (IC) conditioning that Bernini uses for Wan video.

The weights are identical in structure to Wan22, so I chose to add the conditioning support to that model instead of adding a new model subclass. The conditioning method is slightly different from what existing models use.

Bernini_testing_video_edit_02.json

Model weights:

https://huggingface.co/Comfy-Org/Bernini-R

Video edit:

Wan22_Bernini_00001.mp4
Wan22_Bernini_00023.mp4
Wan22_Bernini_00014.1.mp4

I2V:

Wan22_Bernini_00031.mp4

R2V:

Wan22_Bernini_00013.1.mp4

@alexisrolland changed the title from "feat: Add Bernini model support (Wan video)" to "feat: Add Bernini model support (Wan video) (CORE-279)" on Jun 5, 2026
@alexisrolland alexisrolland marked this pull request as ready for review June 5, 2026 20:50

coderabbitai Bot commented Jun 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 63e6c6dc-1afb-4e98-b7eb-51a76e557a72

📥 Commits

Reviewing files that changed from the base of the PR and between 04752b8 and e7a6411.

📒 Files selected for processing (1)
  • comfy_extras/nodes_bernini.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • comfy_extras/nodes_bernini.py

Walkthrough

This PR adds in-context reference conditioning to WAN models. It extends the RoPE rotation mechanism to accept a source_id parameter for spatial rotation injection, integrates context latent token padding and trimming into the forward pass with separate RoPE frequency generation per stream, adds context latent support to the WAN21 model base class for conditioning and context window slicing, and introduces a new Bernini node that accepts source video, reference video, and reference images—resizing them while preserving aspect ratio, VAE-encoding them, and writing the encoded context into both positive and negative conditioning.
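As a rough illustration of the pad-and-trim flow described in the walkthrough, the sketch below appends context tokens after the main tokens, tags each stream with its own source id, and trims the context off again afterwards. It is a toy model in plain Python, not the PR's code: `build_sequence`, `trim_to_main`, and the `(source_id, index)` token tuples are hypothetical names introduced here, and the real implementation works on latent tensors and RoPE frequencies.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a latent token: (source_id, index_within_stream).
Token = Tuple[int, int]


def build_sequence(main_len: int, context_lens: Sequence[int]) -> List[Token]:
    """Append context tokens after the main tokens, one source id per stream."""
    # source_id 0 marks the main stream (assumption made for this sketch).
    seq: List[Token] = [(0, i) for i in range(main_len)]
    for source_id, n in enumerate(context_lens, start=1):
        # Positions restart at 0 for every context stream.
        seq.extend((source_id, i) for i in range(n))
    return seq


def trim_to_main(seq: List[Token], main_len: int) -> List[Token]:
    """Drop the appended context tokens so the output matches the main length."""
    return seq[:main_len]


seq = build_sequence(main_len=4, context_lens=[2, 3])
main_only = trim_to_main(seq, main_len=4)
```

Here `seq` holds 9 tokens (4 main, then 2 and 3 context tokens), and `main_only` contains only the 4 main tokens again.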

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: adding Bernini-R model support with in-context conditioning for Wan video, and clearly references the associated ticket. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the in-context conditioning support added for Bernini-R, model weights location, and providing sample outputs demonstrating the functionality. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

zwukong commented Jun 8, 2026

With R2V, a person given as 4 reference views (front, left, back, and head) does not come out that similar to the reference. I wonder if we can get some improvements.

@kijai changed the title from "feat: Add Bernini model support (Wan video) (CORE-279)" to "feat: Add Bernini-R model support (Wan video) (CORE-279)" on Jun 9, 2026
Comment thread comfy_extras/nodes_bernini.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@comfy/ldm/wan/model.py`:
- Around line 1668-1669: SCAILWanModel.forward_orig() currently references
self.patch_embedding_mask and handles ref_mask_latents / sam_latents, but
patch_embedding_mask is only created in SCAIL2WanModel.__init__(), causing
AttributeError for base SCAILWanModel; fix by moving the mask-specific logic out
of the base forward_orig into an override on SCAIL2WanModel (implement
forward_orig or forward_mask_handling in SCAIL2WanModel that applies
patch_embedding_mask to ref_mask_latents/sam_latents), or alternatively add a
safe default on SCAILWanModel (e.g. set self.patch_embedding_mask = None in
SCAILWanModel.__init__() and guard uses with an if self.patch_embedding_mask is
not None check around the x = x + ... lines referenced). Ensure all occurrences
(forward_orig references at lines cited and handling near sam_latents) follow
the same pattern so base class remains backward-compatible.
- Around line 1732-1757: The branch currently converts reference lengths into
patch counts (ref_t_patches) and passes those to super().rope_encode, but
rope_encode expects frame counts and will itself convert to patches, causing
double-conversion when patch_size[0] != 1; change the logic to compute and pass
frame counts (use the frame dimension directly from reference_latent and
pose_latents—e.g., reference_latent.shape[2] and pose_latents.shape[-3]) into
rope_encode and compute main frames as t - ref_frames, so call
super().rope_encode(ref_frames, ...) and super().rope_encode(main_frames, ...)
instead of using ref_t_patches/main_t_patches while leaving pose handling
(F_pose) consistent.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a46af7b7-4d90-4f4f-b12b-daa3227d37e1

📥 Commits

Reviewing files that changed from the base of the PR and between 53316d5 and 6e72402.

📒 Files selected for processing (3)
  • comfy/ldm/wan/model.py
  • comfy/model_base.py
  • nodes.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • nodes.py
  • comfy/model_base.py

@coderabbitai coderabbitai Bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

🛑 Comments failed to post (2)
comfy/ldm/wan/model.py (2)

1668-1669: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t make the base SCAIL path depend on a SCAIL-2-only layer.

SCAILWanModel.forward_orig() now uses self.patch_embedding_mask(...), but that layer is only created in SCAIL2WanModel.__init__(). If ref_mask_latents or sam_latents get forwarded into a plain SCAILWanModel, this now crashes with AttributeError. Please move the mask-specific handling into a SCAIL2WanModel override, or initialize a safe default on the base class.

As per coding guidelines, comfy/** changes should preserve backward compatibility because breaking changes affect all custom nodes.

Also applies to: 1677-1678, 1810-1815
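The "safe default" option mentioned above could look roughly like the following sketch. The classes here are simplified stand-ins written for illustration, not the real `SCAILWanModel` and `SCAIL2WanModel`: `BaseModelSketch`, `Scail2ModelSketch`, and the scalar arithmetic are assumptions, and the real layer would be a module applied to latent tensors.

```python
class BaseModelSketch:
    """Hypothetical stand-in for the base model class."""

    def __init__(self):
        # Safe default: the base class has no mask embedding layer.
        self.patch_embedding_mask = None

    def forward_orig(self, x, ref_mask_latents=None):
        # Only apply the mask embedding when both the input and the layer exist,
        # so a plain base model ignores mask latents instead of raising AttributeError.
        if ref_mask_latents is not None and self.patch_embedding_mask is not None:
            x = x + self.patch_embedding_mask(ref_mask_latents)
        return x


class Scail2ModelSketch(BaseModelSketch):
    """Hypothetical stand-in for the subclass that owns the mask layer."""

    def __init__(self):
        super().__init__()
        # Toy "layer": doubles its input (illustrative only).
        self.patch_embedding_mask = lambda m: m * 2.0


base_out = BaseModelSketch().forward_orig(1.0, ref_mask_latents=3.0)
scail2_out = Scail2ModelSketch().forward_orig(1.0, ref_mask_latents=3.0)
```

With this guard the base class passes `x` through unchanged, while the subclass still adds the mask embedding.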

Source: Coding guidelines


1732-1757: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep replacement-mode rope_encode() in frame units.

This branch converts the reference length to patch units and then feeds those values back into super().rope_encode(), which already does its own patch-size conversion. With patch_size[0] != 1, the ref/main RoPE lengths drift from the tokenized sequence and break non-default temporal patch sizes.

Suggested fix
-            ref_t_patches = 0
+            ref_t = 0
             if reference_latent is not None:
-                ref_t_patches = (reference_latent.shape[2] + (self.patch_size[0] // 2)) // self.patch_size[0]
-            main_t_patches = t - ref_t_patches
+                ref_t = reference_latent.shape[2]
+            main_t = t - ref_t

             parts = []
-            if ref_t_patches > 0:
+            if ref_t > 0:
                 ref_tf = {"rope_options": {"shift_y": REF_ROPE_H, "shift_x": 0.0, "scale_y": 1.0, "scale_x": 1.0}}
-                parts.append(super().rope_encode(ref_t_patches, h, w, t_start=0, device=device, dtype=dtype, transformer_options=ref_tf))
-            if main_t_patches > 0:
-                parts.append(super().rope_encode(main_t_patches, h, w, t_start=0, device=device, dtype=dtype, transformer_options=transformer_options))
+                parts.append(super().rope_encode(ref_t, h, w, t_start=0, device=device, dtype=dtype, transformer_options=ref_tf))
+            if main_t > 0:
+                parts.append(super().rope_encode(main_t, h, w, t_start=0, device=device, dtype=dtype, transformer_options=transformer_options))

As per coding guidelines, comfy/** changes should preserve backward compatibility because breaking changes affect all custom nodes.
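To make the double-conversion concern concrete, here is a small arithmetic sketch. It assumes, as the review states, that the callee applies its own frame-to-patch conversion, and it reuses the rounding formula from the diff above for both steps. `to_patches` is a hypothetical helper name, not a function in the codebase.

```python
def to_patches(frames: int, patch_t: int) -> int:
    """Convert a frame count to temporal patches using the formula from the diff."""
    return (frames + patch_t // 2) // patch_t


frames = 8

# With a temporal patch size of 2, converting once gives the intended length.
once = to_patches(frames, patch_t=2)

# Converting an already-converted value again shrinks it a second time.
twice = to_patches(once, patch_t=2)

# With a temporal patch size of 1 the conversion is the identity, so the
# double conversion goes unnoticed.
once_p1 = to_patches(frames, patch_t=1)
twice_p1 = to_patches(once_p1, patch_t=1)
```

For 8 frames and a temporal patch size of 2, one conversion yields 4 patches while two conversions yield 2, which is the drift described in the comment. With a patch size of 1 both results stay at 8.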

Source: Coding guidelines

Comment thread comfy_extras/nodes_bernini.py Outdated
Co-authored-by: Alexis Rolland <alexis@comfy.org>
Comment thread comfy_extras/nodes_bernini.py Outdated
@alexisrolland alexisrolland self-requested a review June 9, 2026 23:46
@alexisrolland alexisrolland merged commit f8e51b6 into Comfy-Org:master Jun 9, 2026
14 checks passed