feat: Add Bernini-R model support (Wan video) (CORE-279) #14216
Merged
+163
−2
18 commits
19475fd Initial commit (kijai)
bb272ea better (kijai)
886b2e5 Merge remote-tracking branch 'upstream/master' into bernini (kijai)
46ba987 Cleanup (kijai)
f87432b Update nodes_bernini.py (kijai)
2c7d256 Maybe fix context windows for v2v (kijai)
f3d0b07 Cleanup (kijai)
1085cf2 Merge remote-tracking branch 'upstream/master' into bernini (kijai)
d9a28a9 Adjust context window (kijai)
471a20a Use separate reference image inputs instead (kijai)
53316d5 Merge branch 'master' into bernini (alexisrolland)
6e72402 Merge branch 'master' into bernini (alexisrolland)
32c8043 Merge remote-tracking branch 'upstream/master' into bernini (kijai)
8dd2242 Merge branch 'bernini' of https://github.com/kijai/ComfyUI into bernini (kijai)
4a6119b Adjust docstrins and tooltips (kijai)
04752b8 Update nodes_bernini.py (kijai)
01bb7f4 Apply suggestions from code review (alexisrolland)
e7a6411 Update comfy_extras/nodes_bernini.py (alexisrolland)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| import torch | ||
| from typing_extensions import override | ||
| | ||
| import comfy.model_management | ||
| import comfy.utils | ||
| import node_helpers | ||
| from comfy_api.latest import ComfyExtension, io | ||
| | ||
| | ||
| def _resize_long_edge(image, max_size, stride=16): | ||
|     """Resize (preserve aspect) so the long edge <= max_size, then snap each side to `stride`.""" | ||
|     h, w = image.shape[1], image.shape[2] | ||
|     scale = min(max_size / max(h, w), 1.0) | ||
|     nh = max(stride, round(h * scale / stride) * stride) | ||
|     nw = max(stride, round(w * scale / stride) * stride) | ||
|     return comfy.utils.common_upscale(image[:, :, :, :3].movedim(-1, 1), nw, nh, "area", "disabled").movedim(1, -1) | ||
| | ||
| | ||
| class BerniniConditioning(io.ComfyNode): | ||
|     """Bernini in-context conditioning for a Wan2.2-A14B model. | ||
| | ||
|     Attaches the VAE-encoded source video / reference images to the conditioning | ||
|     as `context_latents`: source video first, then the reference video, then each reference image. | ||
| | ||
|     The task is inferred from which inputs are connected: | ||
|         (nothing)                        -> t2v   (text-to-video) | ||
|         source_video                     -> v2v   (video-to-video) | ||
|         source_video + reference_images  -> rv2v  (reference-guided video editing) | ||
|         reference_images only            -> r2v   (reference-to-video) | ||
|         source_video + reference_video   -> ads2v (insert image/video into video) | ||
| | ||
|     source_video is the edit base / canvas (resized to width x height). | ||
|     reference_video is moving content to composite in. | ||
|     Streams are ordered source_video, reference_video, then reference_images -> source_id (1, 2, 3, ...). | ||
|     """ | ||
| | ||
|     @classmethod | ||
|     def define_schema(cls): | ||
|         return io.Schema( | ||
|             node_id="BerniniConditioning", | ||
|             display_name="Bernini Conditioning", | ||
|             category="conditioning/video_models", | ||
|             description="Conditioning node for Bernini in-context video/image conditioning. It can be used for the following tasks: t2v (text-to-video), v2v (video-to-video), rv2v (reference-guided video editing), r2v (reference-to-video), ads2v (insert image/video into video). " | ||
|                         "Reference images injected as in-context tokens (r2v, rv2v) are encoded independently at their own native aspect ratio (long edge capped at ref_max_size).", | ||
|             inputs=[ | ||
|                 io.Conditioning.Input("positive"), | ||
|                 io.Conditioning.Input("negative"), | ||
|                 io.Vae.Input("vae"), | ||
|                 io.Int.Input("width", default=832, min=16, max=8192, step=16), | ||
|                 io.Int.Input("height", default=480, min=16, max=8192, step=16), | ||
|                 io.Int.Input("length", default=81, min=1, max=8192, step=4), | ||
|                 io.Int.Input("batch_size", default=1, min=1, max=4096), | ||
|                 io.Image.Input("source_video", optional=True, tooltip=( | ||
|                     "Source video to edit or restyle (v2v, rv2v). Resized to width/height and trimmed to length.")), | ||
|                 io.Image.Input("reference_video", optional=True, tooltip=( | ||
|                     "Video to insert into the source video (ads2v).")), | ||
|                 io.Autogrow.Input("reference_images", optional=True, | ||
|                     template=io.Autogrow.TemplatePrefix( | ||
|                         input=io.Image.Input("reference_image", tooltip=( | ||
|                             "A reference image injected as an in-context token (task r2v or rv2v).")), | ||
|                         prefix="reference_image_", min=0, max=8)), | ||
|                 io.Int.Input("ref_max_size", default=848, min=16, max=8192, step=16, optional=True, tooltip=( | ||
|                     "Max size for the long edge of reference_video and reference_images. Resized with preserved aspect ratio and snapped to 16px.")), | ||
|             ], | ||
|             outputs=[ | ||
|                 io.Conditioning.Output(display_name="positive"), | ||
|                 io.Conditioning.Output(display_name="negative"), | ||
|                 io.Latent.Output(display_name="latent"), | ||
|             ], | ||
|         ) | ||
| | ||
|     @classmethod | ||
|     def execute(cls, positive, negative, vae, width, height, length, batch_size, | ||
|                 source_video=None, reference_video=None, reference_images=None, ref_max_size=848) -> io.NodeOutput: | ||
|         latent = torch.zeros([batch_size, 16, ((length - 1) // 4) + 1, height // 8, width // 8], | ||
|                              device=comfy.model_management.intermediate_device()) | ||
| | ||
|         # source_video (1), reference_video (2), reference_images (3, 4, ...). | ||
|         context = [] | ||
|         if source_video is not None: | ||
|             vid = comfy.utils.common_upscale(source_video[:length, :, :, :3].movedim(-1, 1), width, height, "area", "center").movedim(1, -1) | ||
|             context.append(vae.encode(vid[:, :, :, :3])) | ||
| | ||
|         if reference_video is not None: | ||
|             ref_vid = _resize_long_edge(reference_video[:length], ref_max_size)  # moving content, native aspect | ||
|             context.append(vae.encode(ref_vid[:, :, :, :3])) | ||
| | ||
|         # reference_images is an autogrow dict {reference_image_0: IMAGE, ...}; each slot is a | ||
|         # separate stream at its own native aspect (a multi-image batch in one slot -> one stream per frame). | ||
|         if reference_images: | ||
|             for name in sorted(reference_images): | ||
|                 imgs = reference_images[name] | ||
|                 if imgs is None: | ||
|                     continue | ||
|                 for i in range(imgs.shape[0]): | ||
|                     img = _resize_long_edge(imgs[i:i + 1], ref_max_size)  # native aspect per ref | ||
|                     context.append(vae.encode(img[:, :, :, :3])) | ||
| | ||
|         if context: | ||
|             positive = node_helpers.conditioning_set_values(positive, {"context_latents": context}) | ||
|             negative = node_helpers.conditioning_set_values(negative, {"context_latents": context}) | ||
| | ||
|         return io.NodeOutput(positive, negative, {"samples": latent}) | ||
| | ||
| | ||
| class BerniniExtension(ComfyExtension): | ||
|     @override | ||
|     async def get_node_list(self) -> list[type[io.ComfyNode]]: | ||
|         return [ | ||
|             BerniniConditioning, | ||
|         ] | ||
| | ||
| | ||
| async def comfy_entrypoint() -> BerniniExtension: | ||
|     return BerniniExtension() | ||
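
The size arithmetic in `_resize_long_edge` and the latent allocation in `execute` can be checked without torch or ComfyUI. The sketch below restates both formulas from the diff in plain Python. `snapped_size` and `latent_shape` are hypothetical helper names used only for illustration; they are not part of the PR, and no resampling or VAE encoding is performed.

```python
def snapped_size(h, w, max_size, stride=16):
    """Return (new_h, new_w) using the same arithmetic as _resize_long_edge.

    Hypothetical helper: only the size computation is mirrored here.
    """
    # Never upscale: the scale factor is capped at 1.0.
    scale = min(max_size / max(h, w), 1.0)
    # Snap each side to the nearest multiple of `stride`, never below one stride.
    nh = max(stride, round(h * scale / stride) * stride)
    nw = max(stride, round(w * scale / stride) * stride)
    return nh, nw


def latent_shape(batch_size, width, height, length, channels=16):
    """Shape of the zero latent allocated in execute(), as written in the diff."""
    # Temporal size is (length - 1) // 4 + 1; spatial sizes are divided by 8.
    return (batch_size, channels, (length - 1) // 4 + 1, height // 8, width // 8)


if __name__ == "__main__":
    print(snapped_size(1080, 1920, 848))  # (480, 848)
    print(snapped_size(8, 8, 848))        # (16, 16)
    print(latent_shape(1, 832, 480, 81))  # (1, 16, 21, 60, 104)
```

Because each side is rounded to a multiple of 16 independently, the aspect ratio is only approximately preserved: a 1920x1080 reference maps to 848x480, not 848x477. With the node defaults (832x480, 81 frames) the empty latent has 21 temporal steps at 104x60.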