Hi,
Congrats on your amazing work & thank you for releasing the code!
I have a question about the ControlNet conditioning.
In the paper, it says:
“2D trajectories Gaussian heatmap and concatenate the trajectories, instance points, and depth points to serve as control signal, which is injected into the Stable Video Diffusion (SVD) [5] using ControlNet.”
and also, in the Gradio demo code (gradio_run.py), I can see depth maps (DepthAnythingV2) and instance masks (SAM) being extracted.
However, in the main pipeline (pipeline_stable_video_diffusion_mask_control.py L452–457), only the 2D trajectory Gaussian heatmap appears to be passed as controlnet_cond.
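To make sure I'm reading the paper correctly, here is roughly what I expected the control signal to look like. This is just my own sketch based on the paper's wording, not your code, and the tensor shapes/variable names are made up:

```python
import torch

# My reading of the paper (a sketch, not the repo's actual code):
# the per-frame control maps are concatenated along the channel axis,
# and the result is what gets passed as controlnet_cond.

# hypothetical tensors, shape (batch, frames, 1, H, W) each
traj_heatmap = torch.randn(1, 14, 1, 64, 64)   # 2D trajectory Gaussian heatmap
instance_map = torch.randn(1, 14, 1, 64, 64)   # instance points (from SAM masks)
depth_map    = torch.randn(1, 14, 1, 64, 64)   # depth points (from DepthAnythingV2)

controlnet_cond = torch.cat([traj_heatmap, instance_map, depth_map], dim=2)
# -> (1, 14, 3, 64, 64), i.e. a 3-channel control signal per frame
```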
Could you clarify how the depth and instance signals are injected into ControlNet?
Thanks in advance.