Multiple control signal (2D traj, depth points, instance points) - where is related code? #5

@blacktime14

Hi,
Congrats on your amazing work & thank you for releasing the code!
I have a question about the ControlNet conditioning.

In the paper, it says
“2D trajectories Gaussian heatmap and concatenate the trajectories, instance points, and depth points to serve as control signal, which is injected into the Stable Video Diffusion (SVD) [5] using ControlNet.”
Also, the Gradio demo code (gradio_run.py) extracts depth maps (DepthAnythingV2) and instance masks (SAM).
However, in the main pipeline (pipeline_stable_video_diffusion_mask_control.py L452–457), only the 2D trajectory Gaussian heatmap appears to be passed as controlnet_cond.
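
To make the question concrete, below is a rough sketch of the kind of channel-wise concatenation the paper's sentence seems to describe. It is only an illustration under assumptions and is not taken from the repository: the function names `gaussian_heatmap` and `build_control_signal`, the tensor shapes, and the interpretation of "depth points" and "instance points" (each tracked point carrying the depth value and instance id sampled at its first-frame position) are all guesses.

```python
import numpy as np


def gaussian_heatmap(points_xy, height, width, sigma=3.0):
    """Render one Gaussian blob per (x, y) point into a single-channel map."""
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x, y in points_xy:
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob.astype(np.float32))
    return heatmap


def build_control_signal(traj_xy, depth_map, instance_mask, sigma=3.0):
    """Hypothetical helper: build a (T, 3, H, W) control tensor.

    traj_xy:       (T, N, 2) pixel coordinates of N tracked points over T frames
    depth_map:     (H, W) first-frame depth (e.g. from a depth estimator)
    instance_mask: (H, W) first-frame instance ids (e.g. from a segmenter)
    """
    num_frames, num_points, _ = traj_xy.shape
    height, width = depth_map.shape

    # Assumption: each point keeps the depth / instance id found at its
    # first-frame location and carries it along its trajectory.
    start = np.rint(traj_xy[0]).astype(int)
    start_x = np.clip(start[:, 0], 0, width - 1)
    start_y = np.clip(start[:, 1], 0, height - 1)
    point_depth = depth_map[start_y, start_x]
    point_inst = instance_mask[start_y, start_x]

    frames = []
    for t in range(num_frames):
        heat = gaussian_heatmap(traj_xy[t], height, width, sigma)
        depth_pts = np.zeros((height, width), dtype=np.float32)
        inst_pts = np.zeros((height, width), dtype=np.float32)
        for n in range(num_points):
            xi = int(round(float(traj_xy[t, n, 0])))
            yi = int(round(float(traj_xy[t, n, 1])))
            if 0 <= xi < width and 0 <= yi < height:
                depth_pts[yi, xi] = point_depth[n]
                inst_pts[yi, xi] = point_inst[n]
        # Channel-wise concatenation: [trajectory heatmap, depth, instance].
        frames.append(np.stack([heat, depth_pts, inst_pts], axis=0))
    return np.stack(frames, axis=0)
```

With a conditioning tensor like this, I would expect `controlnet_cond` to have three channels per frame rather than only the trajectory heatmap, which is why the single-signal call in the pipeline is confusing.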

Could you clarify how the depth and instance signals are injected into ControlNet?
Thanks in advance.
