Deepy is WanGP's assistant for multi-step media work. It can generate, inspect, edit, extract, transcribe, merge, and transform image, video, and audio while keeping conversation context.
This guide covers:
- general guidelines
- enabling Deepy
- configuring Deepy in the web UI
- linking WanGP settings files to Deepy generation tools
- using selected and previous media naturally
- understanding which generation settings Deepy can override directly
- asking Deepy about available LoRAs and current defaults
- using Deepy from the CLI
Please note that Deepy can make errors, so be sure to verify Deepy's work.
Once enabled (see below), Deepy becomes accessible through its chat window, which opens when you click Ask Deepy in the left dock.
Deepy can generate images, video, and audio, and then combine them to produce new media. All content produced by Deepy appears in the Image / Video Gallery and the Audio Gallery at the top right of the WanGP Video Generator tab.
Deepy can also work with User Imported Media:
- Expand the Media Info / Late Post Processing / Import Media section
- Switch to the Import Media to Galleries tab
- Select the files to import
- Click Import Videos / Images / Audio Files
Once the media are in the galleries, you can refer to them using wording like the last audio file, the selected video or describe their content (Deepy will query the prompts stored in the generation metadata if they exist).
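The metadata lookup described above can be sketched roughly like this. This is a hypothetical illustration only; the real gallery structure and field names inside WanGP may differ:

```python
def find_media_by_description(gallery, query):
    """Match a natural-language reference against prompts stored in
    generation metadata, newest item first (illustrative only)."""
    for item in reversed(gallery):  # assume gallery is ordered oldest -> newest
        prompt = item.get("prompt", "")
        if query.lower() in prompt.lower():
            return item
    return None  # no match: Deepy would ask for clarification instead of guessing

gallery = [
    {"id": "image_1", "prompt": "a robot dancing in a nightclub"},
    {"id": "video_1", "prompt": "a dog playing under the rain"},
]
match = find_media_by_description(gallery, "robot dancing")
```

The point is simply that descriptive references ("the robot dancing image") only work for media whose generation metadata contains a prompt; imported files without metadata are easier to reach via positional references such as "the last audio file".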
Deepy cannot infer the best generation settings for your request on its own, since there are too many combinations and they depend on the generation model you want to use. Instead, Deepy relies on predefined Template Settings for its six main generation tools (Generate Image, Generate Video, Edit Image, Generate Video with Speaker, Generate Audio from Description, Generate Audio from Sample).
WanGP comes with built-in templates that are ready to use, but you can also link pre-saved settings files. You can access Deepy settings by clicking the Settings control on the right of the Deepy chat window.
In the web UI, Deepy settings changes take effect for the current Deepy session as soon as you make them. Click Save Deepy Settings at the bottom of the settings panel when you want to write those settings to disk for future WanGP sessions.
You can also define a default Width & Height for all generation tools in the Deepy Settings window. These are used only when the Use Properties defined in Templates Settings files checkbox is not checked. This is convenient when you want to override the values defined in the templates without modifying the templates themselves.
Last but not least, you can ask Deepy directly to override the following template settings: width, height, number of frames, fps, LoRAs, or number of inference steps.
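Conceptually, these per-request overrides behave like a shallow merge over the template, where only the supported keys are honored. The sketch below is illustrative; the actual key names in WanGP are assumptions:

```python
# Hypothetical key names for the per-request overrides Deepy accepts.
SUPPORTED_OVERRIDES = {"width", "height", "num_frames", "fps",
                       "loras", "num_inference_steps"}

def apply_request_overrides(template_settings, requested):
    """Return template settings with only the supported per-request
    overrides applied; unsupported keys are silently ignored."""
    merged = dict(template_settings)
    for key, value in requested.items():
        if key in SUPPORTED_OVERRIDES:
            merged[key] = value
    return merged

settings = apply_request_overrides(
    {"width": 1024, "height": 576, "num_inference_steps": 30},
    {"num_inference_steps": 8, "sampler": "euler"},  # "sampler" is not supported
)
```

Any other model-specific setting stays template-driven, which is why linked templates (described later) are the right place for them.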
Deepy is available only when both of these conditions are met:
- Enable Deepy is turned on.
- Prompt Enhancer is set to a supported Qwen3.5VL mode.
Open the Configuration plugin and go to the Prompt Enhancer / Deepy tab.
Required Prompt Enhancer modes:
- Qwen3.5VL Abliterated 4B
- Qwen3.5VL Abliterated 9B
Deepy settings in that tab:
- Enable Deepy: turns Deepy on or off
- Deepy VRAM Loading Mode: controls whether Deepy stays in VRAM, unloads when idle, or unloads only when another WanGP component needs VRAM. The longer Deepy stays in VRAM, the more responsive it is.
- Context Window Tokens: how much conversation and tool history Deepy tries to keep live
- Custom System Prompt: extra instructions appended to Deepy on the next user turn
When both requirements are met, the Ask Deepy launcher appears in the WanGP web UI.
Open Ask Deepy, then open the Settings panel.
The settings panel contains two expanded sections:
- Generation Properties
- Template Settings used by Tools
All changes in this panel are used immediately by the current Deepy web session. To keep them for future sessions, click Save Deepy Settings at the bottom of the panel.
- Auto-abort or remove Deepy-started generation on Stop/Reset: controls whether Deepy-created queue work is cancelled or removed when you stop or reset Deepy.
- Use Properties defined in Templates Settings files: when enabled, Deepy uses the selected tool template as-is. When disabled, Deepy still starts from the template, but replaces only width, height, video frame count, and seed with the panel defaults below.
- Width and Height: default size overrides, used only when template properties are disabled.
- Number of Frames: default frame-count override for Generate Video, used only when template properties are disabled.
- Seed (-1 for random): default seed override, used only when template properties are disabled. -1 means random.
Inference steps, FPS, LoRAs, and other model-specific values remain template-driven unless you ask for one of the supported per-request overrides described later in this guide.
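The precedence described above can be summarized in a small sketch. This is illustrative only; the real WanGP code and field names will differ:

```python
def resolve_generation_props(template, panel, use_template_props):
    """Pick width/height/frames/seed either from the tool template or
    from the Deepy panel defaults (hypothetical structure)."""
    if use_template_props:
        source = template
    else:
        # Panel defaults override only the keys they define.
        source = {**template, **panel}
    return {k: source[k] for k in ("width", "height", "num_frames", "seed")}

props = resolve_generation_props(
    template={"width": 1024, "height": 576, "num_frames": 81, "seed": 42},
    panel={"width": 1280, "height": 720, "num_frames": 97, "seed": -1},
    use_template_props=False,
)
```

With Use Properties defined in Templates Settings files checked, the template values would be returned unchanged instead.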
Deepy has 6 generation-tool template selectors:
- Video Generator
- Video With Speech
- Image Generator
- Image Editor
- Speech From Description
- Speech From Sample
Each row has:
- a dropdown that selects the current template for that tool
- + to link that tool to the currently selected WanGP user settings file (chosen in the dropdown in the upper-left part of the Video Generator tab)
- trash to remove the current live link and go back to the previous or default template
Changing a template selector updates the active Deepy web session immediately. Click Save Deepy Settings if you want to reuse the same selectors the next time you launch WanGP.
Deepy shows the selected template in the chat transcript for generation tools, for example:
Generate Image [Z Image Turbo]
Generate Video [LTX-2 2.3 Distilled]
Edit Image [Flux Klein 9B]
Click Save Deepy Settings at the bottom of the Deepy settings panel to persist the current web settings to disk.
That save includes:
- generation-property values such as auto-abort behavior, template-property usage, width, height, frame count, and seed
- the currently selected Deepy template for each generation tool
Deepy templates are either:
- built-in Deepy templates shipped with WanGP
- live links to WanGP user settings files
Practical workflow:
- configure a normal WanGP generation the way you want
- save it as a WanGP user settings file
- select that user settings JSON in WanGP's Lora / Settings dropdown
- open Deepy settings
- click + next to the Deepy tool you want to link
- confirm the link
When you use the tool later, Deepy reads the linked WanGP settings file directly, so changes to that file are picked up automatically.
- Only WanGP user settings selected from the Lora / Settings dropdown can be linked this way.
- System profiles and LoRA presets are rejected.
- If the linked WanGP settings file changes later, Deepy sees the updated content automatically.
- If the linked file disappears, Deepy falls back to that tool's default template.
- If the linked file still exists but is no longer eligible for that tool, the tool returns an eligibility error.
- Built-in templates cannot be deleted from the UI.
- Linked templates are the right place for model-specific settings that Deepy does not expose directly. Deepy can still override width, height, frame count, FPS, inference steps, and LoRAs on the supported tools.
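The fallback behavior for linked templates can be sketched as follows. This illustrates the rules above and is not WanGP's actual implementation:

```python
import os

def resolve_tool_template(link_path, default_template, is_eligible):
    """Resolve which template a Deepy tool should use, following the
    rules described above (hypothetical helper)."""
    if link_path is None:
        return default_template          # no live link: use the current template
    if not os.path.exists(link_path):
        return default_template          # linked file disappeared: fall back
    if not is_eligible(link_path):
        # File exists but is no longer valid for this tool.
        raise ValueError("linked settings file is not eligible for this tool")
    return link_path                     # read the linked file directly

template = resolve_tool_template(None, "builtin: Z Image Turbo", lambda p: True)
```

Because the linked file is read at use time rather than copied, edits to the settings file take effect on the next generation without relinking.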
Deepy is designed to let you refer to existing media naturally.
In practice, Deepy will usually:
- prefer the currently selected image, video, or audio item when you say selected, current, this image, this video, this audio, or this frame
- use the selected video's current playback time when you refer to the selected frame or the current frame
- resolve short references such as last image, previous video, or last audio
- resolve older outputs when you describe a previous result
- ask for clarification instead of inventing a result when a reference is ambiguous
You can still use internal media ids such as image_1 or video_3, but usually you do not need to.
For an image:
- click the image you want
- ask Deepy something like:
- edit this image so the sky is stormy
- inspect the selected image and tell me whether the hands look correct
- use the selected image as the start frame for a short video
- use this image and the last audio clip to make a talking video
For a video:
- select the video
- scrub the player to the moment you care about
- ask Deepy something like:
- inspect this frame and tell me whether the face is sharp
- extract the selected frame as an image
- cut a 3 second clip starting at the selected time
- transcribe this video
- mute this video
- replace the audio of the selected video with the last extracted audio
For audio:
- select or import an audio file
- ask Deepy something like:
- transcribe this audio
- transcribe this audio with word timestamps
- create speech from this sample saying: Welcome to WanGP
- use this audio with the selected image to make a talking video
If your voice sample is inside a video, Deepy can extract the audio first.
Deepy can also resolve references such as:
- last image
- previous video
- last audio
- the robot dancing image
- image_2
- video_3
Deepy can:
- generate images, edit images, generate videos, generate talking videos from a still image plus speech audio, and create speech audio from a voice description or a voice sample
- create solid-color frames for transitions, blank frames, or color cards
- inspect images and video frames, and read local image, video, or audio details such as dimensions, duration, FPS, frame count, or audio track count
- extract images, video clips, or audio clips; transcribe audio or video; mute videos; replace audio; resize/crop media; and merge videos
- tell you which LoRAs are available for the current generation tool and which defaults a generation tool will use right now
- answer WanGP-specific usage questions by searching the bundled docs
Deepy can transcribe either audio or video.
- Segment timestamps are returned by default.
- Ask for word timestamps if you need more detailed timing.
- If a source has multiple audio tracks, mention which track you want.
Example requests:
Transcribe the selected video.
Transcribe audio track 2 from the selected video.
Extract the video excerpt that starts with 'I will be back'.
Generate a cinematic image of a robot violinist on a rainy Paris rooftop at night.
Edit the selected image so the background becomes a neon alley while keeping the character identity, and use 8 inference steps.
Generate a short video of a paper boat floating through a glowing cave river at 24 fps with 97 frames and 8 inference steps.
Generate a video of a dog playing under the rain using the Lego LoRA.
Use the selected portrait and the last audio clip to make a talking video.
Create speech from this sample saying: Welcome to WanGP.
How do I use VACE for outpainting?
Multi-step requests:
1) Generate an image of a robot disco dancing on top of a horse in a nightclub.
2) Edit the image so the setting stays the same, but the robot has gotten off the horse and the horse is standing next to the robot.
3) Verify that the edited image matches the description; if it does not, generate another one.
4) Generate a transition between the two images.
Create a high quality portrait that represents you well. Then create a speech sample in which you introduce your capabilities. When done generate a talking video from the portrait and the generated speech.
Launch Deepy in CLI mode with:
python wgp.py --ask-deepy

At startup, the CLI prints the Deepy logo and preloads the prompt-enhancer runtime so Deepy is ready before the first prompt.
Interactive multiline entry:
- Enter: send the current prompt
- Ctrl+Enter: insert a newline on terminals that expose it
- Alt+Enter: insert a newline
- Ctrl+J: newline fallback
- Ctrl+S: stop the current Deepy turn while it is running
- Shift+Enter: not available here because the console reports it as plain Enter
The CLI has its own virtual gallery. Add files to it, select one, and optionally set a playback time or frame for the selected video.
Examples:
/video E:\media\my_clip.mp4
/frame 120
inspect the selected frame and tell me whether the subject is centered
/audio E:\media\voice.wav
transcribe the selected audio with word timestamps
When a Deepy tool generates media in CLI mode, the CLI prints the generated output path.
Media:
- /add <path>: add and select an image, video, or audio file
- /image <path>: add and select an image file
- /video <path>: add and select a video file
- /audio <path>: add and select an audio file
- /list [scope]: list known media; scope can be all, media, image, video, or audio
- /media [scope]: alias for /list
- /clear-media: remove all virtual gallery media
Selection:
- /select <ref>: select media by id, list index, or name fragment
- /select-video <media_id>: select a video by media id
- /selected: show the currently selected media
- /selected-video: show the selected video media id
- /time <secs>: set the selected video's playback time
- /frame [index]: show or set the selected video frame, 0-based
Deepy settings:
- /settings: show the current CLI Deepy settings
- /size [WxH]: show or set the default generation size and disable template properties
- /frames [count]: show or set the default gen_video frame count and disable template properties
- /seed [value]: show or set the default generation seed and disable template properties
- /template <tool> <variant>: set the template for any Deepy generation tool
- /templates [tool]: list available template variants
- /template-props [on|off]: show or toggle whether Deepy uses resolution, frame, and seed properties from templates
Session:
- /help: print the CLI command summary
- /reset: clear the Deepy conversation but keep the virtual gallery media
- /quit: exit the CLI session
Examples:
/template gen_image "Z Image Turbo"
/template gen_video "LTX-2 2.3 Distilled"
/size 1280x720
/frames 97
/seed -1
- Deepy works best when your request clearly states the goal and how current media should be reused.
- For multi-step tasks, list the steps in order.
- If you need a model-specific setting that Deepy cannot override directly, store it in the linked template.
- Ask Deepy for available LoRAs or current defaults when you switch templates and want to confirm the setup.
- For image and video requests, be explicit about any must-keep details such as subject identity, composition, or mood.
- If you want Deepy to use the current video moment, scrub the selected video first, then refer to this frame or the selected frame.
- For transcription, mention if you want word timestamps or a specific audio track.
- If a tool fails, Deepy will tell you rather than inventing a result.
- For WanGP-specific questions, you can ask Deepy directly instead of searching the docs manually.
- Install GGUF kernels for faster inference and lower VRAM usage.