Qwen Image 2512 & 2511 Training Results Are Next Level + 33 ComfyUI Presets for Image & Video Gen #355
FurkanGozukara
announced in
Tutorials
Full tutorial: https://www.youtube.com/watch?v=RcoXd9v1t_c
The Qwen Image 2512 text-to-image model is a massive upgrade in quality, just as Qwen Image Edit 2511 was for instruction-based image editing. I have trained all four models (the older Qwen Image base, the new Qwen Image 2512 base, the older Qwen Image Edit 2509, and the newer Qwen Image Edit 2511) and compared them in this video. The results are astonishing. Moreover, we have converted all of our premium SwarmUI image and video generation presets into ComfyUI workflows. Just drag and drop a workflow and start using it immediately. All models are downloaded with our premium 1-click model downloader app. ComfyUI and SwarmUI are also installed with 1-click installers, and our ComfyUI installation fully supports Sage Attention, Flash Attention, xFormers, Triton, and more on RTX 5000 series GPUs and on earlier GPUs such as the 2000, 3000, and 4000 series, on both Linux and Windows.
📂 Resources & Links:
ComfyUI Installer: [ https://www.patreon.com/posts/ComfyUI-Installers-105023709 ]
SwarmUI Installer, Model Auto Downloader and Presets: [ https://www.patreon.com/posts/SwarmUI-Install-Download-Models-Presets-114517862 ]
SwarmUI & ComfyUI Setup Guide for Windows: [ https://youtu.be/c3gEoAyL2IE ]
SwarmUI & ComfyUI Setup Guide for RunPod & Massed Compute: [ https://youtu.be/bBxgtVD3ek4 ]
Qwen Image Models Training Tutorial: [ https://youtu.be/DPX3eBTuO_Y ]
Wan 2.2 Model Training Tutorial: [ https://youtu.be/ocEkhAsPOs4 ]
Z Image Turbo Model Training Tutorial: [ https://youtu.be/ezD6QO14kRc ]
FLUX Models Fine Tuning / DreamBooth Training Tutorial: [ https://youtu.be/FvpWy1x5etM ]
FLUX Models LoRA Training Tutorial: [ https://youtu.be/nySGu12Y05k ]
Upload / Download Big Files Guide for RunPod & Massed Compute: [ https://youtu.be/X5WVZ0NMaTg ]
⏱️ Video Chapters:
00:00:00 Introduction to Qwen Image Base & 2512 Model Training Findings plus Qwen Image Edit 2509 & 2511 News
00:00:16 Announcement of Training Support for Qwen Image Edit Models & Upcoming Training Guide Overview
00:00:32 New FP8 Quantization Feature in SECourses Musubi Tuner & Full ComfyUI Preset Support Announcement
00:00:48 Converting All SwarmUI Presets into Drag-and-Drop ComfyUI Workflows & HTML Helper File Introduction
00:01:10 How to Use the New HTML File to Identify and Download Necessary Model Bundles via SwarmUI Downloader
00:01:24 Qwen Image Base Model Training Analysis: Observing Overfitting and Quality Degradation at 240 Epochs
00:02:04 Qwen Image 2512 Model Training Results: Massive Quality Boost & Reduced Noise Compared to Base Model
00:02:38 Showcasing New Qwen Image 2512 UHD Realism Presets Available in Both ComfyUI and SwarmUI
00:03:01 Visual Comparison: The Massive Leap in Realism Quality Between Previous Models and New 2512 Model
00:03:38 Instructions on Updating the Training Application via Bat File to Unlock New Qwen Model Support
00:04:03 Selecting Specific Model Versions in the Trainer: Text-to-Image vs Image Edit Plus 2509/2511 Variants
00:04:24 Using the Updated Model Downloader Application to Fetch New Qwen 2512 & 2511 Checkpoints Automatically
00:04:40 Wan 2.2 Training Tutorial Reference & Overview of Recently Updated High-Quality Presets for All Models
00:05:05 Training Dataset Configuration: Using Medium Quality Settings & OHWX Caption Strategy for Best Results
00:05:22 Testing Qwen Image Edit 2509 vs 2511 Models with GTA 5 Style Transfer Training on 120 and 180 Epochs
00:05:46 Analyzing GTA 5 Style Reproduction Accuracy on Both Edit Models Without Using Any Control Images
00:06:11 Style Transfer Experiment: Converting Input Image to Learned Style Using Edit Models Without Image-to-Image
00:06:40 Critical Findings on Edit Models: Lower Epoch Requirements & How Overtraining Alters Input Image Fidelity
00:07:09 Resolution Limitations for Style Transfer: Why 1328x1328 is the Optimal Sweet Spot for Editing Tasks
00:07:30 Reviewing Specific Presets Used for Qwen Image Edit Text-to-Image Generation vs Image Editing Tasks
00:08:02 How to Install and Update ComfyUI with the Latest Standalone Installer Script & Overwriting Old Files
00:08:15 Downloading Model Bundles for ComfyUI Presets Using the SwarmUI Downloader Tool & HTML Guide
00:08:39 Navigating the HTML Guide to Select Correct Model Bundles like Flux 2, Z Image Turbo or Qwen Core
00:09:12 Understanding Preset Limitations: Why Inpainting & Outpainting Workflows Require SwarmUI Editing Tab
00:09:57 Live Demonstration: Dragging and Dropping Flux SRPO Preset into ComfyUI Interface for Instant Setup
00:10:11 Configuring the Model Downloader to Target Your Specific ComfyUI Installation Path for Direct Downloads
00:10:47 Handling Potential VAE Subfolder Issues & Verifying High-Speed Downloads with 100% Hash Verification
00:11:06 Generating Images in ComfyUI: Python 3.10-3.13 Support, RTX 5000/4000/3000 Compatibility & Speed
00:11:54 Recommendation to Use SwarmUI with ComfyUI Backend for Best Experience & Access to All Features
00:12:20 Upcoming Tutorials Teaser: Trellis 2 3D Models & Video/Image Upscaler Guides Coming Soon
Video Transcription
00:00:00 Greetings everyone. Today I have got some important news for you. I have trained with Qwen
00:00:06 Image Base Model and Qwen Image 2512 model and we have got some very important findings. I also have
00:00:16 trained with Qwen Image Edit 2509 and 2511 models. I have got some very important news about that as
00:00:25 well. Furthermore, I will show how to train with newer Qwen Image models. Moreover, we have got a
00:00:32 new FP8 quantization in our SECourses Musubi Tuner application. This is the very best FP8
00:00:40 quantization. It supports so many new features. And we now fully support ComfyUI presets. What I
00:00:48 mean by that is, we were supporting so many amazing presets, which get updated all the time, with SwarmUI as
00:00:56 you know. Now we support all of them with ComfyUI as well. I have converted all of the presets into
00:01:04 ComfyUI workflows. So you will be able to just drag and drop them and use them right away. I also
00:01:10 have prepared an HTML file which shows you how to download the necessary preset models right away with
00:01:18 our SwarmUI downloader. So you will be able to know which bundle to download for which preset.
00:01:24 So let's begin with the Qwen Image new 2512 model training. I have done a completely fresh training of both
00:01:33 Qwen Image Base Model and Qwen Image 2512 model. So the Qwen Image Base Model, this is 120 epoch,
00:01:43 this is 180 epoch, and this is 240 epochs. As you can see it becomes more overtrained and the
00:01:51 quality drops. This is the realism preset, and these are grid images, so this is not cherry-picked. You see it
00:01:57 degrades over time. From this to this to this. So this is the quality of the Qwen Image Base Model,
00:02:04 the previous model. With the Qwen Image 2512 model, we got a huge boost in quality. So this
00:02:12 is 120 epoch. You see it has a lot of noise. This is 180 epoch. It becomes much better.
00:02:19 This is really good. And this is 240 epoch. It becomes perfect. You can also choose the middle
00:02:24 ground like 180 epochs, but the quality improvement is great compared to our
00:02:31 older model. So this was our older model realism preset and this is our newer model. So for newer
00:02:38 model I have used our newer preset Qwen Image 2512 UHD Realism. Also it shows the date as
00:02:45 well. Now this also exists in our ComfyUI presets. You can see the Qwen Image 2512 UHD High Realism
00:02:53 and Qwen Image 2512 UHD Realism presets. So you can use these presets in ComfyUI or in SwarmUI.
00:03:01 And we got a huge boost in realism. So this is 120 epoch, this is 180 epoch, and this is 240
00:03:08 epoch. You see the realism quality, it is just top level. And these are just grid images. 120 epoch,
00:03:15 180 epoch, 240 epoch. You see the quality jump from previous model to new model is just next
00:03:24 level. This was our previous model high realism and this is our new model high realism. It is
00:03:31 just amazingly better compared to the previous model. So how can you train with the newer model? Just
00:03:38 download the latest training zip file. The link will be in the description of the video. Overwrite
00:03:43 previous files, run the windows install and update .bat file. Then when you run the application in
00:03:50 the Qwen section, you will see that we now support choosing the model version this way. So you can pick the Qwen
00:03:57 Image older or newer model. These are the text to image models. Or you can
00:04:03 pick the Qwen Image Edit Plus 2509 and Qwen Image Edit 2511. 2511 is better. If you don't know how
00:04:11 to train, we have a full tutorial. It is here. The link will be also in the description of the video.
00:04:16 So it is exactly the same, nothing changed. You just follow this video. And now our model downloader supports
00:04:24 the new models 2512 and 2511. You see, when you start the model downloader you will see the newer 2512
00:04:33 model and 2511 model like this. We also now have a Wan 2.2 training tutorial. If you want you
00:04:40 can also watch this Wan 2.2 training tutorial and follow it. I have updated all the presets recently
00:04:47 so these are all up to date with highest quality. You can also read their descriptions and all of
00:04:53 these presets are also now available with ComfyUI. You know our ComfyUI installer is a standalone
00:04:59 installer. It is not mandatory to use it with SwarmUI. Also for training I used our classical
00:05:05 dataset. This is medium quality and the training caption was just OHWX as in the tutorial video.
00:05:12 For testing the Qwen Image Edit 2509 and 2511, I used our GTA 5 style. So these first two images
00:05:22 are for the base models. You see this is Qwen 2509, this is Qwen 2511. So this is what the
00:05:29 model generates without training. And as we train, you see this is 09, this is 11, 120 epochs. This
00:05:37 is 09, this is 11, 180 epochs, and you can see that both models generate the GTA 5 style exactly.
00:05:46 Perfect. Perfect quality. And this is 2509 and this is 2511 Qwen Image Edit model. I didn't
00:05:55 use control images so these are just trained with the training images, nothing else. So both models
00:06:02 learn perfectly. In my opinion, as expected, the 2511 model is better. It also has an advantage.
00:06:11 What advantage does it have? I made a test of converting my image into a learned-style image.
00:06:19 How did I do that? I gave this image as an input image, as a prompt image, and I told it to convert
00:06:25 it into OHWX style, nothing else. So this is not image to image. This is the Qwen Image Edit model's
00:06:32 capability of converting an image. I did a big test and what I noticed is that as you train more epochs,
00:06:40 it starts to change the input image itself. So this is more loyal to the input image and as I train
00:06:47 more it becomes more like this, more changed, more overtrained. So the Qwen Image Edit model does not
00:06:54 require as many epochs as the Qwen Image Base text to image model. This is what I have found.
00:07:01 Moreover, as you go with higher resolution, this model loses its stylization editing capability. So
00:07:09 this is very high resolution and as you can see it is not that good. So the best resolution is 1328
00:07:15 by 1328 for this task. But when we generate images with our preset we are upscaling and it is working
00:07:24 perfectly, but for the image editing task it didn't work that well for style transfer like this.
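As a rough illustration of the resolution finding above, the following Python sketch shrinks an input size so that its pixel count stays within 1328 by 1328 before a style-transfer edit. The function name, the no-upscaling rule, and rounding down to a multiple of 16 are assumptions made for illustration; they are not taken from the video or from any preset.

```python
import math

# Side length the video reports as the sweet spot for style-transfer edits.
SWEET_SPOT = 1328


def fit_to_pixel_budget(width, height, side=SWEET_SPOT, multiple=16):
    """Scale (width, height) so the area is at most side * side.

    Aspect ratio is kept approximately, and images are never upscaled.
    Rounding down to `multiple` is an assumption, not something the video states.
    """
    scale = min(1.0, math.sqrt((side * side) / (width * height)))
    new_width = max(multiple, int(width * scale) // multiple * multiple)
    new_height = max(multiple, int(height * scale) // multiple * multiple)
    return new_width, new_height


if __name__ == "__main__":
    # A large landscape input is reduced; a 1328x1328 input is left unchanged.
    print(fit_to_pixel_budget(4000, 2000))
    print(fit_to_pixel_budget(1328, 1328))
```

Treat the result only as a starting point: the video reports the finding for one trained style, so other datasets may behave differently.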
00:07:30 So for Qwen Image Edit models I have used the following presets. So this is text to image
00:07:38 generation with Qwen Image Edit model with stylization or with text to image model with
00:07:43 stylization. And for image editing I used this preset as I have shown in the previous tutorials.
00:07:50 Again just update the application and you can use these options to select which model you are going
00:07:55 to train. So how do you use ComfyUI? To use ComfyUI, download the latest zip
00:08:02 file. The link will be in the description of the video. Overwrite the older files. Just run the windows
00:08:07 install or update comfyui.bat file. This will install and update your installation. To download models
00:08:15 as a preset, as a bundle, you need to have the SwarmUI model downloader application. The link will be in
00:08:21 the description of the video. Download the latest one. And start the windows start download models
00:08:26 app.bat file. In the latest ComfyUI zip file you will see the presets and the presets are
00:08:31 like this. So in these presets, open the "which bundles download preset" HTML file. And according to the
00:08:39 model preset that you want to use, let's say you want to use FLUX 2, download the necessary bundle.
00:08:45 We have bundles for major models. You see we have Z Image Turbo Core bundle, FLUX 2 Low Ram bundle,
00:08:53 FLUX 2 Core bundle, Qwen Image Core bundle. When you open them you will see all the models it
00:08:59 downloads. When you download these models you will have all the models downloaded for using
00:09:04 the preset. Or Wan 2.2 Core bundle. Moreover you can see which bundles cover which presets from
00:09:12 here. We support all the presets in SwarmUI except 2 presets, which are outpainting and inpainting,
00:09:20 because these two presets depend on the image editing tab of SwarmUI. And we also have a full
00:09:27 tutorial for that. When you watch this tutorial, Qwen Image Models Realism, it shows you how to do
00:09:33 inpainting and outpainting. So I recommend following this video. For the Qwen Image Edit
00:09:38 newer model follow this video. We have videos for everything. You can just leave a comment, message
00:09:44 me from Patreon or from Discord and I can show you which tutorial you should follow for which task.
00:09:50 So let's make a demonstration of the ComfyUI. So I will drag and drop my FLUX SRPO. And it
00:09:57 has selected the FLUX SRPO model. Currently my ComfyUI sees all of my models inside the SwarmUI and
00:10:05 ComfyUI folders. By the way, when you download for ComfyUI, this is how you download the models. Go
00:10:11 to your ComfyUI installation, go to ComfyUI, go to models folder, copy the path like this. Paste the
00:10:18 path here. So you see you paste this models path, select this ComfyUI folder structure,
00:10:24 then the rest is the same. Just click download and it will download all the models. For example
00:10:28 let's download the Z Image Turbo Core bundle. It will start downloading right away in the fastest way,
00:10:34 with hash verification. So all of my downloads are 100 percent accurate. You see the speed is
00:10:41 100 megabytes per second. I will just cancel it since I have them. And you can also choose
00:10:47 different models from here. You may have issues with the VAE because the VAEs are downloaded into
00:10:54 subfolders in SwarmUI. So just pick the VAE from here if you don't see it immediately.
00:11:00 But it will see the others. So that's it. You just type your prompt, you change your output
00:11:06 resolution. Currently it is set to 1024. I just hit generate. Everything will work right away. I have
00:11:13 tested all the presets that I have here. So all of them work 100 percent with our installers,
00:11:20 because some of the presets depend on some custom extensions. Therefore you have to
00:11:26 use our installer for ComfyUI and our ComfyUI installer supports Python 3.10, Python 3.11,
00:11:33 Python 3.12, and Python 3.13 with Sage Attention, Flash Attention, Triton, all the necessary
00:11:41 libraries. It works perfectly on RTX 5000 series GPUs and on all other GPUs like the 4000, 3000, and 2000 series.
00:11:48 I support all of them. And the image has been generated like this. So this is how you use the
00:11:54 ComfyUI presets. They are all set. You don't need to do anything. You can download all the models
00:12:01 with our downloader. You can find which models are needed here. And the best part is you can use
00:12:08 SwarmUI with our ComfyUI installation as a backend and use everything in SwarmUI which I recommend.
00:12:14 Because SwarmUI is the best UI right now, the most up-to-date UI right now to do everything. And we have
00:12:20 tutorials for everything like image editing or inpainting, outpainting, fixing errors, whatever
00:12:28 you can imagine. And we have tutorials for all the trainings. Hopefully the Trellis 2 tutorial
00:12:35 is coming and the video and image upscaler tutorial is coming. So this is the news today
00:12:41 I wanted to show you. Stay subscribed, ask me any questions you want. Hopefully see you later.
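The download step in the transcript (around 00:10:28 to 00:10:54) mentions hash verification and VAE files that land in subfolders. The Python sketch below shows one generic way to do both checks by hand. It is not the downloader's actual code: the use of SHA-256, the function names, and the list of file extensions are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large checkpoints never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_download_intact(path, expected_hex):
    """Compare the file's SHA-256 with a published value (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


def find_model_files(folder, suffixes=(".safetensors", ".pt", ".ckpt")):
    """List weight files under `folder`, including nested subfolders.

    The suffix list is an assumption; adjust it to the files you actually use.
    """
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )
```

For example, calling `find_model_files` on the models path copied from the ComfyUI installation would list any VAE that sits in a nested folder, which can help when a file does not show up in the workflow's dropdown right away.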