First Ever SDXL Training With Kohya LoRA - Stable Diffusion XL Training Will Replace Older Models #226
FurkanGozukara
announced in
Tutorials
Full tutorial: https://www.youtube.com/watch?v=AY6DMBCIZ3A
Updated for SDXL 1.0. If you want to install the #Kohya SS GUI trainer and do #LoRA training with Stable Diffusion XL (#SDXL), this is the video you are looking for. I show how to install Kohya from scratch, the best parameters for LoRA training with SDXL, how to use Kohya SDXL LoRAs with ComfyUI, how to do checkpoint comparison with SDXL LoRAs, and much more.
Source GitHub Readme File⤵️
https://github.com/FurkanGozukara/Stable-Diffusion/blob/main/Tutorials/How-To-Install-And-Use-Kohya-GUI-And-Do-Ultra-Realistic-SDXL-Training-Tutorial.md
Automatic Kohya Installer Script File⤵️
https://www.patreon.com/posts/for-runpod-kohya-84898806
Our Discord server⤵️
https://bit.ly/SECoursesDiscord
If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on Patreon 🥰⤵️
https://www.patreon.com/SECourses
Technology & Science: News, Tips, Tutorials, Tricks, Best Applications, Guides, Reviews⤵️
https://www.youtube.com/playlist?list=PL_pbwdIyffsnkay6X91BWb9rrfLATUMr3
Playlist of #StableDiffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img⤵️
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
Learn step-by-step how to install Kohya GUI and do SDXL Stable Diffusion X-Large training from scratch. See example images of raw Stable Diffusion X-Large outputs after LoRA training. Find out how to optimize parameters, do checkpoint comparisons, generate hundreds of images, and use DeepFace AI to sort images by similarity. See how to leverage inpainting to boost image quality. Discover techniques to create stylized images with a realistic base. Get solutions to train on low VRAM GPUs or even CPUs. Resources include model links, prompt examples, parameter settings, workflows, and more. Whether experienced with AI image generation or just starting out, this in-depth tutorial will help take your Stable Diffusion skills to the next level.
Learn every step to install Kohya GUI from scratch and train the new Stable Diffusion X-Large (SDXL) model for state-of-the-art image generation. This in-depth tutorial will guide you to set up repositories, prepare datasets, optimize training parameters, and leverage techniques like LoRA and inpainting to achieve photorealistic results.
See examples of raw SDXL model outputs after custom training using real photos. Find out how to tune settings like learning rate, optimizers, batch size, and network rank to improve image quality and training speed. Learn to generate hundreds of samples and automatically sort them by similarity using DeepFace AI to easily cherrypick the best images.
Get solutions to train SDXL even with limited VRAM - use gradient checkpointing or offload training to Google Colab or RunPod. See how to create stylized images while retaining a photorealistic face using strength tuning and prompt engineering.
Whether you're experienced with AI image generation or just starting out, this extensive tutorial will take your Stable Diffusion abilities to the next level.
00:00:00 Introduction to SDXL LoRA training tutorial
00:02:50 How to install Kohya GUI trainer
00:04:58 How to start Kohya GUI trainer after the installation
00:05:35 Beginning to show all SDXL LoRA training setup and parameters on Kohya trainer
00:05:51 How to download SDXL model to use as a base training model
00:06:20 How to prepare training data with Kohya GUI
00:07:06 What is the repeating parameter of Kohya training
00:07:42 How to set classification images and which images to use as regularization images
00:08:52 How to prepare training dataset folders for Kohya LoRA / DreamBooth training
00:09:35 What are the ohwx rare token and the man class token
00:10:02 Why I am using real images as classification images for realism
00:10:35 Don't forget to copy the folders info
00:11:06 Best parameters for SDXL Kohya LoRA training
00:14:45 Explanation of Network Rank Dimension for LoRA
00:15:31 What is Network Alpha for LoRA training
00:16:18 How to save settings / configuration in Kohya GUI trainer
00:16:46 The importance and usage of print training command button
00:18:05 Why did I get an out of VRAM error (OOM)
00:19:02 What does the it / second you see in Kohya trainer or on SD web UIs mean
00:19:55 Where to see number of steps it will take
00:20:35 Where I shared my training command and how you can use it
00:20:46 How you can do training with my settings
00:21:40 How to use trained SDXL LoRA models with ComfyUI
00:23:00 How to do checkpoint comparison with Kohya LoRA SDXL in ComfyUI
00:27:05 How to generate amazing images after finding best training checkpoint
00:28:03 How to utilize AI to find best generated images very easily
00:31:42 How to find best images in certain looking direction / pose
00:32:03 How to inpaint generated image to improve quality and similarity
00:34:18 How to do SDXL LoRA training if you don't have a strong GPU
00:35:10 How to get stylized images such as GTA5 style images of yourself
00:39:00 Please help me. Thank you so much
Video Transcription
00:00:00 Greetings everyone, in this tutorial, I will show you how to install Kohya GUI trainer
00:00:05 from scratch.
00:00:06 How to do SDXL Stable Diffusion X-Large Kohya LoRA training from scratch.
00:00:12 How to generate amazing quality images after doing LoRA training as you are seeing right
00:00:18 now.
00:00:19 Here are two showcases for you.
00:00:21 On the left is my real image.
00:00:23 On the right we see a raw output of Kohya LoRA on SDXL.
00:00:29 And here another one: on the left it is my real image, on the right we see Kohya LoRA
00:00:35 trained model generated image, this is also raw output.
00:00:38 Moreover, I will show you how to do LoRA checkpoint comparison to find the best checkpoint after
00:00:45 training.
00:00:46 I will also show how to do inpainting to improve generated faces from your trained LoRA model.
00:00:53 Moreover, I will show how you can generate stylized images like this even on a realistic
00:01:00 workflow.
00:01:01 Don't worry if you don't have a strong GPU: I will show you how to do training even
00:01:07 without one.
00:01:09 But this is not all.
00:01:10 I will also show you how you can sort your generated images based on the similarity to
00:01:16 your training images.
00:01:17 That will make your job much much easier to find high quality images.
00:01:23 There isn't any workflow for SDXL yet.
00:01:27 I have done over 15 trainings to find some optimal parameters.
00:01:32 This is the first full tutorial that you will find for SDXL training.
00:01:37 So I have prepared a very detailed GitHub readme file.
00:01:41 All of the instructions, links and commands that we are going to use will be shared here
00:01:46 and I will update this file as necessary.
00:01:50 To be able to use Kohya GUI and follow this tutorial, you need to have Python and
00:01:56 Git installed on your computer.
00:01:58 If you don't know how to install them, I have an excellent tutorial here.
00:02:02 The link is here.
00:02:03 This is the download link.
00:02:04 Watch this tutorial video and you will learn how to install and use python and git.
00:02:08 To be able to use SDXL LoRAs, currently we need ComfyUI.
00:02:13 I have prepared an amazing tutorial for that as well.
00:02:16 Currently it is not published but when you are watching this video, the video link will
00:02:20 be here.
00:02:21 Its readme file is already ready.
00:02:24 I also have an amazing tutorial for Automatic1111 web UI, how to install and use it.
00:02:29 Automatic1111 is also working on SDXL implementation.
00:02:34 So when you are watching this video, probably it will also have SDXL Stable Diffusion X-Large
00:02:40 support.
00:02:41 So you can also watch this video and learn it.
00:02:43 One more thing that you need to do is you need
00:02:51 So this is the Kohya GUI repo link.
00:02:53 Right click, copy link address, enter the folder where you want to clone and install.
00:02:58 I will clone it into my F drive.
00:03:00 So I opened a new cmd window here, git clone and this is the URL.
00:03:06 It is cloned.
00:03:07 This is installation on computer.
00:03:09 You see, now we have our cloned folder here.
00:03:12 All you need to do is find setup.bat file, double click it.
00:03:17 It will create a new virtual environment so it won't affect your other installations
00:03:22 such as Stable Diffusion or other tools.
00:03:25 Ignore this message.
00:03:26 Wait until you get this screen, then select option 1, hit enter, select option 2.
00:03:33 Now we are using Torch version 2.
00:03:35 Using Torch version 2 is really important.
00:03:38 Now here, you won't see the progress, but you will see whenever the currently installing
00:03:45 package has been completed.
00:03:47 So you need to wait patiently.
00:03:50 This process may take a lot of time depending on your internet speed.
00:03:55 Currently my internet speed is 100 megabits per second.
00:03:59 So you see it is downloading the necessary files.
00:04:02 Just patiently wait.
00:04:04 The installation is continuing.
00:04:05 You will see the messages as it progresses like this.
00:04:08 Once all of the requirements have been installed, you will be asked several questions as you are
00:04:14 seeing right now.
00:04:15 Select this machine, select no distributed training, select no for this option.
00:04:20 Do you want to run your training on CPU only?
00:04:23 Do you wish to optimize your script with Torch dynamo?
00:04:25 This is not available for Windows yet.
00:04:28 So select no.
00:04:29 Do you want to use deep speed?
00:04:31 Select no.
00:04:32 And what GPU do you want to use?
00:04:33 We will select all.
00:04:35 And select BF16 here if you have RTX 2000, 3000 or 4000 series.
00:04:41 If you have GTX 1000 series or below, select FP16.
00:04:45 So I will select BF16.
00:04:47 You will see several other options here.
00:04:49 Install bitsandbytes-windows or manually configure accelerate.
00:04:52 You don't need to do them.
00:04:54 Then our installation has been completed.
00:04:56 Just close this cmd window.
00:04:59 After installation has been completed, all you need to do is double-click the gui.bat
00:05:03 file and it will start Kohya GUI as you are seeing right now.
00:05:06 It will install necessary requirements if there are any missing ones and it has started.
00:05:12 So we need to open this URL, copy it and paste it into your browser like this.
00:05:18 So this is the Kohya GUI screen.
00:05:20 In this tutorial, I will show how to do training on SDXL.
00:05:24 Between SDXL and SD 1.5, only the parameters change.
00:05:30 The rest of the usage is the same.
00:05:33 So let's begin.
00:05:34 So first we will begin with selecting our model.
00:05:36 Click here, select custom.
00:05:38 We need to select the model.
00:05:39 When you click here, it will allow you to select model.
00:05:43 My model is already downloaded in here.
00:05:46 SDXL base 0.9 version like this.
00:05:49 This is SDXL model, so I am selecting this option.
00:05:53 If you don't know where to download or how to download SDXL base version, this is the
00:05:58 link of it.
00:05:59 When you open this link, you need to have Hugging Face account logged in and it will
00:06:05 ask you to accept research agreement.
00:06:07 Just fill it and accept it.
00:06:09 With more details, I have explained this process in ComfyUI for SDXL tutorial, so you should
00:06:15 watch it as well.
00:06:16 Okay, we have set our base training model.
00:06:19 Then we will go to tools and prepare our training data set.
00:06:23 There is a deprecated tab here.
00:06:26 This is the tab where we will prepare our training folders.
00:06:30 The instance prompt will be ohwx.
00:06:32 The class prompt will be man.
00:06:34 Then I need to set my training images.
00:06:37 I have prepared my training images like this.
00:06:39 Each one of them is 1024 by 1024 pixels.
00:06:44 This is not a very good training image data set.
00:06:47 Why?
00:06:48 I have repeating backgrounds, I have repeating clothing, therefore this is not optimal.
00:06:53 But for now, I will use this and hopefully I am planning to make a tutorial for a very
00:06:59 good training data set preparation.
00:07:01 So I will copy the path from here, paste it here.
00:07:04 Now repeating parameter.
00:07:06 This parameter is not very well known.
00:07:09 I have asked the developer of Kohya to explain it in more detail.
00:07:15 However, I still haven't gotten an answer.
00:07:17 From my understanding, it will train each training image 20 times in 1 epoch,
00:07:27 each time pairing it with 1 different class image.
00:07:31 It is really hard to explain this way.
00:07:34 This is much simpler and easier with the DreamBooth extension of Automatic1111
00:07:38 web UI.
00:07:39 But for now, I will use this as 20.
00:07:42 Then we will set our classification / regularization images.
00:07:45 I have got an amazing classification images directory.
00:07:49 This is prepared from Unsplash.
00:07:51 How did I prepare these classification images?
00:07:56 I shared that in this YouTube tutorial video.
00:07:59 I also shared these classification images in this Patreon post.
00:08:03 This Patreon post contains classification images already prepared in all of these ratios
00:08:09 for you.
00:08:10 So let's copy its path.
00:08:12 This classification / regularization images will significantly improve our realism.
00:08:18 Because with these images, we will further fine-tune the Stable Diffusion X-Large model.
00:08:24 However, you don't have to use these images.
00:08:26 You can use ComfyUI, generate "photo of man" classification images and use them if you
00:08:31 wish.
00:08:32 Destination training directory.
00:08:33 This is important.
00:08:34 This is where all of the training files, LoRA outputs will be saved.
00:08:39 So click this icon, select the folder where you want your training to be saved.
00:08:43 I will make a new folder, Kohya SDXL tutorial files, and I will open this folder and select
00:08:51 it.
00:08:52 You see now it is written here.
00:08:54 Now click preparing data.
00:08:55 When I click this what will happen?
00:08:57 First of all, whenever you do something, you need to look at the CMD window and see what
00:09:03 is happening.
00:09:04 You see it has copied my training images into this folder.
00:09:08 It has copied my classification images into this folder and everything is set.
00:09:13 So it is inside F drive, inside my new folder.
00:09:17 So this is the folder structure.
00:09:19 Inside img you will see a folder named like this.
00:09:22 This naming is extremely important.
00:09:25 Without this naming convention, Kohya script will not work.
00:09:29 Okay, this is really important.
00:09:31 And also in reg folder, we have regularization images like this.
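The folder naming described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the Kohya convention of `<repeats>_<instance prompt> <class prompt>` for the training folder and a repeat count of 1 for the regularization folder; the destination path and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Tuple


def concept_folder_name(repeats: int, instance_prompt: str, class_prompt: str) -> str:
    """Build a folder name of the form '<repeats>_<instance prompt> <class prompt>'."""
    # Skip empty parts so a regularization folder becomes '1_man', not '1_ man'.
    prompt = " ".join(part for part in (instance_prompt, class_prompt) if part)
    return f"{repeats}_{prompt}"


def parse_concept_folder(name: str) -> Tuple[int, str]:
    """Split '<repeats>_<prompt>' back into (repeats, prompt)."""
    repeats, _, prompt = name.partition("_")
    return int(repeats), prompt


# Hypothetical destination directory; the prompts and repeat count follow the tutorial.
dest = Path("Kohya_SDXL_tutorial_files")
img_dir = dest / "img" / concept_folder_name(20, "ohwx", "man")
reg_dir = dest / "reg" / concept_folder_name(1, "", "man")  # assumed reg repeat count of 1

print(img_dir.as_posix())
print(reg_dir.as_posix())
```

Parsing the name back with `parse_concept_folder` is a quick way to check that a hand-made folder carries the repeat count you intended.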
00:09:34 You may be wondering what is this ohwx instance prompt?
00:09:38 What is this class prompt?
00:09:40 I agree with you.
00:09:42 These are like alien terms if you are not familiar with this training.
00:09:46 If you want to learn more about this rare token, classification images, I have an amazing
00:09:51 tutorial.
00:09:52 This is a master tutorial.
00:09:53 It is 100 minutes long.
00:09:55 Watch this tutorial.
00:09:56 It also has chapters and English subtitles manually fixed by me.
00:10:00 Some people are also asking me why I am using real images as regularization images.
00:10:06 Because if you are aiming for realism, then using real images will make your model even more
00:10:12 realistic.
00:10:13 As long as the images you use for classification are better than the model itself.
00:10:19 Since these are very high quality real images, it will further fine tune SDXL for realism
00:10:26 and it will improve my realism training.
00:10:29 However, if you want to do training with the style of the base model, then you should use
00:10:33 the images generated from the model itself.
00:10:35 Okay, finally, don't forget to click copy info to folders tab, because when we go back to training tab
00:10:42 and when we go to the folders tab, we will see all of the parameters like this.
00:10:47 Model output name.
00:10:48 Now this is important.
00:10:50 Whatever the name you give will be used when saving the checkpoints of LoRA training.
00:10:55 So I will name this as tutorial_video.
00:10:59 So this will be our output name of Kohya generated LoRA checkpoints.
00:11:04 Then the most important part: parameters.
00:11:06 I have done over 15 trainings to find optimal parameters for SDXL.
00:11:13 Let me show you some of the tests that I have made until finding some optimal parameters.
00:11:18 I have done so many tests and each one of them is taking huge time as you can imagine.
00:11:24 So finding optimal parameters with Kohya LoRA is extremely hard.
00:11:28 There is so much conflicting information on the internet.
00:11:31 There isn't any proper information.
00:11:34 So with SDXL, you have to do a lot of research.
00:11:37 There are some presets for SDXL, but they weren't any good.
00:11:41 So I am not going to use any preset.
00:11:44 Select LoRA type Standard.
00:11:46 Do not change anything in these two sections.
00:11:49 Train batch size 1.
00:11:50 Now this is very important.
00:11:52 If you are teaching a subject, using the minimum training batch size is better.
00:11:57 Because as you increase your training batch size, the generalization of the model will
00:12:01 decrease.
00:12:02 So when do you need a larger batch size?
00:12:03 You need to increase the training batch size if you have too many images to train,
00:12:08 for example if you are doing an overall fine-tuning of a model.
00:12:10 Like let's say you have 100,000 images to train, then you have to increase your batch
00:12:16 size because otherwise it will take forever to train.
00:12:19 Number of epochs.
00:12:20 Now this is important.
00:12:21 Since we set the number of repeats to 20,
00:12:24 our 1 epoch will actually be equal to 20 epochs,
00:12:29 because in each epoch, each one of the training images will be trained 20 times.
00:12:35 Therefore, I will do training up to 10 epochs.
00:12:39 This way, we will have a checkpoint every 20 effective epochs,
00:12:44 like I am showing in Automatic1111 web UI DreamBooth training.
00:12:48 So we will save a checkpoint every 1 epoch.
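The relation between Kohya epochs and repeats can be written out as a small calculation. This is only a sketch of the arithmetic described above; the function name is hypothetical.

```python
def effective_epochs(kohya_epochs: int, repeats: int) -> int:
    """Number of times each training image has been seen after the given Kohya epochs."""
    return kohya_epochs * repeats


REPEATS = 20  # repeat count used in this tutorial
TOTAL_EPOCHS = 10  # one checkpoint is saved per Kohya epoch

for epoch in range(1, TOTAL_EPOCHS + 1):
    passes = effective_epochs(epoch, REPEATS)
    print(f"checkpoint {epoch:2d}: each training image seen {passes} times")
```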
00:12:51 Select mixed precision BF16.
00:12:53 Select BF16.
00:12:54 If you get an error because you have an older card that doesn't support BF16, then use FP16.
00:12:59 Number of CPU threads per core.
00:13:01 This is two.
00:13:03 I don't change it.
00:13:05 Cache latents.
00:13:06 This will improve your training speed.
00:13:07 Cache latents to disk.
00:13:08 This will also improve your training speed.
00:13:10 Now learning rate.
00:13:11 I have tested so many learning rates and for SDXL with standard LoRA type, the learning
00:13:18 rate I found best is 4e-4.
00:13:21 So this is equal to this one.
00:13:24 You see 0.0004.
00:13:26 Optimizer.
00:13:27 This is really important.
00:13:28 With SDXL, we are using Adafactor optimizer.
00:13:33 These optimizers are explained in this Hugging Face page.
00:13:37 You see AdamW and there is Adafactor.
00:13:40 So Adafactor uses much less VRAM.
00:13:43 I think this is the optimizer algorithm that uses the least VRAM.
00:13:47 You can read this page and learn more if you wish.
00:13:50 Okay, so optimizer is Adafactor.
00:13:53 Actually, I just checked again and verified that my learning rate scheduler is constant
00:13:59 and learning rate warmup steps is 0.
00:14:01 So don't forget this.
00:14:02 Learning rate scheduler constant.
00:14:04 Learning rate warmup steps 0.
00:14:06 Optimizer extra arguments.
00:14:08 Now for SDXL, we need to use optimizer extra arguments.
00:14:13 I have shared the optimizer extra arguments in our GitHub readme file.
00:14:18 Copy them.
00:14:19 Paste them here.
00:14:20 So these are optimizer extra arguments.
00:14:21 Max resolution.
00:14:22 This is really important.
00:14:24 You need to make this 1024 and 1024.
00:14:28 Also, I set the Text Encoder learning rate and the UNET learning rate equal to the learning rate.
00:14:34 So all of these learning rates are the same as you are seeing right now.
00:14:38 Do not check this box.
00:14:39 I think when you check this box, it will not work.
00:14:42 Because in my experimentation, it wasn't working.
00:14:45 Check no half VAE for SDXL.
00:14:47 Network rank dimension.
00:14:50 Now some people are wondering what is this network rank dimension.
00:14:54 LoRA training is actually DreamBooth training, but the difference is that we are not training
00:15:00 the entire model itself.
00:15:02 We are training only some part of the model.
00:15:05 As you increase this network rank, you are training a larger part of the model.
00:15:10 Thus, you are able to learn more information, more details.
00:15:15 However, as you increase this network rank, it will use more VRAM and the checkpoint files
00:15:22 will take more space on your disk.
00:15:25 So this is the trade-off of network rank.
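One way to see the rank trade-off is to count trainable parameters. In the standard LoRA formulation, each adapted weight matrix of shape `d_out x d_in` gets two small matrices of rank `r`, so the added parameter count grows linearly with the rank. The sketch below assumes that formulation; the layer size is a hypothetical value chosen only for illustration and is not taken from SDXL.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank matrices: (rank x d_in) plus (d_out x rank)."""
    return rank * (d_in + d_out)


def full_layer_params(d_in: int, d_out: int) -> int:
    """Parameters in the original, frozen weight matrix."""
    return d_in * d_out


# Hypothetical layer size, for illustration only.
D_IN, D_OUT = 1280, 1280

for rank in (8, 32, 128, 256):
    lora = lora_trainable_params(D_IN, D_OUT, rank)
    share = lora / full_layer_params(D_IN, D_OUT)
    print(f"rank {rank:3d}: {lora:>9,} trainable parameters ({share:.1%} of the full layer)")
```

Doubling the rank doubles the trainable parameters per layer, which is consistent with the larger checkpoint files and higher VRAM use mentioned above.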
00:15:27 I am using 256 network rank for realism.
00:15:31 Network alpha.
00:15:32 There isn't very clear information regarding this, but as you increase this during the
00:15:37 training, the weight changes are stronger.
00:15:42 So as you increase this, actually it is amplifying the learning rate and this network alpha changes
00:15:48 according to the LoRA type that you are using.
00:15:51 For standard LoRA type, use network alpha 1 with these learning rates.
00:15:56 Do not increase it.
00:15:57 I have tested it.
00:15:58 If you increase it, it will get over-trained very quickly.
00:16:02 So network alpha is 1.
00:16:04 Network alpha will not change the size of your checkpoints.
00:16:09 It is just a parameter that is used during training when applying and changing the weights.
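The effect of network alpha can be sketched numerically. This assumes the usual LoRA convention, in which the low-rank update is multiplied by `alpha / rank` before it is added to the frozen weight; the function name is hypothetical.

```python
def lora_scale(network_alpha: float, network_rank: int) -> float:
    """Multiplier applied to the low-rank update, assuming scale = alpha / rank."""
    return network_alpha / network_rank


RANK = 256  # network rank used in this tutorial

for alpha in (1, 16, 128, 256):
    print(f"alpha {alpha:3d}, rank {RANK}: update scaled by {lora_scale(alpha, RANK):.6f}")
```

Under this assumption, alpha 1 with rank 256 scales the update by 1/256, while alpha 256 would apply it at full strength. That fits the observation that raising alpha behaves like raising the learning rate, and it also shows why alpha does not affect checkpoint size: it is a single scalar.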
00:16:15 So these are all the parameters.
00:16:17 Before starting training, you should save your configuration.
00:16:20 How?
00:16:21 Click save as.
00:16:23 Select the folder where you want to save.
00:16:25 Currently, this is the last folder.
00:16:26 I will save it as test video now.
00:16:29 I will save it as test video.
00:16:30 Save.
00:16:31 You see, this is the file.
00:16:33 And whenever you click save, now all of the configuration will be saved here.
00:16:37 After that, you can open and load and all of the settings you have will be loaded.
00:16:42 When you also open cmd window, you will see save as command and save commands here as
00:16:46 well.
00:16:47 So before doing training, you can click print training command.
00:16:50 This is very useful because it will show you how many images are found in your
00:16:56 set folders like this, how many steps they will get.
00:17:00 All of the information, all of the commands will be printed here.
00:17:03 I will also copy this command and then hit train model.
00:17:07 Then it will execute this command.
00:17:09 You will also get a message that a matching Triton is not available.
00:17:13 Some optimizations will not be enabled.
00:17:15 This Triton library is not available on Windows yet.
00:17:20 Therefore, you will also get this message if you are on Windows.
00:17:24 On Linux, it is working.
00:17:25 Also, you will see using DreamBooth method.
00:17:27 Why?
00:17:28 Because LoRA is also DreamBooth, but this is an optimized version of DreamBooth training.
00:17:33 Therefore, it is using fewer resources, and the trade-off is lower quality.
00:17:38 So it is going to start training.
00:17:39 Let me zoom in a little bit more.
00:17:41 Okay, it is loading checkpoint, using xFormers, caching images.
00:17:46 Since we have selected cache latents to disk as well, this caching is taking some
00:17:51 time.
00:17:52 A little bit increased VRAM usage at the moment.
00:17:55 And meanwhile, I am recording a video.
00:17:57 Since I am recording a video at the same time, my VRAM usage is higher than normal and GPU
00:18:03 usage is higher than normal.
00:18:05 I got an out of VRAM error because I didn't enable gradient checkpointing, and it looks like
00:18:11 even 24 GB VRAM is not sufficient without gradient checkpointing.
00:18:16 So after making this change, I click save.
00:18:20 So the settings are saved.
00:18:21 This is the current VRAM usage 1.2 GB.
00:18:25 Let's do the training again.
00:18:27 You don't need to do anything else.
00:18:28 Just go to the bottom and click train model again.
00:18:31 And it will start training again.
00:18:33 And let's see the VRAM usage this time.
00:18:35 I think even when I am recording video now, it should work.
00:18:39 I also closed some of the applications that were open.
00:18:40 Okay, you see this is the VRAM usage this time.
00:18:45 So from 1.2 GB to 19.1 GB, 20.7 GB.
00:18:52 Currently, it is using about 19 GB VRAM.
00:18:56 The training has started.
00:18:58 So the training speed is 1.5 seconds per it right now.
00:19:02 "It" means iteration, and in each iteration it will train the number of images that you
00:19:09 set in your training batch size.
00:19:11 So if my training batch size were 10, it would be training 10 images in each iteration.
00:19:18 But since it is 1, it is processing 1 image in 1.5 seconds right now.
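The relation between seconds per iteration, iterations per second, and images per second can be sketched as follows. The function names are hypothetical, and the numbers come from the speed quoted above. Progress bars commonly switch between "it/s" and "s/it" depending on which side of 1 the rate falls, so both readings describe the same speed.

```python
def iterations_per_second(seconds_per_iteration: float) -> float:
    """Convert an 's/it' reading into an 'it/s' reading."""
    return 1.0 / seconds_per_iteration


def images_per_second(batch_size: int, seconds_per_iteration: float) -> float:
    """Each iteration processes 'batch_size' images."""
    return batch_size / seconds_per_iteration


SECONDS_PER_IT = 1.5  # speed reported in the tutorial at batch size 1

print(f"{iterations_per_second(SECONDS_PER_IT):.2f} it/s")
print(f"{images_per_second(1, SECONDS_PER_IT):.2f} images/s at batch size 1")
# Note: a larger batch usually makes each iteration slower,
# so 1.5 s/it should not be assumed to hold at batch size 10.
```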
00:19:24 And how many iterations does it need?
00:19:25 It needs 5200 iterations to be completed.
00:19:30 Why?
00:19:31 Because we are doing 10 epochs, we have 13 images, and we are using classification images.
00:19:36 And when we do the calculation, it will be like this: 13 base training images multiplied
00:19:42 by 2,
00:19:43 because we are using classification images.
00:19:44 Multiplied by 10,
00:19:46 because we are doing 10 epochs.
00:19:48 Multiplied by 20,
00:19:49 because we have 20 repeats.
00:19:51 Therefore, the total number of steps is 5200.
00:19:55 This calculation is also printed at the beginning when you hit the print training command.
00:20:02 As you see, it is showing you the calculation of number of total steps.
00:20:06 If my batch size were 13, how many steps / iterations would it take?
00:20:11 We would just divide it by 13.
00:20:14 And it would take only 400 steps.
00:20:17 Because in each step in each iteration, it would be processing 13 images instead of 1
00:20:23 image.
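The step calculation above can be reproduced with a short function. This is a sketch of the arithmetic as described in the tutorial; the function name is hypothetical, and rounding up when the batch size does not divide evenly is an assumption.

```python
import math


def total_steps(num_images: int, repeats: int, epochs: int,
                batch_size: int = 1, use_reg_images: bool = True) -> int:
    """Total training steps: images x repeats x (2 if regularization images are used) x epochs."""
    images_per_epoch = num_images * repeats * (2 if use_reg_images else 1)
    # Assumption: a partial final batch still counts as one step.
    steps_per_epoch = math.ceil(images_per_epoch / batch_size)
    return steps_per_epoch * epochs


print(total_steps(num_images=13, repeats=20, epochs=10, batch_size=1))
print(total_steps(num_images=13, repeats=20, epochs=10, batch_size=13))
```

With the tutorial's values this gives 5200 steps at batch size 1 and 400 steps at batch size 13, matching the numbers quoted above.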
00:20:24 With these settings, the training is taking about 100 minutes on my computer.
00:20:28 I have already done it.
00:20:29 So I will now open it.
00:20:31 But before doing that, I will show you another very cool trick.
00:20:35 I shared the training command I used here: copy it, open Notepad++ or any text editor,
00:20:42 paste it, and you will see all of my training commands here.
00:20:46 You can change the folder names according to yours.
00:20:49 And then you can copy this command.
00:20:52 And how are you going to execute it?
00:20:54 Enter inside your Kohya installation, enter inside virtual environment, enter inside scripts,
00:20:59 open a new CMD, then type activate when you're inside scripts, then move back to the main
00:21:06 Kohya folder like this, and copy paste the command and hit enter.
00:21:11 This will be exactly the same as training from the GUI.
00:21:16 So it will start training from here with the same settings that we used in the GUI.
00:21:22 So you can also follow this strategy if you wish.
00:21:26 This will also work.
00:21:28 In here, you will also see all of the settings that are used in my training.
00:21:33 This is extremely convenient to check out and use.
00:21:36 You'll see it is starting the training exactly as it was in the GUI version.
00:21:41 So now I will open my Comfy UI to start using the LoRAs.
00:21:46 Okay, it is started.
00:21:48 You need to have workflows to start using SDXL with Comfy UI.
00:21:53 Everything you need with Comfy UI is explained in this tutorial video.
00:21:58 It is recorded but not posted on YouTube yet.
00:22:00 When I click here, I will get to the GitHub readme file that I have prepared for it.
00:22:06 And in here we have the workflows.
00:22:09 I will start with SDXL with LoRA workflows.
00:22:12 Save link as.
00:22:13 By the way, I have explained everything in the upcoming video.
00:22:18 So you should watch it before watching this one.
00:22:20 Let's download the PNG file into our folder as base, open it like this, then return back
00:22:28 to our ComfyUI and drag and drop.
00:22:31 And the workflow is loaded with SDXL base.
00:22:34 Here are my LoRAs.
00:22:36 Test9 LoRA is the LoRA that I generated with the same settings that I just showed you.
00:22:44 So you see I have several checkpoints.
00:22:47 And now I need to test checkpoints.
00:22:49 By the way, in the test9, I had used 25 repeats, not 20.
00:22:54 So I have two fewer checkpoints, with 8 epochs, but it doesn't matter much.
00:22:59 So first of all, you need to test different checkpoints to find the best output.
00:23:05 And how are you going to do that?
00:23:07 Define your prompt.
00:23:08 So I have shared the prompt in the GitHub readme file.
00:23:12 Copy the positive prompt.
00:23:14 This is a prompt that I did come up with like this.
00:23:17 Then copy the negative prompt from here and paste it here.
00:23:22 This is for generating your image in a suit in a very realistic way.
00:23:28 Of course, I can't say this is the best way.
00:23:30 But this is a decent way.
00:23:32 Then we will test each one of the checkpoints.
00:23:35 Before doing this test, let's make this as fixed seed.
00:23:40 Let's make the batch size 6.
00:23:42 I think I can even do 6 batch size, then give a file name prefix.
00:23:47 Let's say tutorial video.
00:23:49 Okay.
00:23:50 Actually, if I want to see them with the file, and if I begin from the test9, the first checkpoint
00:23:59 like this.
00:24:00 I can give it as a name tutorial video 1.
00:24:03 So I will know this is the first checkpoint of testing.
00:24:07 Okay, fixed seed.
00:24:09 These are the parameters.
00:24:11 Everything you see here is explained in the other video, and it will be published when
00:24:15 you are watching this video.
00:24:17 Okay, we are ready and just hit queue.
00:24:20 So it will start processing images, generating the images of training with the first checkpoint.
00:24:26 Then let's pick the second checkpoint like this.
00:24:29 The first checkpoint was actually 25 epochs because the repeat count was 25.
00:24:34 Okay, this one will be tutorial video 2.
00:24:37 Queue.
00:24:38 This is the second checkpoint.
00:24:39 Now it will generate the second checkpoint.
00:24:41 And let's see the third checkpoint.
00:24:43 Let's name it like this.
00:24:45 Queue.
00:24:46 Okay, the fourth checkpoint.
00:24:48 This is much easier in Auto1111 web UI.
00:24:51 However, in Comfy UI, I couldn't find any better way.
00:24:55 I also asked the Comfy UI developers and there wasn't any x/y/z checkpoint comparison.
00:25:00 So the fifth checkpoint, let's make this as fifth.
00:25:03 Queue, and then the sixth checkpoint.
00:25:06 Let's.
00:25:07 Oh, by the way, we didn't make the previous one.
00:25:11 Yeah, let's delete the previous queue.
00:25:13 So in the view queue, delete the last queue, let's go back to five.
00:25:18 Let's make this five.
00:25:20 Add queue, then select the checkpoint six.
00:25:23 Let's make prefix as six, hit queue, then let's select seven, make the prefix as seven.
00:25:30 Hit queue.
00:25:31 And the last one you see, like this.
00:25:34 It is the final checkpoint after the training has been completed.
00:25:38 And let's make it as 8.
00:25:41 So this is actually the eighth epoch, and the repeat count was 25.
00:25:45 So this is equal to 200 epochs in Auto1111 training. Hit queue.
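The arithmetic behind the 25 and 200 figures can be written out as a short sketch. It assumes, as described here, that one checkpoint is saved per epoch, that the repeat count is 25, and that an "epoch" in the Auto1111 sense means a single pass over each training image. The function name is made up for illustration.

```python
def equivalent_epochs(saved_epoch, repeats):
    """Number of passes over each training image after `saved_epoch` epochs."""
    return saved_epoch * repeats


repeats = 25  # repeat count used in this training run
for saved_epoch in range(1, 9):
    passes = equivalent_epochs(saved_epoch, repeats)
    print(f"checkpoint {saved_epoch}: {passes} passes over each training image")
```

So the first checkpoint corresponds to 25 passes and the eighth, final checkpoint to 8 x 25 = 200.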
00:25:51 Now I just need to wait patiently.
00:25:53 Let's look at the results.
00:25:54 So the results will be saved inside ComfyUI, inside output.
00:26:00 These are all of the images that I generated yesterday.
00:26:02 I will also show you them.
00:26:05 Let's sort by date modified.
00:26:07 Okay, we start seeing them here.
00:26:10 Tutorial video 1.
00:26:12 These are the images generated by the first checkpoint.
00:26:14 So the images for all checkpoints are generated.
00:26:17 It is now time to compare them and find the best looking checkpoint.
00:26:22 This part is totally subjective and totally depends on you.
00:26:26 You have to compare each checkpoint and find the best looking picture.
00:26:30 Unfortunately, this is not as easy as using the X/Y/Z checkpoint comparison in the Automatic1111 web UI.
00:26:36 But this is what we have right now.
00:26:40 So you need to look at each one of the checkpoints and decide which one looks best.
00:26:47 Even the last checkpoint is not overtrained.
00:26:49 I can see that.
00:26:50 So look at each checkpoint and decide which one looks best.
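Because every checkpoint was queued with its own file name prefix, the output folder can be grouped by prefix before comparing. This is a minimal sketch that assumes the saved files follow a `<prefix>_<5-digit counter>_.png` pattern; that pattern and the example file names are assumptions, so adjust the regular expression to whatever the output folder actually contains.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <prefix>_<5-digit counter>_.png
PATTERN = re.compile(r"^(?P<prefix>.+)_(?P<counter>\d{5})_\.png$")


def group_by_prefix(filenames):
    """Map each file name prefix to the sorted list of image files that carry it."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = PATTERN.match(name)
        if match:  # silently skip anything that is not a generated image
            groups[match.group("prefix")].append(name)
    return dict(groups)


# Example file names are placeholders for illustration.
files = [
    "tutorial_video_1_00001_.png",
    "tutorial_video_1_00002_.png",
    "tutorial_video_2_00001_.png",
    "notes.txt",
]
groups = group_by_prefix(files)
for prefix, names in groups.items():
    print(prefix, len(names))
```

With the real folder, `os.listdir` on the output directory would supply the file names, and each group can then be viewed side by side.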
00:26:56 After looking at each checkpoint, I think checkpoint 6 looks very good.
00:27:02 I will go with this one, checkpoint 6.
00:27:04 So now what we are going to do is we will select our best checkpoint from here.
00:27:09 We will decide our prompts and generate hundreds of images to find the best ones.
00:27:15 How are we going to do that?
00:27:17 You see, I have selected my checkpoint, checkpoint 6. Then click extra options and increase the batch
00:27:24 count.
00:27:25 Let's generate 100 white suits.
00:27:26 Queue prompt.
00:27:28 Then let's generate 100 blue suits.
00:27:30 Wait until you see the queue size reach 100, then hit queue prompt, and the 100 blue
00:27:38 suit prompts are also queued. Add to the queue any other prompts that you like and want
00:27:45 to generate.
00:27:46 What you do after all images are generated is extremely important.
00:27:51 You can look through all of the images and find the good ones.
00:27:56 However, this is very tiresome.
00:27:58 This is very hard to do.
00:28:01 So what else can you do?
00:28:03 I have an amazing tutorial on how to find the best Stable Diffusion generated images by using
00:28:08 DeepFace AI.
00:28:09 This is the tutorial link.
00:28:11 The script used is shown in the video.
00:28:14 I also shared the script in this Patreon post.
00:28:17 So go to this Patreon post and download the findBestImages.py file.
00:28:22 So the file is here.
00:28:24 Copy all of the generated images and put them into any folder.
00:28:28 Sorted images tutorial, like this.
00:28:30 Paste them there.
00:28:31 Okay, all images are here.
00:28:33 They are not yet sorted by similarity.
00:28:36 Then go to your training image dataset and select the one image that you would like them
00:28:42 to be sorted against.
00:28:43 The full logic of this script is explained in this video.
00:28:48 So I will do two sortings.
00:28:50 The first one will be according to similarity to this picture.
00:28:56 I copy this picture, then let's name it org image 1.
00:29:00 Okay, this will be the folder.
00:29:03 Copy its path and edit the findBestImages.py file.
00:29:07 So I will set the original image to be compared like this.
00:29:11 Give the path of the sorted images like here and give any name for the detected images.
00:29:16 Then open a command line and type python findBestImages.py.
00:29:22 It will sort all of the images according to the picture you picked.
00:29:27 Let's see the sorting progress.
00:29:29 For this to work, all you need is a Python installation and several other dependencies.
00:29:34 Everything is explained in this video.
00:29:38 It is so simple.
00:29:39 You see, currently it is calculating the similarity between this picture and the pictures that
00:29:45 I am using here.
00:29:47 So what this script will do is sort the images according to the similarity.
00:29:52 Therefore, it will be very easy for me to compare them and find the best looking ones.
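The findBestImages.py script itself is not reproduced in this transcript, so the following is only a sketch of the general idea: score every generated image against one reference image and order them from most to least similar. The function names, the file names, and the stand-in distance values are all hypothetical. With the deepface package installed, a call such as `DeepFace.verify(img1_path=a, img2_path=b)["distance"]` could plausibly serve as the distance function, but that is an assumption about how the real script works.

```python
def rank_by_similarity(reference, candidates, distance_fn):
    """Return (candidate, distance) pairs, most similar (smallest distance) first."""
    scored = [(path, distance_fn(reference, path)) for path in candidates]
    return sorted(scored, key=lambda item: item[1])


def ranked_names(ranking):
    """Prefix each file name with its rank so a file browser lists the best matches first."""
    return [f"{rank:04d}_{name}" for rank, (name, _) in enumerate(ranking, start=1)]


# Stand-in distances for demonstration only; a real run would compute them from the images.
fake_distances = {"a.png": 0.42, "b.png": 0.13, "c.png": 0.27}

ranking = rank_by_similarity(
    "org_image_1.png",
    list(fake_distances),
    lambda reference, path: fake_distances[path],
)
print(ranked_names(ranking))
```

Keeping the distance function as a parameter makes it easy to run the sort twice with two different reference images, as is done here for a front-facing and a side-facing picture.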
00:29:58 I already have images sorted according to two different images.
00:30:02 One with this image and one with this image.
00:30:05 So let's first see the images that are similar to this one.
00:30:09 The sorted images are here.
00:30:11 So these images are sorted according to their similarity to this image.
00:30:16 So you see the similarities are above 85%, up to 90%.
00:30:21 Not all images will be perfect,
00:30:23 because the script is not able to consider the beauty of the eyes; it only looks for overall
00:30:28 general similarity.
00:30:30 For example, this one.
00:30:31 This one is a very decent image.
00:30:33 Okay, now let me show you some examples.
00:30:35 On the left is our real training image; on the right is our raw generated image, as you are
00:30:39 seeing.
00:30:40 Here, another one.
00:30:42 Okay, here is another image sorted by similarity. It is really, really good.
00:30:46 And this is with the SDXL 0.9 version, not even the official 1.0 release.
00:30:52 We are not using the refiner.
00:30:54 I am pretty sure there will also be refiner training and we will get amazing quality with
00:31:00 SDXL.
00:31:01 I am pretty sure of it.
00:31:02 Let me also show you the other sorted images folder.
00:31:06 So with my script it will be very easy for you to find the good looking images without
00:31:12 looking through thousands of images.
00:31:14 It will help you tremendously.
00:31:16 So in this one, you see I am looking to this side.
00:31:19 Why?
00:31:20 Because when sorting them I used this training image, where you see I am looking to the side.
00:31:26 Therefore now I am finding all of the images in which I am looking in the same direction.
00:31:32 The quality is really, really good.
00:31:35 It is finding the images really, really well.
00:31:38 They can be further improved with inpainting.
00:31:41 I will also show that.
00:31:43 So if you have a specific pose, you can use that specific pose to sort the images and
00:31:50 find images with that specific pose.
00:31:52 For example, this one is a really good one, or this one.
00:31:56 And here is how you can further improve the quality and the similarity.
00:32:00 For example, let's take this picture and inpaint it.
00:32:04 For inpainting I will refer to this tutorial.
00:32:07 I have the inpainting workflow here.
00:32:09 Right click SDXL LoRA inpaint.
00:32:11 Save link as, and download it anywhere you wish.
00:32:14 Open the folder, then go to ComfyUI.
00:32:17 Load the inpainting workflow like this.
00:32:19 Also, let's open the sorted image so we will have the seed value and the prompt.
00:32:24 So this is the prompt.
00:32:25 Okay I copy paste the prompt.
00:32:27 Let's also find the seed value.
00:32:29 It is here.
00:32:30 This is the seed value.
00:32:32 I copy paste the seed value and I also need to select this safetensors file.
00:32:37 So the safetensors file was 9.
00:32:40 So this is the safetensors file.
00:32:42 Test9, sixth epoch.
00:32:44 Okay I have it.
00:32:46 Let's make the batch size 1.
00:32:47 We also need to choose the file to upload for inpainting.
00:32:50 So let's copy this file into here.
00:32:53 It will be easier to find.
00:32:55 Choose upload.
00:32:56 Let's go here.
00:32:57 This is the file I have. Right click, and here go to open in mask editor, change the
00:33:04 thickness and select the face.
00:33:05 Like this.
00:33:07 This is not as easy as in Auto1111, unfortunately.
00:33:10 This.
00:33:11 Okay save to node.
00:33:12 Okay, grow mask by 64 pixels.
00:33:16 The denoise is 80%.
00:33:19 Okay, let's add to the queue.
00:33:21 By the way, this mask may not be very good, so let's see the result.
00:33:26 You need to play with the denoise strength and the mask to get better results.
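To get a feel for what "grow mask by 64 pixels" does to the painted face region, here is a toy sketch of mask growth as a simple dilation with a square neighbourhood on a tiny grid. It is only a model for intuition: how ComfyUI actually implements the setting is not shown in this transcript, and the function name and the 5x5 example mask are made up.

```python
def grow_mask(mask, pixels):
    """Dilate a binary mask: a cell becomes 1 if any masked cell lies within
    `pixels` steps of it horizontally and vertically (square neighbourhood)."""
    height, width = len(mask), len(mask[0])
    grown = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark every cell in the square around this masked cell, clipped to the image.
                for ny in range(max(0, y - pixels), min(height, y + pixels + 1)):
                    for nx in range(max(0, x - pixels), min(width, x + pixels + 1)):
                        grown[ny][nx] = 1
    return grown


# A single masked pixel in the middle of a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in grow_mask(mask, 1):
    print(row)
```

Growing the mask gives the sampler more surrounding area to blend into, which is also why a large value combined with a high denoise strength can change the apparent size of the face.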
00:33:31 We will see the generated image here in a moment.
00:33:34 Okay, with inpainting it is currently running at 1.6 it per second.
00:33:38 I think Auto1111 will be much faster than this.
00:33:42 And we have the inpainted image.
00:33:44 It is named base output.
00:33:46 Let's save the image.
00:33:47 Okay, it is saved.
00:33:49 And here, this was the original image.
00:33:52 This is the inpainted image.
00:33:54 Of course, we need to do more inpainting to get a better result.
00:33:57 But I already see some serious improvements.
00:34:01 Maybe the.
00:34:02 Yeah, since we have made the mask bigger, I think it increased the face size, so it may
00:34:09 not look very natural.
00:34:11 But this is the strategy for obtaining very high quality images with SDXL inpainting.
00:34:18 So what if you don't have a 24 GB GPU?
00:34:21 Say you have 8 GB.
00:34:23 You can do the training on RunPod.
00:34:25 I already have a tutorial on how to install and use it on RunPod.
00:34:29 I already have an automatic installer script for RunPod.
00:34:33 Do the training on RunPod.
00:34:36 Everything is the same.
00:34:37 Watch this video.
00:34:38 You will learn.
00:34:39 Then download the generated checkpoints and use your own computer.
00:34:42 What if you don't even have a GPU? You can use RunPod both for ComfyUI and for training.
00:34:51 Also, in the tutorial I referred to, I show how to use ComfyUI on a free Google Colab account as
00:34:56 well.
00:34:57 So it will be very easy for you to use.
00:35:00 Just watch the other tutorial that will be linked here when you are watching and then
00:35:05 watch this tutorial and it will help you tremendously.
00:35:07 So we got amazing realistic quality.
00:35:11 But what about styling?
00:35:15 Realism and styling are actually conflicting, because for realism you need to have more
00:35:19 details.
00:35:20 That means the model is less generalized.
00:35:24 However, let's try this.
00:35:25 Portrait photo of ohwx man in gta 5 style.
00:35:30 Let's queue prompt.
00:35:31 Let's drag and drop the image output here so we can test it.
00:35:35 Currently it is going to generate 2 images.
00:35:37 We can see the progress here.
00:35:39 It is still loading.
00:35:40 Yes, the model is loaded.
00:35:42 It is generating 2 images at 1.15 seconds per it.
00:35:46 By the way, these are not the best it per second numbers because I am recording video.
00:35:50 Moreover, I think there will be more optimizations soon.
00:35:54 Okay, we got the result.
00:35:57 You see, it is still not stylized enough.
00:36:00 So what you need to do is reduce the strength of the ohwx token.
00:36:06 So let's try like this with 90 percent strength instead of 100 percent for this ohwx token.
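The exact prompt text typed on screen is not captured in this transcript. Assuming the token strength was lowered with the `(token:weight)` prompt weighting syntax, the change from 100 percent to 90 percent would look like the output of this small sketch. The helper name is made up for illustration.

```python
def weight_token(prompt, token, weight):
    """Wrap the first occurrence of `token` in the prompt as (token:weight)."""
    if token not in prompt:
        raise ValueError(f"{token!r} not found in prompt")
    return prompt.replace(token, f"({token}:{weight})", 1)


weighted = weight_token("portrait photo of ohwx man in gta 5 style", "ohwx", 0.9)
print(weighted)  # portrait photo of (ohwx:0.9) man in gta 5 style
```

Lowering the weight step by step (0.9, then 0.8 and so on) trades likeness for style, so it is worth stopping at the highest value that already gives the stylized look.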
00:36:13 And here are the results.
00:36:14 We can see that it is starting to look more like GTA 5, as you are seeing right now.
00:36:21 When I double click on the images it shows them.
00:36:25 You see it is becoming more like GTA5 style.
00:36:27 We can add more prompt terms like game character, like this, perhaps digital drawing, digital
00:36:35 artwork.
00:36:37 You see, when you add more prompt terms it will become more like that.
00:36:41 Let's try like this, but certainly there is some GTA 5 stylization.
00:36:46 Okay, now we are more like a game character.
00:36:49 A digital drawing.
00:36:51 You see.
00:36:52 Let's reduce the strength of the ohwx token and try one more time.
00:36:57 But certainly we are getting closer to a game character style as you are seeing right now.
00:37:03 It is still perfectly keeping my face.
00:37:06 This is a thing that I have discovered while doing experimentation.
00:37:10 There will be a whole new area of SDXL prompting because it is different from the SD 1.5 version.
00:37:17 That is for sure.
00:37:19 We need to figure out how to do prompting.
00:37:22 And here now we are seeing an almost complete GTA 5 character based on my picture.
00:37:29 This is another one.
00:37:30 This one is not as stylized as that, but this one certainly is.
00:37:34 So this is the way of getting stylized images.
00:37:38 If your image is too realistic, reduce the weight here. Hopefully I will make a new
00:37:43 tutorial on how to get amazing stylized images.
00:37:47 It requires a different workflow, that is for sure.
00:37:51 I have got some other previous testing as well.
00:37:54 You see this is another stylized image.
00:37:56 Let's load also this one to see the prompt.
00:37:59 Photo of ohwx man as a pixar character.
00:38:02 This was using my previous safetensors file with strength of the LoRA model reduced.
00:38:09 So I will copy this prompt.
00:38:12 So I will open another tab here.
00:38:15 Okay, load the latest generated image, copy the prompt like this, copy the negatives like
00:38:21 this and try again with our best checkpoint.
00:38:24 Then I will reduce the token strength to see.
00:38:28 Okay, here we got the result.
00:38:30 Now let's reduce the weight of the token like this and try again.
00:38:35 Okay, now we got a very good stylized image with 90 percent strength.
00:38:40 You see, it is completely stylized as a Pixar version of me, and this is not cherry-picked.
00:38:46 This is the first try I made, so this is amazing.
00:38:50 I added these prompts to our readme file as well.
00:38:53 So as I said, this readme file will be your number one source for this tutorial.
00:38:58 You see the prompts are here.
00:39:00 This is all for today.
00:39:02 I hope you have enjoyed it.
00:39:03 Please watch the other video that you will see here; it explains how to use
00:39:09 ComfyUI.
00:39:10 It will be tremendously important for following this tutorial.
00:39:14 Please also join me on Youtube and support me on Youtube.
00:39:18 I would appreciate it very much.
00:39:20 Please subscribe, like, leave a comment.
00:39:22 Leave a comment and tell me the prompts that you have discovered.
00:39:26 The prompting ideas.
00:39:27 I will also add them to the readme file.
00:39:29 It will be very useful for others.
00:39:31 Please also support me on Patreon.
00:39:33 Your Patreon support is extremely important for me.
00:39:38 Without your support I won't be able to continue producing these high quality tutorials and resources
00:39:44 for you,
00:39:46 because my YouTube views are not good.
00:39:48 So I need your help with this one.
00:39:51 Without YouTube views I am generating very little revenue, so your Patreon support
00:39:55 is what keeps me going.
00:39:56 Hopefully see you in another amazing tutorial.