Epic Web UI DreamBooth Update - New Best Settings - 10 Stable Diffusion Training Compared on RunPods #284
FurkanGozukara announced in Tutorials
Full tutorial: https://www.youtube.com/watch?v=sRdtVanSRl4
RunPod: https://bit.ly/RunPodIO. Discord: https://bit.ly/SECoursesDiscord. Web UI DreamBooth got an epic update and we tested all the new features to find the best settings (10 experiments). If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on 🥰 https://www.patreon.com/SECourses
Playlist of #StableDiffusion Tutorials, #Automatic1111 and Google Colab Guides, #DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img:
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
The experiments are done on RunPod, but if you have a Windows PC with 20 GB of VRAM, you can do exactly the same things. All you need to do is install the latest Automatic1111 and the DreamBooth extension.
Where you can find the executed commands, prompts, and more info:
https://gist.github.com/FurkanGozukara/ad1aa55e64576d49f829450278a15885
2400 Photo Of Man classification images:
https://drive.google.com/file/d/1qBf8VyUbmPNalKqm076yOsQjE8BrcG7R/view
DreamBooth extension repo link:
https://github.com/d8ahazard/sd_dreambooth_extension
Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer:
https://youtu.be/AZg6vzWHOTA
How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3:
https://youtu.be/aAyvsX-EpG4
Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed:
https://youtu.be/Bdl-jWR3Ukc
How To Do Stable Diffusion Textual Inversion (TI) / Text Embeddings By Automatic1111 Web UI Tutorial:
https://youtu.be/dNOpWt-epdQ
Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI:
https://youtu.be/vhqqmkTBMlU
Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI:
https://youtu.be/QN1vdGhjcRc
Transform Your Selfie into a Stunning AI Avatar with Stable Diffusion - Better than Lensa for Free:
https://youtu.be/mnCY8uM7E50
Stable Diffusion Google Colab, Continue, Directory, Transfer, Clone, Custom Models, CKPT SafeTensors:
https://youtu.be/kIyqAdd_i10
Fantastic New ControlNet OpenPose Editor Extension & Image Mixing - Stable Diffusion Web UI Tutorial:
https://youtu.be/iFRdrRyAQdQ
00:00:00 Introduction to DreamBooth new update best settings experiments
00:00:49 How to setup a new Pod and install DreamBooth newest update properly
00:02:25 New RunPod started time to setup and install DreamBooth
00:03:20 Install DreamBooth extension manually and fix errors
00:07:15 Starting first experiment test0 - setup of settings
00:07:35 Best DreamBooth settings for GPUs with 12GB VRAM
00:11:05 Setting and starting second experiment test1 DEIS Noise Scheduler
00:11:55 Test2 Unfreeze Model
00:12:12 Test3 Lion Optimizer
00:12:40 Test4 Stable Diffusion Offset Noise
00:13:17 Test5 Freeze Clip Normalization Layers DreamBooth
00:13:33 Test6 Use EMA + Use EMA for Prediction
00:14:37 Test7 Use EMA + Use EMA Weights for Inference
00:14:55 Test8 Use EMA only
00:15:05 Solution for the "configuration index out of range" Web UI error
00:15:18 Test9 Don't use xformers - default memory attention and fp16
00:15:49 How to add more disk space to your existing RunPod
00:17:08 xformers related bug error
00:18:20 How to continue DreamBooth training if an error occurs or it halts for any reason
00:18:49 All tests have been completed, time to check their training samples
00:18:58 Test0 training samples - previously known best settings for 12GB VRAM
00:19:25 Test1 training samples - DEIS Noise Scheduler
00:19:44 Test2 training samples
00:20:20 Test3 training samples
00:21:13 Test4 training samples
00:21:38 Test5 training samples
00:22:15 Test6 training samples
00:23:27 Test7 training samples
00:24:19 Test8 training samples
00:24:54 Test9 training samples
00:25:36 Finding a good seed to compare all checkpoints within each trained model
00:26:46 What seed and prompt I used to compare checkpoints
00:27:50 What is the logic of starting seed when using batch image generation
00:28:14 How to use x/y/z plot to compare checkpoints to find best trained model
00:29:10 Where to find x/y/z plot generated grid image file
00:29:40 Comparing generated grid files of all experiments
00:29:44 Test0 Checkpoints Grid
00:31:26 Test1 Checkpoints Grid DEIS Noise Scheduler
00:32:50 Test2 Checkpoints Grid
00:34:48 Test3 Checkpoints Grid
00:38:19 Test4 Checkpoints Grid
00:40:30 Test5 Checkpoints Grid
00:41:25 Test6 Checkpoints Grid
00:43:14 Test7 Checkpoints Grid
00:44:00 Explanation of overtraining by a comparison
00:45:27 Test8 Checkpoints Grid
00:46:59 Test9 Checkpoints Grid
00:48:30 How to download all decided best checkpoints via runpodctl
00:49:09 What RunPod's connect to web terminal is and how to use it when the Jupyter connection is not available
00:49:56 Where to put downloaded safetensors model files
00:50:10 Using x/y/z plot to conduct the final comparison experiment on my local web UI
00:50:40 Test7 model file size is smaller than others
00:51:00 Each model's file size
00:51:05 Comparing all of the experiments test0 vs test1 vs test2 ...
01:00:20 Ending Speech
Video Transcription
00:00:00 Greetings everyone. DreamBooth extension of Automatic1111 Web UI got a major update
00:00:04 very recently. Interface has been changed and many new features are added. In this video,
00:00:09 I am going to test all these new features on RunPod io pods simultaneously. Moreover,
00:00:14 I will show you how you can properly install and run newest DreamBooth version on the pods. All
00:00:20 of the features of DreamBooth I am going to explain in this video applies to the
00:00:24 Automatic1111 Web UI PC version as well. We are going to conduct 10 different experiments on 10
00:00:30 different pods as displayed here to find out which features improve our training quality.
00:00:36 My pods are ready for testing. As you can see here, I have literally spent like 5
00:00:41 hours yesterday to prepare these pods, solve the new problems that I have encountered. But first,
00:00:48 let's begin with setting up a new pod, showing you how you can properly install DreamBooth extension.
00:00:54 I am picking my new pod with over 60GB RAM. This is important when you are doing DreamBooth
00:01:02 training. I am selecting the Stable Diffusion 1.5 version template. I am selecting 5GB temporary
00:01:09 disk, 150GB persistent volume. This is up to you. You can increase this persistent volume
00:01:14 or temporary disk later time. However, you can't decrease persistent volume later. So be careful
00:01:19 with that. So the things we are going to do on the RunPod Stable Diffusion 1.5 template apply to the 2.1
00:01:25 version template as well. If you don't know what Stable Diffusion is, how to use Stable Diffusion,
00:01:30 what is Automatic1111 web UI, how to install it on your computer, I have excellent video
00:01:35 tutorial series as you can see right now. So in the first two video, I am showing how you
00:01:40 can use DreamBooth on Google Colab for free. In these two videos, I am showing how you can
00:01:46 install and run Automatic1111 web UI on your PC. In this video, I am explaining what is DreamBooth,
00:01:52 how does it work. So this is a major video that you should watch for learning DreamBooth training.
00:01:57 This is my another major video that you should watch to learn textual inversion. You can look all
00:02:02 other videos in here. I have videos for ControlNet and I have ultimate tutorial for RunPod IO. This
00:02:09 is really important if you don't know how to use RunPod. And this is my another very recent video
00:02:13 about DreamBooth training on RunPod. But now today we are going to make another one because there is
00:02:20 a major update of DreamBooth. So our new pod is started. Let's connect with connect to Jupyter
00:02:25 Lab. When you install a fresh RunPod, it already starts an instance of Automatic1111 web UI. You
00:02:32 can see the instance started here. However, this instance command line interface, the terminal you
00:02:38 can't see. So therefore, we are going to make two things. The first thing we are going to make is
00:02:43 preventing the relauncher from continuously relaunching, so that when we kill that initially started instance,
00:02:49 it won't relaunch again. All we need to do is change this forever-running while loop to
00:02:55 (n < 1). So after trying to relaunch once, it won't relaunch again. The second thing we
00:03:00 need to do is open web UI dash user.sh file and add dash dash share command. So with this way,
00:03:07 we will be able to access our Automatic1111 web UI instance from a Gradio link on our
00:03:13 browser like we are using it on our computer. No difference. Now I will restart pod for these to
00:03:18 be effective. To restart the pod, click here and click restart pod. Pod has been restarted. We are
00:03:24 going to install DreamBooth extension manually, not through the web UI interface. To do that,
00:03:29 go to the extensions folder and in here open a new launcher, new terminal from here,
00:03:34 copy the URL of the DreamBooth extension, type here git clone and paste the URL. It will clone
00:03:40 the DreamBooth extension. If you already have the extension, enter inside the extension folder,
00:03:45 open a new launcher here, type git pull command like this. It will pull the latest version for you
00:03:51 git pull. Once you have done this, we are going to install the requirements manually. For installing
00:03:57 requirements. We are going to run this command copy, paste it. So the command is pip install
00:04:03 dash dash upgrade dash dash no deps dash dash force reinstall dash R requirements txt. It will
00:04:09 install all of the requirements for DreamBooth extension. This is very convenient because it
00:04:14 is going to also install the newest xformers version so it will work out of box. You see,
00:04:19 it is attempting to uninstall Xformers then it is going to install the latest version of xformers
00:04:24 and in the end you will see successfully installed Accelerate requirements version and other things
00:04:30 as you can see. Are we done? No, unfortunately not. So I am opening requirements txt here.
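For reference, here is a minimal sketch of the manual install described above, assuming the web UI lives at /workspace/stable-diffusion-webui (the path and template layout are assumptions, not stated in the video; the repo URL and pip flags are taken from the links and commands given in this post):

```bash
# Assumed web UI location on the RunPod template (adjust to your pod)
cd /workspace/stable-diffusion-webui/extensions

# Clone the DreamBooth extension (or run `git pull` inside the folder if it already exists)
git clone https://github.com/d8ahazard/sd_dreambooth_extension

# Install the extension requirements exactly as dictated in the video
cd sd_dreambooth_extension
pip install --upgrade --no-deps --force-reinstall -r requirements.txt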
00:04:35 Then I will open the requirements txt of the Automatic1111 web UI as well requirements versions
00:04:42 actually. So there are some conflicting versions of Automatic1111 web UI and the DreamBooth.
00:04:48 The very important one is Accelerate. You see DreamBooth is using 0.16, however Automatic1111 is
00:04:54 currently using 0.12. Therefore, you need to change the conflicting versions of the Automatic1111 web UI
00:05:02 requirements to the same level as the DreamBooth. To do this, you can manually check each one of
00:05:07 the requirements in each file and compare them whether they are existing and conflicting or not.
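A quick way to spot the conflicting pins the video talks about is to print the relevant lines from both requirements files side by side. This one-liner is only an illustrative helper, not something shown in the video, and the file paths are assumptions based on the default layout:

```bash
# Compare the version pins the video says tend to conflict
# (accelerate, transformers, GitPython) across both requirements files
grep -iE 'accelerate|transformers|gitpython' \
  /workspace/stable-diffusion-webui/requirements_versions.txt \
  /workspace/stable-diffusion-webui/extensions/sd_dreambooth_extension/requirements.txt
```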
00:05:12 Okay I see git python 3.1.27 and yes, this is a newer version in DreamBooth. So I will update the
00:05:20 version in Automatic1111 web UI requirements. Why I am doing that because when you launch
00:05:25 web UI it is overriding back to the lower version of requirement and that causes your DreamBooth to
00:05:32 fail. Okay I see also transformers is conflicting. So I am updating it and it is all done. So we
00:05:37 have changed it Accelerate, git python and the transformers version. That's it. Now we are ready
00:05:43 to relaunch our web UI instance. To do that first we are going to kill the instance that is already
00:05:49 running, which starts when you first run the pod, with this command: fuser -k, then the port of the
00:05:56 instance which is 3000 by default and slash tcp. You see it is killed I am seeing the message here.
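As a sketch, the kill-and-relaunch step looks roughly like this (port 3000 is the default mentioned in the video; the web UI path is an assumption):

```bash
# Kill the web UI instance that the pod auto-started on port 3000
fuser -k 3000/tcp

# Relaunch it manually so the freshly installed DreamBooth extension is picked up
# (run from the web UI folder; exact path is an assumption)
cd /workspace/stable-diffusion-webui
python relauncher.py
```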
00:06:03 Then you need to run relauncher.py with python relauncher.py it will run and we will have our
00:06:10 fully working DreamBooth extension. Okay after web UI if you get this error, this is caused by
00:06:16 the Tensorflow installation. All you need to do is running this command. After you run this command,
00:06:22 it will work. Okay there is one final thing that I had to do. You see if you get this error. After
00:06:29 all this, let me show you the error that cannot add middleware after an application has started,
00:06:35 you need to run one additional command which is pip install dash dash force fast api equal equal
00:06:42 0.90.1. So unfortunately it is installing the latest version on a new RunPod as 0.92.0 however,
00:06:52 this is not working so we need to downgrade it and we are using this command to downgrade it.
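The FastAPI downgrade described above would look like the following; the spoken "dash dash force" is interpreted here as pip's --force-reinstall flag, which is my assumption:

```bash
# Pin FastAPI back to a working version to fix the
# "cannot add middleware after an application has started" error
pip install --force-reinstall fastapi==0.90.1
```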
00:06:57 Additionally, if you encounter any problem while launching it, join our discord channel. The link
00:07:02 will be in description and also in the comments and ask me your encountered problem. Let me show
00:07:08 you the freshly installed DreamBooth so you see, we have the DreamBooth tab with all of
00:07:13 the necessary versions and it is ready to work. So i'm going to start composing my first test. When
00:07:19 composing a new model, make sure that you have selected your source checkpoint, give it a name,
00:07:25 and then create model. Base test test0 is the same settings as previously known as the best settings.
00:07:32 Okay, now I will show you the best settings I am using. I mean as a base settings that were best
00:07:38 previously. So none of these are checked training steps per image epochs: this is up to you. I am
00:07:44 training up 200 epochs in this experiment. I will save every 10 epochs a model checkpoint. I will
00:07:50 save preview images every five epochs. Batch size and gradient accumulation steps are one. These are
00:07:56 working best for faces. Class batch size is one. Because i'm not going to generate classification
00:08:00 images. Why? Because I have pre-generated classification images, these are uploaded to a
00:08:06 google drive, so you will be also able to download and use them. The link will be in the description.
00:08:10 I'm not checking gradient checkpointing. The learning rate I am using is this one 0.000002. The
00:08:18 resolution is 512 because we are using 1.5 pruned ckpt. The sanity prompt is photo of ohwx man by
00:08:25 tomer hanuka. I'm not changing the sanity sample. It will be same in all of the RunPod instances in
00:08:31 all of the trainings. These are not set because we don't need them. I am using the 8Bit Adam optimizer.
00:08:36 I am not using EMA because this increases VRAM usage a lot. If you have a GPU with 12 gigabytes
00:08:42 of VRAM, you won't be able to run with the same settings. Mixed precision bf16, memory attention
00:08:47 xformers. If your card does not support bf16, if it is an old card, then you need to select fp16, so
00:08:54 it depends on your gpu model. Cache latents, train UNET, step ratio of text encoding is 75 percent,
00:09:01 offset noise is zero, freeze clip normalization layers is unchecked. Clip skip is one AdamW
00:09:07 Weight decay is one percent and the other settings are same default. In the concepts
00:09:13 we are setting our data set directory as training images and classification images.
00:09:18 This is my training images data set as usual. No same backgrounds, no same clothings, different
00:09:24 shots of the face from different distances. I'm not using filewords because when teaching faces,
00:09:31 I find no need to use filewords / file captions, so my instance prompt is ohwx man.
00:09:37 Ohwx is my rare token. Man is my class token. This will be used when training on the training images.
00:09:43 The class prompt is photo of man. This will be used when it is training for regularization to keep
00:09:48 the model sanity and prevent over training. So when you generate your classification images,
00:09:53 you should use same prompt that you have typed here. Sample image prompt is photo of ohwx
00:09:59 man. Okay, now in this part we are going to use 50 classification images per training instance, and the
00:10:05 classification scale is seven. These matter if you generate these new classification images through
00:10:10 the DreamBooth extension, so you should also set these values. I'm going to generate two samples
00:10:16 during training. The sample seed will be minus one okay, and these are the other settings. Okay. In
00:10:22 saving tab, we are going to generate a ckpt file when saving during training. So we did set save
00:10:30 every 10 epochs. So every 10 epochs it will generate a ckpt file. Make sure that you have a sufficient
00:10:35 amount of hard drive space; this will generate 20 checkpoints and it will take roughly 100 gigabytes of
00:10:43 disk space. And in the testing tab, I am going to pick deterministic so it should be the same in all of
00:10:50 the training experimentation. This is supposed to make your training reproducible. Save settings,
00:10:57 hit train, and the training for first experiment, the base experiment, the test zero is started. Now
00:11:04 time to start other tests. Okay, when generating test one, I am going to show you a neat trick I
00:11:10 am using. I am first selecting test zero. This test zero settings exist in this machine as well
00:11:15 because I have cloned it. Load settings then go to the create tab. Type test1. Select the pruned ckpt
00:11:21 file, hit create model. Now test one is selected here and the settings are loaded, so they didn't
00:11:28 change when a new model is generated, so I won't need to set all of the parameters again
00:11:34 and again. The only different parameter for test one is use DEIS for noise scheduler so it is an
00:11:40 option a new option here. It changes the noise scheduler and I am setting deterministic save
00:11:46 settings. You can also click load settings to be sure, then click train and it will start training.
00:11:51 So test one has been started now time to go for test two. The only different thing in test two is
00:11:57 unfreeze model and so we are going to test it. Create model with unfreeze model option okay,
00:12:03 save settings load settings. Test two is selected, make sure that you see hit train and test two is
00:12:10 started okay. Test three is Lion optimizer which is a new optimizer we select in the advanced tab
00:12:17 which is here. Instead of optimizer 8Bit Adam we are going to select Lion. When we use Lion
00:12:23 optimizer, it is suggested to use 10% of the original learning rate, so I will set it as
00:12:30 that. So I am changing my learning rate to this one. You see 0.0000002 hit save settings and hit
00:12:39 train. So the test three has been started. Okay for the test four, we are going to use the new
00:12:45 feature Offset Noise. If you wonder what is Offset Noise, it is supposed to improve generating dark
00:12:51 or light images easily. So which value of Offset Noise we should use? Recommended value is between
00:12:57 five percent and ten percent. So I will pick this seven percent and this setting is available in the
00:13:04 advanced tab. In the bottom, you will see Offset Noise. So let's set it as seven percent like this
00:13:10 okay and hit save settings, load settings, make sure that the correct model is selected,
00:13:16 hit train and it is started. Okay in the test five, we are going to use Freeze Clip
00:13:22 Normalization Layers which is an another option in the advanced tab. So i'm setting it, hit save,
00:13:28 load settings and hit train. So the test five has already started. Okay, now time to do the
00:13:34 tests that require GPUs with more than 12 gigabytes of VRAM. So the first test we are going to
00:13:41 do is EMA plus EMA for prediction. If you have a GPU with only 12 gigabytes, unfortunately,
00:13:48 you won't be able to run this test. For this test to work first we need to enable EMA during
00:13:53 training from advanced tab. Use EMA. Then we need to pick use EMA for prediction, save settings
00:14:00 and the training has been started. Unfortunately, the Gradio interface is frozen. This may happen
00:14:05 sometimes, but still you can see the process in your terminal window as you can see right now.
00:14:12 My test seven RunPod currently does not have any GPU available. Therefore, I will run test seven
00:14:20 on my test zero RunPod. So you may encounter this problem. When you encounter this problem,
00:14:26 you can still start your RunPod, get your files, download them. However, you have to wait until a
00:14:33 GPU gets free on this RunPod, to use GPU on that RunPod. Okay for test seven, we are going to use
00:14:40 EMA from here and then we are going to enable use EMA weights for inference in the saving tab, save
00:14:48 settings, load settings, and hit train. So the test seven is also started. Now time to do test
00:14:55 eight. In the test eight only difference is use EMA so let's set it and start the training. Use
00:15:01 EMA here. Save settings, load settings, and hit train. Okay, I have encountered the configuration
00:15:08 error. This may happen sometimes. Just reload the web UI, refresh model list, select your model,
00:15:15 load settings. You may want to quickly verify your settings. Hit train and now it is started
00:15:21 you see. Okay for the test nine we are not going to use xformers. To be sure that it is not used
00:15:28 I will disable xformers command from the command line arguments from here webui-user.sh file: once
00:15:38 you do this, you need to restart the webui obviously. Okay, looks like the test nine Pod also doesn't have any
00:15:45 GPU at the moment, so I will run this test nine on test one Pod. Because this has been completed,
00:15:51 I will add more disk space to it and use that. Okay, now time to do test nine. In the test nine,
00:15:58 we are not going to use xformers. I have started the webui instance without xformers, so now in
00:16:04 the advanced tab, I will select memory attention default mixed precision fp16 deterministic save
00:16:11 settings, load settings, make sure that settings are correct and hit train. You can also see that
00:16:18 no module xformers proceeding without it in this instance because the xformers is removed from the
00:16:25 command line arguments. Okay, now I noticed something else. When you use EMA + EMA for
00:16:32 prediction, it uses nine gigabyte VRAM. However, when you use EMA plus EMA weights for inference,
00:16:39 it uses 12.5 gigabyte VRAM. And when you use only EMA, it uses 12.5 gigabyte VRAM. And when we don't
00:16:48 use xformers and when we use fp16, it is using 17.4 gigabyte VRAM and we are not even using EMA.
00:16:57 You see, EMA is false in these ones EMA is true. EMA is true. EMA is true because this is test six,
00:17:04 test seven, test eight and now we are running test nine. Okay, looks like for test nine to run,
00:17:12 we have to enable xformers, but we will not pick xformers in the settings because it is not working
00:17:19 when we run it without xformers. Yeah, this is definitely an error probably in the programming,
00:17:25 so it is trying to use xformers somewhere even if we don't pick it. Yeah. Okay,
00:17:31 so I am restarting the test nine this time. Xformers flag is given, but I am just selecting
00:17:37 default fp16 so let's see if it will work this time or not. Okay, load settings hit train. Okay,
00:17:43 it is using 17.4 gigabytes VRAM again. So since we didn't enable xformers, but I hope we don't
00:17:51 get an error this time. Okay, looks like the error was caused when it was generating samples.
00:17:57 Probably the error cause was it was expecting to use xformers during generating image, but it was
00:18:04 not enabled and you see after generating samples the VRAM usage also dropped. This is really really
00:18:09 interesting. There is definitely some bug in the script, but it is working so far. Good. In
00:18:15 the experiment six, I have encountered errors two times. For example, the last encountered error
00:18:22 happened at step 4320. So basically I am going to continue training with the remaining epochs.
00:18:31 How many epochs I need more? I need 20 epochs because our target epoch is 200 and the model
00:18:37 epoch is currently 180. Therefore, set the epoch count as 20. Save settings, load settings, and
00:18:45 hit train and then you can continue training if you encounter such errors during training. Okay,
00:18:50 all tests have been completed. I have downloaded the samples of all tests. Let's take a look at
00:18:56 the samples. This is the base test. Test zero. My face becomes like me after like 480 steps. So
00:19:04 let's take notes. Test zero. Okay, 480 steps beginning we can say then when it loses its
00:19:12 styling ability is okay i'm checking. Yeah, it's still able to keep up. Okay, after 2040 steps and
00:19:21 then it becomes you see really really bad as you can see. Okay, let's take a look at the test one.
00:19:28 So in the test one the differences we are using DEIS noise scheduler. The face becomes like me
00:19:35 after again 480 steps. And when does it loses its styling capability is i'm checking. I think
00:19:44 it loses its capability same yeah 2040. Okay, this is test two in the test two yeah 480 steps. Oh I
00:19:52 see a nice one here. Okay and when does it loses its styling capability is okay i am checking that.
00:20:00 So like 2040 steps again. So nothing different for test two as well. Which is unfreeze model. By the
00:20:08 way we will compare how well they are performing when we use a certain prompt so do not worry about
00:20:15 it. I have a systematic for that I will explain. So let's check out the test three. Okay in the
00:20:21 test three we start to see some difference than the previous ones. We can say the face starts to
00:20:28 appear after yeah yeah let's say okay yeah none of them is really looking good. So maybe maybe I will
00:20:37 say this one 1080 steps and when does it loses its styling capability is i'm checking. Wow it
00:20:45 is really able to keep styling pretty decent if you ask me. So when does the test three lost its
00:20:53 styling capability is yeah right after the best one 3120 steps. Okay so in the test three we were
00:21:02 using Lion optimizer. Looks like Lion optimizer is yeah looks like better than the previous one
00:21:08 actually. So the Lion optimizer made the biggest difference among all other settings so far. Okay,
00:21:15 now let's look at the test four. In the test four, we see that styling capability of the model is
00:21:22 lost pretty quickly just after 600 steps and and the face also appeared pretty quickly. We can say
00:21:31 that after 600 steps, so let's take note of them. Okay now let's see the test five results. In the
00:21:39 test five I see that the face appeared in the 480 steps and the styling capability is lost after
00:21:48 2040 steps so not much looking like different from the first ones. In the test5 we used freeze clip
00:21:57 normalization layers, but there is a styled image here which looks like pretty decent. So we will
00:22:05 see the results of these when we are doing full comparison with x/y plot and a nice prompt. So I
00:22:12 have taken note of the test five as well. Let's move to the test six. Okay, in the test six,
00:22:18 we see that the face somewhat appeared after only 480 steps, but none of it looking good
00:22:26 actually. So the face is looking weird than the original training data set. Unlike the others,
00:22:32 the styling capability of the model is never lost. This is pretty significant. It is always
00:22:39 able to stylize the model, stylize the output you see never lost. So this is a very significant
00:22:48 difference than others. But the face is not looking good, so we will see how it will behave
00:22:53 when we are doing comparison. I am wondering that as well. In the test six, you see we are using EMA
00:23:00 plus EMA for prediction. So this is the first test I have used EMA as an experiment and the results
00:23:07 are significantly different than others. Actually for styling, I will say 4320, after that not much
00:23:15 styled. This is just looking as black and white. So the result is for test six is 400 steps first
00:23:21 face and 4320 steps loses styling ability. Now let's look at the test seven. In the test seven,
00:23:28 we see that the face appeared almost as soon as 240 steps or I will take 480. Yeah, this
00:23:37 looks bad. Yeah, this looks better. 480 steps and the styling capability is quickly lost. Actually,
00:23:45 after 840 steps after that, I don't see the styling and then it quickly becomes pretty
00:23:53 over trained. So this became over trained pretty quickly. So the test seven is 480 steps first face
00:24:00 and 840 steps loses styling ability. In the test seven, we used EMA + EMA weights for inference.
00:24:08 As you can see in both cases, the model trained pretty quickly, but this one has much better faces
00:24:15 than EMA for prediction and the test seven. Okay, in the test seven, we see that the first face 360.
00:24:23 Yeah. And then the styling ability is lost after 840. Yeah, this is like the previous one pretty
00:24:32 much over trained quickly. So for test 8, 360 first face and 840 steps loses styling ability.
00:24:41 So in the test eight we only used EMA weights. The face is looking decent but it became over trained
00:24:49 pretty quickly so we will see how it will perform in the end. And for the test nine. In this test
00:24:55 we used the best settings of the default settings. But we didn't use xformers and instead of xformers
00:25:02 we used default memory attention. So let's see how it performed. This is supposed to work
00:25:08 better than the original because xformers is supposed to reduce your training quality. Okay,
00:25:13 I see the first face somewhat in 600 steps and the styling capability is kept until 1560 steps. The
00:25:22 results are looking decent. We will see in the end how it performs. I can say that without xformers
00:25:28 it is trained faster than with the xformers version. Okay, we have done all the comparison
00:25:34 of the samples. Now time to do final comparison. Okay now time to do our final tests. For these
00:25:43 first we are going to find a good seed. Then with this seed and the prompts we have prepared, we are
00:25:50 going to compare every checkpoint of every trained model. Test one test two, test three, test nine,
00:25:57 test zero. Then we will analyze these checkpoints of every test and get their best checkpoint and
00:26:06 we will compare all of the best checkpoints of all trained models and we will decide which one
00:26:12 is performing best. So I am going to find my good seed on the test zero, which is our base test and
00:26:21 I am going to use 50 epoch of the base test. Actually, how did I decide this 50 epoch to
00:26:28 test it to find the good seed? This is same as our previously known experiment 50 epoch and I decided
00:26:36 it by looking at the sample images. 50 epoch is looking like best 1200 steps. Okay, let's find the
00:26:44 good seed first. Okay, I think I got a decent seed to test that. I have shortened the prompt for both
00:26:51 negative and positive prompt because very long prompt drives the image away from yourself from
00:26:58 your subject. So making too long prompt is not very good in some cases. This applies for negative
00:27:05 prompt as well. I am going to share all of the prompts and other settings used in this experiment
00:27:11 on a gist post on my GitHub. This will be public and this will be in the description. So let me
00:27:19 show you the images. This is the first image. These are the original training images and this
00:27:24 is the first image. This is the second image. This is the third image. This is the fourth image. This
00:27:30 is the fifth image. This is the sixth image. This is the seventh image and this is the eighth image.
00:27:35 So let's test this setup on all of the training sets on all of the experiments and find out the
00:27:44 best checkpoint for each of the experiment. So for testing this, we are going to use: x/y plot. Open
00:27:51 the x/y plot in the below. But before doing that, verify that your seed is working as expected. You
00:27:57 need to use the seed of the first image displayed in here because when you generate images as batch,
00:28:03 it starts from your seed and increase it by one for other images. So if you use the second image
00:28:09 as a starting point, then the ending image will be different than this data set. Okay, in the x/y
00:28:15 plot, we are going to use checkpoint name. A box should appear here to fill all of the checkpoints.
00:28:23 If it is not appearing like in my case right now, you should refresh your web UI instance, which
00:28:29 I am going to do. Okay after I did refresh, it appeared. Click this. Select the checkpoints that
00:28:35 you want to test. You see now it is automatically generating safe tensors files instead of ckpt
00:28:41 files. This is the safe version of ckpt. This file cannot contain any harmful applications hidden
00:28:48 inside your model. Okay, all settings are set for test zero. I am also going to add a grid margin 20
00:28:56 pixel like this. Also, do not check keep minus one for seeds. Otherwise you won't be able to compare
00:29:03 same seed between different checkpoints and then hit generate. Okay, all tests have been completed.
00:29:09 Now I will show you where you can download the generated image. It is inside, outputs inside,
00:29:15 text to image grids. In, go to the latest folder and in here you will see a png file which will
00:29:23 be pretty big. For example, 72.9 megabytes. Right click download. So this is how you can
00:29:30 download the grid files. All grid files have been downloaded. Now we will compare them and see which
00:29:37 one of the checkpoint is performing best. So let's begin with the test: zero. Okay, in the test zero,
00:29:44 it is starting from epoch 10 and every 10 epoch is saved. So this is epoch 10. This is epoch 20.
00:29:52 Still not my face. Epoch 30 I see my face. I am starting to see my face. Epoch 40 50 and it goes
00:30:01 on 60 70, 80, 90, 100, 110 epochs okay, I see a very good clear image here. The stylizing and
00:30:13 the quality is still significantly good. 120 130, 140 the quality is still really good. It
00:30:22 looks like they have improved the DreamBooth extension. Okay, on 150 or even more. This
00:30:29 is okay. This is 180 you see stylizing is almost gone. Uh, it is just drawing out my image. Okay,
00:30:37 this is 190 epochs but the quality is still good. And this is the 200. Yes, we can see it is already
00:30:45 becoming over trained. So which epoch is best? It is hard to decide which epoch is looking best. It
00:30:53 is a subjective term of course, but from images. Yeah, I am having hard time to decide which epoch
00:31:00 is looking best. They are really all good quality, but this one is looking like the best one. So this
00:31:07 is epoch 140 epoch 130 is also pretty decent so it is hard to decide. As I said, but I will say 140
00:31:17 epoch. Test 0: 140 epoch is looking like the best, 3360 steps. Okay, now time to move to test1. Okay, so in
00:31:27 the test one, let's see even in the 10 epoch there is some similarity, but not much. In the 20 epoch,
00:31:34 there is certainly similarity of the face. In the 30 epoch it becomes more like my face and in 40
00:31:42 epoch it is even getting better. This is 50 epoch looking very good. 60 epoch looking very decent by
00:31:50 the way. If we remember test one was use DEIS for noise scheduler, we are seeing pretty good results
00:31:56 overall. So it is again hard to decide which one is looking best. But I can say that after
00:32:02 100 epochs it becomes more like over trained. I see that stylizing is kind of reduced or not
00:32:10 Maybe? Yeah, really really hard to decide. All of them is looking pretty good. Good quality. There
00:32:16 is no degradation in the faces. Pretty decent. Okay, so how are we going to decide which one of
00:32:24 the epoch is best for test1? It is really really good. Actually, it only becomes over trained in
00:32:30 the very end as you can see, so this is the 200 epoch, 190 epoch 180 170. Okay, so according to
00:32:41 me let's see which one is best. Really hard to decide. It is really hard to decide, but I think
00:32:47 I will go with 3120 steps which is 130 epoch looking best. Okay, now we are at the test two,
00:32:56 which is unfreeze model. Let's see if there is any significant difference between the base or not.
00:33:04 Okay, I see decent images, but let's see which one is best. It has added me a good mustache
00:33:10 actually. Even though none of my pictures have mustache, let me show you once again. So you see,
00:33:16 I have no mustache in any of them. But there's a good mustache here. Okay, so which one is best?
00:33:22 This is the million dollar question. Which one is looking best? wow, this is really cool one. Okay,
00:33:28 let's see. This is looking very very good. The images are looking very very good. I think
00:33:33 there is an overall improvement in the DreamBooth training when compared to previous versions. Okay,
00:33:39 this is 130 epoch, 140 epoch. Actually, this is 150 epoch. This is 160 epoch for test two 170 180
00:33:52 epoch 190 epoch. Each epoch is still looking good, producing different images and this is 200 epoch.
00:34:00 The results are really really good. So which one should we take? Yeah, I am having hard time once
00:34:07 again to decide because they are all looking good so it is not like one of them is best. So what
00:34:13 could you do in such situation? You could do batch processing and generate images on the different
00:34:20 epochs and pick the best ones you like. So you would use all of the epochs and you would get
00:34:27 the best images you want. You don't have to stick one epoch. You can choose the best working epochs
00:34:34 and generate images on all of them with xy plot. This strategy would work very well. For this video
00:34:40 I think I am going to take 100 epoch. Yes, I will take 100 epoch for test two. Okay, now test three
00:34:49 in the test three. We did Lion optimizer this is a new optimizer that has been just added and
00:34:56 let's see in the 10 epoch, not my face. 20, not my face. 30 somewhat similar 40 still, yeah, somewhat
00:35:05 similar 50 somewhat okay 60 epoch stylizing is much better. You see it is completely stylized.
00:35:12 It is following the prompt. Significantly better than others as you can see. Okay, we are almost at
00:35:19 100 epoch. But yeah, the pictures are really good. I think this optimizer is working better than the
00:35:26 other for stylizing, but it is up to you to pick the which one you want. Okay, this one is looking
00:35:31 pretty decent so it is hard to decide. This is 100 epoch. This is 110 epoch. This is 120 epoch. This
00:35:42 is 130 epoch. This is 140 epoch. This is 150 epoch. This optimizer is making a significant
00:35:50 difference in your training results. So this is 160 epoch. I see some over training perhaps. Yes,
00:35:58 okay, this is 180 epoch. And yeah, pretty decent. The results are pretty pretty decent. Okay,
00:36:06 hard to decide which one to take. This is 190 epoch. Wow. The quality is amazing. Okay, so which
00:36:14 one should we take? This is 190 epoch. Yeah, very good image there is. And this is 200 epoch. Okay,
00:36:22 wow. This is like taken from a movie. It looks like taken from an real movie actually. Okay,
00:36:29 so which one should we take and still not over trained I think. So with new Lion optimizer the
00:36:37 over training problem looks like in the past. I mean it is harder to over train now because I can
00:36:44 clearly see that it is much more able to follow my prompt. Why? Because in all pictures I have
00:36:51 armors. As you can see, this is a significant difference from the others so it is much more
00:36:57 able to follow our prompts. In all images I see that I have armor like I have typed.
00:37:05 So this was the prompt that I have used. Face of ohwx man wearing royal armor. You see with
00:37:12 Lion optimizer it is much more able to follow and obey this prompt because I see a royal armor is
00:37:21 worn in every image. This is not the case in other images. Of course the face similarity is somewhat
00:37:28 problematic, but it is following the prompt very well. This means that it is not over trained
00:37:33 with my subject. This is extremely good, extremely useful, and this is a new discovery for DreamBooth
00:37:42 training. Yes, I like it. This is pretty pretty good. So which epoch should we take? This is the
00:37:49 million dollar question. Yes and I can't decide it. All of them is looking good. All of them is
00:37:56 somewhat different. You see like these pictures are from taken real movie. As you can see this
00:38:02 is really interesting results. Okay, which one should we take? I think I will go with 150 epoch.
00:38:09 It looks like I was playing in a movie. They are looking really good but all of them is really
00:38:16 decent. So hard to decide. But for this one I will decide this. Okay so now time to check out test4.
00:38:22 Okay in the test4 we got significantly different results even though we were using the same seed.
00:38:28 This is really interesting. You see even though everything was same during the training test4 is
00:38:35 significantly different. Why? Because in test4 we have used offset noise and I can clearly
00:38:42 see that the images generated by it has much more significant black and whites. Yes, this looks like
00:38:51 the case. The overall images are like more black. So if you want to obtain more black or white
00:38:58 then you should use offset noise. You see there is a clear difference between the black background
00:39:04 and the white foreground. This is significant discovery. Yes, I can clearly see the lightning
00:39:12 it has added. You see the blackness it has added. So this is another significant discovery. Yes,
00:39:18 which image is looking best. However, the stylizing capability is not looking very well,
00:39:24 but the images are still pretty significant. So in the 200 steps, there is overtraining in the
00:39:32 200 epochs. In the 190 epochs. Yeah, results are not very good, but you can clearly see
00:39:37 the black and white differences like the article mentioned. Okay, so which epoch is best to choose?
00:39:44 This is the 150 epoch. It is looking decent. So if you want to generate black and white images,
00:39:50 this is really, really good. If you want to obtain contrast, this is really, really good. Okay, this
00:39:57 is 130 epochs, 120 epoch, 110 epoch, 100 epochs. This looks pretty decent. 190 epoch: this is also
00:40:09 looking pretty decent. So which one should we take and 100 and this is actually this 90 epoch. This
00:40:16 is 80 epoch. In the 80 epoch the face is not very much like me, so yes, this probably requires more
00:40:24 training. I think I will go with 100 epoch. This looks like the best one. Yeah, okay, now we are
00:40:31 seeing the test5 results. In the test five we have tested freeze clip normalization layers and let's
00:40:37 see if there is any significant difference in the test five. Let's look at them like this. When we
00:40:44 compare all the best checkpoints we will get a more clear idea between these minor differences.
00:40:52 So this is 100 epoch. This is 90 epoch. This is 110 epoch. This is also looking pretty decent one:
00:41:00 120 epoch 130 epoch 140 150, so 160. Okay, it looks like the quality is degrading after this.
00:41:12 Which one should we take for this one? Okay, I think I will take the same as the base one so we
00:41:19 can see if there is any difference. 140 epoch i'm going to take. Okay, now the results of EMA. There
00:41:26 is a significant discovery in this one as well. Let me explain to you, the face is not like me.
00:41:32 But the important thing is I think it is much more able to stylize. Let me show you what I mean. All
00:41:41 of the images are stylized, so this methodology perhaps could be used for teaching a style. This
00:41:49 is worth a try and you are learning this in this video on our channel. So please subscribe,
00:41:55 make a comment, share. And if you become a Patreon supporter of us, I would appreciate it very much.
00:42:00 Every image is stylized. It is perfectly able to keep styling so you see it is never becoming like
00:42:07 original subject. All images are stylized. Even the 200 epoch you see it is fully stylized. The
00:42:14 face is not like me, but it is able to follow style very well. The style is amazing. This is
00:42:21 able to keep the style perfectly fine. The face needs to be worked on definitely. Probably I
00:42:28 need to modify the prompt and maybe I can get the better face. But style is amazing. It is perfectly
00:42:35 able to keep style in every image. So this is a new discovery. I think this is worth to experiment
00:42:41 with and test out. Okay, for test six, I am going to use 100 epoch actually 100 110 120 130. I can't
00:42:54 decide but 120 epoch? yes. But for this one, we really need to test different seeds, different
00:43:03 prompting because the way prompts work on this one is significantly different than others. So
00:43:10 let's pick 120 model for this one. Okay, in the test seven, let's see the results: 10 epoch 20
00:43:18 epoch 30 epoch 40 epoch 50 epoch 60 still not much my face. 70 epoch yes, I see some similarity. 80
00:43:28 epoch 90 epoch 100 epoch I think this is also able to follow our prompts much better. 110 epoch 120
00:43:38 epoch yes, significantly different than others 130 140 so if you use these new things EMA for
00:43:47 prediction or EMA weights for inference, then you will need to change your prompting style compared to
00:43:53 before. It is still fully stylized. You see there is no overtraining. The images are not becoming
00:43:58 raw our subject even in the 200 epochs. Yes, even in the 200 epochs, it is not becoming our subject.
00:44:08 Let me show you what I mean by a comparison. So in the base model this is the 200 epoch you see
00:44:15 simply my face. Not much stylizing, but in this model it is completely stylized based on our
00:44:22 prompt. But the face is not much our face. So there is a significant difference between how
00:44:28 you prompt when you use EMA weights for inference or when you don't use EMA weights for inference.
00:44:34 Or if the same rule applies for when you use EMA for prediction. It completely changes the output
00:44:42 as you can see. So for this one which one should we take, there is not much difference between each
00:44:49 of the epochs actually. It is almost. Yeah, it is producing same. This is almost same, not the
00:44:56 same. You see there is a difference in here, so it changes, but it changes pretty insignificantly
00:45:02 between different epochs. I wonder if this method needs more training, perhaps more training
00:45:09 than 200 epochs is required. I see some similarity in this one. This epoch. Yes,
00:45:16 there are some similarities in this one as well. I think I am going to pick the 120 epoch. Yes,
00:45:22 so let's pick the 120 epoch for test 7. Okay, now we are seeing the test 8. In the test 8
00:45:29 we use EMA only and no other options. So for comparison I think I will pick the same epoch
00:45:37 of the base model which is by. But before let me show you each one of the epochs. So this is 200
00:45:43 epoch. This is 190 epoch. There is a significant difference between the base when you use EMA. So
00:45:49 if you don't have a graphics card with high VRAM like me, there is a significant difference that
00:45:56 we are missing perhaps. So 200 epoch 190 epoch 180 epoch 170 epoch not much changes when you use EMA.
00:46:06 This is something that I noticed. So which epoch is looking best? By the way, this image can be
00:46:12 fixed very easily with inpainting. Hopefully I am planning a video for that as well. So for distance
00:46:17 shots, don't worry that you can fix them. Okay, this one is looking pretty decent. 110 epochs this
00:46:24 is looking pretty decent. 120 also looking pretty decent. Maybe I should pick this one. 130 is also
00:46:32 pretty decent, so it is again hard to choose the best one. Okay, so which one should we take?
00:46:39 Perhaps I have saved too many checkpoints with 10 epochs. That is why we are having hard time.
00:46:44 Perhaps I should have made them 25 epochs. It would make the difference more significant between
00:46:51 different checkpoints. Okay, I think I am going to take 130 epoch. This looks like the best. Okay,
00:46:57 now test nine, let's see what difference xformers is making in test nine. We didn't
00:47:05 use xformers default memory attention and fp16 is used, so let's see what kind of difference we
00:47:12 have. By the way, this has significantly increased the VRAM usage. So you need 24 gigabyte to use
00:47:20 this. I think there is a significant difference between the base model and this one. Yeah, the
00:47:28 results are stunning. When we don't use xformers, I like them. So let's see which one is looking
00:47:34 best. So let's start from 200 epochs. So this is 200 epoch. Pretty much over trained you see, there
00:47:41 is no styling at all almost. 190 180, 170, 160, 150, 140, 130, 120 110. Yes, 120 is also still
00:47:58 not stylized outputting my face. But in 110 all images are stylized and this is 100 epoch. So 110
00:48:08 is looking like the best one. This is 90, this is 90. This is 80. This is 70. Yeah, I think I will
00:48:17 go with 100, 110 epoch or 100. I can't decide. Yes, I think I will go with 110 epoch for test nine.
00:48:26 So now I will download all of these checkpoints and I will make the final grid file. Currently I
00:48:35 am downloading four of the model files by using runpodctl. Let me show you, I am able to get 100
00:48:42 megabits download speed by downloading multiple files at the same time. My internet speed is
00:48:49 100 megabit and I am almost at full speed. I have started the pods with only the CPU option so they are
00:48:56 using fewer of my credits. If you don't know how
00:49:03 RunPod in this ultimate tutorial video including how to use RunPodctl. Currently I am having
00:49:10 problem to connect jupyter lab of this RunPod. So I have connected to web terminal and in the web
00:49:16 terminal. This is an interface that you get. When you type dir, you will get the directory listing.
00:49:23 So in this directory listing I am going to go models and in here I will go Stable diffusion:
00:49:29 okay, inside this model, I am going inside cd test 2 and when I dir it will show me all of the files.
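The web-terminal steps described here and in the next lines boil down to something like the following sketch; the folder and file names are placeholders based on what is shown in the video, and the matching receive step on the local machine is my assumption about how the runpodctl transfer completes:

```bash
# Inside the RunPod web terminal: navigate to the trained model folder
# (folder and file names below are placeholders)
cd models/Stable-diffusion/test2
dir   # same as `ls`; this is what is typed in the video

# Send the chosen checkpoint with runpodctl; it prints a one-time code that
# you then pass to `runpodctl receive <code>` on your local machine
# (the receive step is my assumption, not shown in the transcript)
runpodctl send test2_2400.safetensors
```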
00:49:38 I am going to download this file. RunPodctl send test 2 2400 safetensors. So now file is getting
00:49:47 downloaded. If you encounter such jupyter error, you can also use this interface and when you click
00:49:54 again, connect to the web terminal it will open another terminal for you. Okay, so all files have
00:50:00 been downloaded and I have moved all of them into my local folder and my local instance of web UI is
00:50:07 running. Now I am going to do the final test with the same prompt on all of the models. All test
00:50:14 files are checked from the checkpoint name test01. By the way, this is missing some. So what we need
00:50:23 to do is click refresh here. Click refresh here again. Okay, zero, one, two, three, four, five,
00:50:30 six, seven, eight nine. All is there, the seed is set, batch size is set, CFG value is set,
00:50:37 and now time to test. There is one thing that I want to mention. Test seven has lesser size than
00:50:43 the other ones because in test seven we used Use EMA plus, use EMA weights for inference. Inference
00:50:50 means that when you are going to generate an image a new image, therefore only EMA weights are saved
00:50:57 and it is lesser than the original weights. You see this is 2.21 gigabytes and other ones are
00:51:04 3.81 gigabytes. So the final experiment has been completed before delving into it. Please consider
00:51:12 subscribing our channel. Support us by joining if you are able to. Join our discord channel. You
00:51:19 will find the link in here or in the description or the comment of the video. And if you support
00:51:26 us on Patreon, I would appreciate that very very much. Currently we have 27 patrons and they are
00:51:32 helping me tremendously. I am hoping that you will also become our Patreon. Thank you so much! Okay,
00:51:38 now time for final comparison. This is test zero. Test one and let's see the results. Okay, I will
00:51:47 slowly show all of the results. This is test zero as you can see, pretty good, pretty decent
00:51:53 quality. This is our base test by the way. This is test one. In the test one. What difference do
00:52:00 we have? Both of them are looking very very good. Can we say one of them is better? So the test one
00:52:07 seems like a little bit better than the original and in the test one we used DEIS for noise
00:52:13 scheduler. Therefore, I think this is improving the overall quality. Not very significantly,
00:52:19 but it is increasing. Okay in the test two, the results are little bit more different than the
00:52:26 test one. You see different pictures generated. This is extremely well stylized but not very
00:52:32 similar to my face. So which one is better? Okay, apparently unfreezing model makes difference,
00:52:38 but is that difference better or not? This is the ultimate question, so I will compare it
00:52:44 with the base test. I am copying this, opening a new page. Double size. Okay, so this is test zero
00:52:51 and this is test two. Which one is better we can say. I think base model is looking a little bit
00:52:59 better. So I can't say unfreeze model making huge difference. It is certainly making difference, but
00:53:06 we can't say huge difference so it is up to you to use it or not. I think both of them are decent,
00:53:12 so it may be a little bit improving or maybe not improving, but I can't say this harmful to enable
00:53:19 this so you can use it from now on. Both results are very decent and when we do unfreeze model,
00:53:27 it took lesser steps to get this quality so that is another important thing to consider. Okay,
00:53:34 this is test three. Let's also compare test three to test base test. Okay,
00:53:39 now we are seeing test zero versus test three. In this case, which one is better. Test three
00:53:46 is more looking like from a real movie if you ask me. So I can say that the quality of test
00:53:52 three is better. The face is somewhat not like me in this test, but the quality is better so it is
00:54:00 up to your taste. If you prefer, you can go with test three configuration which is Lion optimizer.
00:54:06 You may obtain better quality images in terms of reality with Lion optimizer, but it looks like
00:54:15 not exactly my face so therefore it may require more fine tuning of the prompts. But if you can
00:54:22 do that then you may get good results. Better results than the base test for sure. Okay now
00:54:29 test4. In test4 the difference is very clear. You see the darkness and whiteness of the images are
00:54:35 much better because as mentioned in this article, it allows Stable Diffusion to generate very dark or
00:54:43 light images easily. Therefore, you see the images have much better dark and light difference so it
00:54:51 is up to your taste which one you prefer, but this is significantly changing the output results. Let
00:54:57 me show you the comparison. So this is base test versus offset noise. It is up to you to
00:55:03 prefer which one of them you want. If you want to have much better light and dark images then
00:55:10 you should go with offset noise. Or you can set the offset noise lower than I did so it may not
00:55:16 have this much significant impact, but it clearly improved the light and dark difference you see.
00:55:24 Okay now test five. In the test five, this is the results comparison. I can't say test five
00:55:32 is looking better than the base model actually. So what did we use in test five? We used freeze clip
00:55:38 normalization layers. I can say that it didn't improve things much. The images are almost exactly the same,
00:55:45 but which one is better. For example, let's zoom in this one. Freeze normalization layers results.
00:55:51 And this is not freezing, so let's make a better comparison. Okay, so which one is looking better?
00:55:58 Actually, when we look at the fine details, there is a slight difference in here and in
00:56:05 here they are almost looking like same. So I don't know what kind of effect did this freeze
00:56:13 normalization layers made. They are almost looking like same. There are also some subtle differences
00:56:19 in here, you see, but I don't know. I think I probably wouldn't use freeze clip
00:56:28 normalization layers option because I don't see any benefit and it might be slightly worse than
00:56:35 the original. Not sure it is up to you. Okay, now test six. The difference is really, really
00:56:41 huge in test six because it completely changed the output even while we are using the same seed and
00:56:48 same prompt. Okay, in the test six, the face is definitely not my face. It is pretty different,
00:56:54 but it is much more stylized than the original results as you can see, so therefore it is up to
00:57:01 you to use which one you want. In the test six we have used Use EMA + use EMA for prediction.
00:57:08 Therefore, the results are significantly different, but more stylized. This may
00:57:14 work better for when we teach a style, so this is worth to test it. Experiment it for styling. Okay,
00:57:21 now we are seeing test seven results. The results are significantly different in this one as well.
00:57:26 They are not also very good. The faces are not very good, not like me. It is highly stylized,
00:57:32 so therefore these may work better for teaching styles. You see. This is test seven. Use
00:57:38 EMA + use EMA weights for inference. So it is up to you to test this and if you like it more than
00:57:44 you can use it. But this changes how you need to prompt than before. This is for sure. Okay,
00:57:50 this is test eight. The results are different once again, however, they are good in their way. This
00:57:57 is more stylized. The face is mine definitely. They are looking pretty decent as well. So when we
00:58:04 use EMA I think we are able to keep styling better than when we do not use it. Therefore, if I had more
00:58:11 VRAM I think I would use EMA for sure. Because it is certainly improving our success rate. When we
00:58:17 are able to generate more stylized images always we can modify our prompt to get what we want,
00:58:24 so use EMA weights definitely increases the learning success, because you see more
00:58:31 stylizing means that our model learned with lesser over training. So the weights are more generalized
00:58:39 in the underlying context of the model. Therefore, use EMA weights definitely improves
00:58:46 the success rate of the training. However, this requires more than 12 gigabytes VRAM. Probably
00:58:52 you need minimum 16 gigabytes of VRAM. So this is the negative side of the EMA weights. However,
00:59:00 you can always hire a RunPod and do your training on that. If you hire a RunPod,
00:59:05 please register through my referral link which you will find in the description and in the comment
00:59:11 section. I would appreciate that very much. Okay, now this is our final comparison. Test nine versus
00:59:18 test zero. The only difference is that in test nine we didn't use xformers and I can say that
00:59:25 definitely test nine is looking better. It is a personal opinion of course you may find the other
00:59:32 one is better. However, I find that the test nine is able to stylize better also keeping my face.
00:59:39 In some pictures the face is not very similar, but it is, I think definitely better. However,
00:59:45 this is requiring much more VRAM. So you really need a GPU with higher VRAM for this to work.
00:59:53 So this is also out of options for many of the people I saw that it was using 17 gigabytes of
00:59:59 VRAM. Therefore, you probably need a card with 24 gigabytes of VRAM for this, or you can use RunPod.
01:00:07 So you will see the sign up link for RunPod in the description and in the comment section of this
01:00:13 video. If you use these links I would appreciate that very much. That is my referral link.
01:00:18 This is all for today. I literally spent one day just for recording the video and then I will have
01:00:25 to spend a lot of hours to post-process this video, prepare fully manually fixed subtitles,
01:00:32 and prepare the sections of the video. So please like, share, subscribe. Support us by joining,
01:00:39 support us through patreon. I would appreciate that very much. Hopefully see you in another
01:00:44 awesome video and thank you RunPod for providing me credits to run these amazing experiments!