Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test #285
FurkanGozukara
announced in
Tutorials
Full tutorial: https://www.youtube.com/watch?v=Tb4IYIYm4os
Sign up RunPod: https://bit.ly/RunPodIO
Our Discord: https://discord.gg/HbqgGaZVmr
New best training settings for DreamBooth training in Automatic1111 Web UI. If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron: 🥰 https://www.patreon.com/SECourses
Playlist of #StableDiffusion Tutorials, #Automatic1111 and Google Colab Guides, #DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img:
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer:
https://youtu.be/AZg6vzWHOTA
How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3:
https://youtu.be/aAyvsX-EpG4
Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed:
https://youtu.be/Bdl-jWR3Ukc
Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI:
https://youtu.be/vhqqmkTBMlU
Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI:
https://youtu.be/QN1vdGhjcRc
8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI:
https://youtu.be/O01BrQwOd-Q
2400 Photo Of Man classification images:
https://drive.google.com/file/d/1qBf8VyUbmPNalKqm076yOsQjE8BrcG7R/view
00:00:00 Introduction to Best Settings of DreamBooth training experiment
00:00:56 How to close initially started Web UI instance on RunPod Stable Diffusion template
00:02:20 Which RunPod machine you should pick for DreamBooth training and why
00:02:48 Versions used in this experiment: Automatic1111 version, xformers version, DreamBooth version
00:04:20 Best DreamBooth settings for 0 classification images
00:04:45 How to continue DreamBooth training from a certain checkpoint
00:05:12 Used command line arguments for best DreamBooth training
00:05:20 Used extensions list for best DreamBooth training
00:05:45 Starting to set parameters for 0 classification images - equal to fine tuning
00:06:45 Used training dataset and what dataset features you need
00:07:45 Setting concepts tab of DreamBooth training
00:08:00 When and why you should use FileWords for fine tuning, and how to do fine tuning
00:10:15 Best training setup parameters for DreamBooth training when using classification images
00:11:28 How to calculate number of steps for each epoch
00:13:17 All trainings are completed
00:13:49 Comparison of sample and sanity sample images generated during training
00:13:55 Analysis of 0x classification samples
00:14:41 Analysis of 1x classification samples
00:15:14 Analysis of 2x classification samples
00:15:36 Analysis of 5x classification samples
00:16:12 Analysis of 10x classification samples
00:16:34 Analysis of 25x classification samples
00:16:45 Analysis of 50x classification samples
00:17:28 Analysis of 100x classification samples
00:17:49 Analysis of 200x classification samples
00:18:09 Comparing each checkpoint in all of the trained models
00:18:46 How to use x/y/z plot to check different training checkpoints
00:19:51 All grids are generated and how I downloaded them
00:20:40 Analysis of 0x classification x/y/z grid images
00:21:58 Analysis of 1x classification x/y/z grid images
00:23:10 Analysis of 2x classification x/y/z grid images
00:24:03 Analysis of 5x classification x/y/z grid images
00:25:00 Analysis of 10x classification x/y/z grid images
00:25:36 Analysis of 25x classification x/y/z grid images
00:26:15 Analysis of 50x classification x/y/z grid images
00:27:27 Analysis of 100x classification x/y/z grid images
00:28:02 Analysis of 200x classification x/y/z grid images
00:29:00 Summary of the experiment
00:29:40 Very important speech part
Text-Guided View Synthesis
Our technique can synthesize images with specified viewpoints for a subject cat (left to right: top, bottom, side and back views). Note that the generated poses are different from the input poses, and the background changes in a realistic manner given a pose change. We also highlight the preservation of complex fur patterns on the subject cat's forehead.
Property Modification
We show color modifications in the first row (using prompts "a [color] [V] car"), and crosses between a specific dog and different animals in the second row (using prompts "a cross of a [V] dog and a [target species]"). We highlight the fact that our method preserves unique visual features that give the subject its identity or essence, while performing the required property modification.
Accessorization
Outfitting a dog with accessories. The identity of the subject is preserved and many different outfits or accessories can be applied to the dog given a prompt of type "a [V] dog wearing a police/chef/witch outfit". We observe a realistic interaction between the subject dog and the outfits or accessories, as well as a large variety of possible options.
Video Transcription
00:00:00 Greetings everyone.
00:00:01 In this video I am going to conduct a massive experiment on the effect of the number
00:00:06 of classification images when doing Stable Diffusion DreamBooth training.
00:00:09 In the community, there are widely varying recommendations for how many classification
00:00:13 images to use per training instance image.
00:00:15 In the official paper 200 classification regularization images are used per training image.
00:00:20 So in this video I am going to conduct the experiments written here.
00:00:23 I will use 9 RunPod instances to do 9 different DreamBooth trainings.
00:00:28 So all of my instances are running right now.
00:00:30 I have prepared one instance, then cloned it to all of the others.
00:00:34 To do this I have used the runpodctl command.
00:00:37 I have zipped the Stable Diffusion web UI folder, the venv folder and the classification
00:00:41 images folder, then sent them between the different RunPods.
00:00:46 Moreover, when you start your RunPod with the template, it starts a hidden web UI instance,
00:00:53 but you are not able to close it from the Jupyter interface.
00:00:57 So to close it, I first changed relauncher.py like this.
00:01:03 I have added a break into the while loop here.
00:01:06 When it is closed, it won't relaunch again and again.
00:01:10 And to close the initial hidden web UI, I have used this kill command.
00:01:15 With this command, you kill the running web UI instance on port 3000.
00:01:21 Then we will manually launch these web UIs.
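The kill step can be sketched as a small shell snippet. The port number 3000 comes from the video, but the exact command the author used is not shown on screen here, so treat this as one common way to do it (it assumes the `lsof` utility is available on the pod; the relauncher.py change mentioned above is simply adding a `break` inside its relaunch loop):

```shell
# Kill whatever process is listening on port 3000 (the hidden web UI
# started by the RunPod template). Assumes `lsof` is installed.
PID=$(lsof -ti :3000 2>/dev/null || true)
if [ -n "$PID" ]; then
    kill "$PID"
    echo "killed web UI process $PID"
else
    echo "no process listening on port 3000"
fi
```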
00:01:26 So in my My Pods section you can currently see that CPU utilization and memory usage are zero.
00:01:32 That means that no instance of the Automatic1111 web UI is currently running on the pod.
00:01:37 So if you don't know what Stable Diffusion or the Automatic1111 web UI is, I have excellent
00:01:42 tutorials for them.
00:01:43 In this tutorial you can learn how to install and run Automatic1111 web UI on your computer.
00:01:49 In this tutorial, you will learn how to do DreamBooth training from zero to hero.
00:01:54 In this tutorial I am explaining what the new, awesome, fantastic ControlNet is and how
00:01:59 to use it on the Automatic1111 web UI.
00:02:02 And this is the ultimate RunPod tutorial.
00:02:05 So if you are interested in this, you can watch this playlist.
00:02:09 Now I will begin with starting my web UI instances in all of the RunPods.
00:02:15 All Automatic1111 instances are started.
00:02:18 Let me show you the versions and the pods that I am using.
00:02:21 So it is really important to pick the correct pod for DreamBooth training.
00:02:26 I have chosen RTX A4500 pods.
00:02:30 Why?
00:02:31 Because as you can see, these pods have 62GB RAM and 20GB VRAM.
00:02:38 So having more RAM is really important when doing DreamBooth training.
00:02:42 If your pods do not have a sufficient amount of RAM, then you may get the gradio killed
00:02:49 error, which is extremely annoying.
00:02:51 So the versions I am using are python revision 3.10.9 for this experiment.
00:02:57 DreamBooth revision is this one and the SD Web UI revision is this one.
00:03:02 I am using xformers 0.0.17.dev464.
00:03:08 Why?
00:03:10 Because you have to use either the 0.0.14 or the 0.0.17 version of xformers; otherwise,
00:03:18 DreamBooth training will not work.
00:03:20 This is a question that I very commonly get asked.
00:03:23 On Windows you should downgrade your xformers to 0.0.14 revision and I am explaining that
00:03:30 in this video.
00:03:32 On Unix you should upgrade your xformers to 0.0.17 development revision and in this video
00:03:38 I am explaining that.
00:03:40 I have pre-prepared 2400 classification images.
00:03:45 To generate these images, a simple prompt was used, which is our classification regularization
00:03:51 prompt.
00:03:52 The prompt used is "photo of man", and I have set sampling steps to 40; nothing else
00:03:57 is different.
00:03:58 I have used the version 1.5 pruned ckpt file, and in the Stable Diffusion settings the
00:04:05 default VAE is set to the newest and best VAE available, as you can see
00:04:11 right now here.
00:04:13 So I will share the link of this classification data set in the description as a zip file.
00:04:18 You can download and use them if you want.
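The 2400 class images were generated through the web UI itself. For reference, the same generation can also be scripted through the web UI's built-in API (launch with the `--api` flag); the `/sdapi/v1/txt2img` endpoint and field names below are the standard Automatic1111 API, while the batch size of 8 is an arbitrary choice for this sketch:

```python
import json

def class_image_payload(prompt="photo of man", steps=40, batch_size=8):
    """Build the JSON payload for a POST to /sdapi/v1/txt2img.

    Prompt and step count are the ones used in the video; everything
    else is left at the API defaults, matching the author's setup.
    """
    return {"prompt": prompt, "steps": steps, "batch_size": batch_size}

payload = class_image_payload()
print(json.dumps(payload))

# To actually generate (requires the web UI running with --api):
#   import requests
#   r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
#   images = r.json()["images"]   # base64-encoded PNGs
```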
00:04:21 Now I will show you two of the settings.
00:04:24 First one is zero classification and the second one will be 1x classification.
00:04:29 The rest will be the same.
00:04:30 First we will begin with generating our training model.
00:04:35 This one will be 0x class.
00:04:38 I will pick the file from here 1.5 pruned ckpt file as a source checkpoint.
00:04:45 Sometimes I get asked how you can continue your training from a certain checkpoint.
00:04:51 Just make a new model and pick your source checkpoint from here.
00:04:57 Like this, and it will generate a new training model from the checkpoint that you
00:05:02 want to continue from.
00:05:03 It is the same process, and I am not touching the other settings.
00:05:06 They are the default and best settings.
00:05:08 One other thing that I want to mention is that I have only used these command line arguments;
00:05:14 as you can see, nothing else is special or different, and these are the only extensions
00:05:19 I am using right now; they are at the latest version.
00:05:22 Okay all model files are generated in all of the RunPod instances.
00:05:28 For example, this one is 0x classification images.
00:05:30 This one is 1x classification images, then 2x, 5x, 10x, 25x, 50x, 100x, and 200x.
00:05:41 So I will show two of the setups first.
00:05:44 Let's begin with no classification images.
00:05:47 This will basically be fine tuning.
00:05:49 First I will click performance wizard.
00:05:51 Then I'm not going to use any classification images.
00:05:53 I am going to train 200 epochs.
00:05:56 I will save model every 25 epochs.
00:05:58 I will save preview image every five epochs.
00:06:00 I'm not going to change batch size or gradient accumulation steps.
00:06:03 These will affect your training success as well,
00:06:08 because this is mini batches versus larger batches, like full batches.
00:06:12 This is a debated topic in machine learning.
00:06:15 I will use gradient checkpointing.
00:06:17 Why?
00:06:18 Because in this video, I am going to set up the settings so that you can use them on
00:06:23 your computer with a graphics card that has only 12 gigabytes of VRAM.
00:06:27 Therefore, I will use gradient checkpointing.
00:06:30 For the learning rate: I see that the learning rate is increased when we click performance
00:06:35 wizard, so instead of the default I am going to use a lower learning rate, like this.
00:06:39 I'm not going to use center crop or apply horizontal flip.
00:06:43 My images are already prepared by me.
00:06:46 This is my training data set.
00:06:47 You see every background is different.
00:06:50 Clothing is different.
00:06:51 Only the face is common across the images, so you should keep in common only the things
00:06:57 that you want to teach to the model.
00:06:58 The sanity sample prompt will be photo of ohwx man by tomer hanuka.
00:07:05 So the ohwx is our rare token.
00:07:08 Man is our class token.
00:07:09 However, this is zero classification model so we are not going to have any class token
00:07:15 in this particular one.
00:07:17 So for zero classification images, it will be only photo of ohwx by tomer hanuka to see
00:07:23 how it performs during training.
00:07:25 I'm not going to use EMA because, as I said, this is a 12 gigabyte VRAM experiment.
00:07:32 I am going to use bf16.
00:07:34 This is supposed to have better precision, but if your graphics card does not support
00:07:38 it, then you should pick fp16.
00:07:40 We are using xformers and these are the versions I am showing once again.
00:07:45 The other values will be like this.
00:07:47 In the concepts I will set the data set directory like this.
00:07:52 This is the data set directory that is containing these training images.
00:07:56 We are not setting any classification because this is 0x class.
00:08:00 I'm not using any [FileWords], because [FileWords] is something you want to use when
00:08:06 you want to do fine tuning.
00:08:09 When you want to improve the quality of lots of tokens, like teaching castles, rivers,
00:08:15 mountains, and other things, and you have beautiful images, then you should caption those
00:08:20 images with the keywords that you want to improve and associate with them.
00:08:24 If I want to show you an example, let's say you want to improve castle images and you
00:08:30 have this image for fine tuning, then you should caption this file as awesome fantastic
00:08:36 castle in a beautiful forest with an awesome river.
00:08:41 And when doing DreamBooth training, all of these tokens will get associated with this
00:08:47 image and they will get improved along with the unet and the text encoder.
00:08:51 This is when it is useful to use [FileWords].
00:08:54 So when you use [FileWords] like this, it will read the caption of that particular training
00:09:00 image and replace the prompt here with that caption.
00:09:04 So your instance prompt for that image will become the caption that you have used.
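The caption-file convention behind [FileWords] can be sketched like this; the folder and image name are hypothetical examples, and the caption text is the one from the video:

```python
from pathlib import Path

# [FileWords] reads, for each training image, a .txt file with the same
# base name and substitutes its contents into the instance prompt.
# "dataset" and "castle01.png" are hypothetical example names.
dataset = Path("dataset")
dataset.mkdir(exist_ok=True)

captions = {
    "castle01.png": "awesome fantastic castle in a beautiful forest with an awesome river",
}
for image_name, caption in captions.items():
    caption_file = dataset / (Path(image_name).stem + ".txt")
    caption_file.write_text(caption)

print((dataset / "castle01.txt").read_text())
```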
00:09:09 But for teaching faces I am only using a rare token and I am not using any captions
00:09:14 or other things, and we are not going to use a class prompt for this training.
00:09:19 For the sample, I will use photo of ohwx; I am not adding the class token, and you see
00:09:25 class images per instance and the other settings.
00:09:28 I think we need to set this one to zero for it to work correctly, and in the saving settings
00:09:34 I will save them into a sub directory.
00:09:36 I will also generate a ckpt file during saving.
00:09:39 So these are my saving settings.
00:09:42 So all settings are ready.
00:09:43 I will click save settings and hit train.
00:09:46 Now in the terminal window we can see the training has started.
00:09:50 You see the number of batches in each epoch is 12, because we are not using any classification
00:09:56 regularization images.
00:09:57 The number of epochs is 200 and text encoder epochs is 150, because we set the ratio
00:10:05 of text encoder training to 75 percent.
00:10:07 The other settings are displayed here.
00:10:09 It is pretty fast on this graphic card as you can see the loss is looking good.
00:10:14 The VRAM used is 9.6 gigabytes.
00:10:17 Now I will set up the next one which is 1x classification images.
00:10:23 This one is mostly the same.
00:10:25 What is different in this one?
00:10:27 The only difference is that I am adding the class prompt to the sample
00:10:33 prompt, and I am also adding the classification dataset directory, like this, as you can see,
00:10:39 and in the instance prompt I am going to use ohwx man.
00:10:42 Ohwx is our rare token.
00:10:44 Man is our class token and the prompt that I have used to generate classification regularization
00:10:50 images is photo of man.
00:10:53 So you should write the same thing, because we want to keep the
00:10:58 underlying context of the model as close to the original as possible.
00:11:02 We want to have prior-preservation loss, so the sample image prompt will be photo of ohwx
00:11:10 man, like this, and I will use one class image per instance; I am just setting it like this,
00:11:18 and since it already has classification images, it won't generate any new ones.
00:11:23 Save settings.
00:11:24 Hit train!
00:11:25 Okay, this is the terminal of 1x classification images.
00:11:28 Now the number of batches in each epoch is 24, because now it is using the bucketing system.
00:11:34 Therefore, in each epoch it adds the classification images as well as the training images.
00:11:40 So it is double the size of the training image set that you have.
00:11:43 The other settings are like this, same as the previous one and it started training.
00:11:48 We can see here.
00:11:50 The rest of the training will be same.
00:11:53 Only the number of classification images per instance will be different.
00:11:57 Okay, all trainings are started.
00:11:59 You see class images 24 used.
00:12:02 This is 2x.
00:12:03 You see class images 60, this is 5x; 120 class images, this is 10x; 300 class images, this
00:12:10 is 25x; 600, this is 50x.
00:12:14 This is 100x, with 1200 class images, and with 2400 class images this is the 200x.
00:12:21 So each one is going on.
00:12:23 I see that the stylizing capability of the 0x is gone already, and in this one, the 1x
00:12:31 class, the stylizing is also gone at epoch 74; in the 2x class I see that it is still stylizing at epoch 39.
00:12:40 In 5x, yes, stylizing is still available, but what it has learned is not very good yet; this is 38 epochs.
00:12:47 This one is just started 10x class.
00:12:50 Uh, it had an error, so I had to restart.
00:12:53 Okay with 25x class 36 epochs it looks like the best.
00:12:57 And with 50x class I see that the stylizing is the best so far.
00:13:02 It is really good; with 100x class, stylizing is good; with 200x class, stylizing is not very
00:13:09 good, but this is also only at epoch 24 yet.
00:13:12 So when you use more classification images, can we say that it is taking more time?
00:13:17 I'm not sure yet.
00:13:19 Okay, now we need to wait for all of them to finish and then we will compare.
00:13:24 Okay, all trainings are completed.
00:13:26 Let me show you quickly: you see the model epoch is 200, 200, 200, and each one is a different training.
00:13:34 You see 5x classification is at 200.
00:13:36 So all of the trainings are completed without any errors. Now I am going to download the samples
00:13:44 generated during training from all of the RunPods, and we will start comparing them.
00:13:50 I will analyze and comment on them.
00:13:53 Okay, samples are downloaded here.
00:13:55 So I will begin with the 0x classification images.
00:13:59 When we do not use any classification images, the generated samples are like this: okay,
00:14:05 we see that a good styling at 960 steps here and the face is looking decent as well.
00:14:13 The styling capability of the model is lost after 1260 steps.
00:14:19 So how are we going to calculate the epoch count for this?
00:14:23 When we divide that by 12, we find the epoch count.
00:14:28 So this is epoch 105.
00:14:31 So until about 100 epochs,
00:14:33 it was able to stylize when we do not use any classification images.
00:14:37 So we can say that with 80 epochs we could get good results when we do not use any classification
00:14:42 images; we will test that.
00:14:45 Let's look at the results.
00:14:46 When we use only single classification image, we see that it started learning pretty fast.
00:14:53 This is a really good styling.
00:14:54 However, the styling capability of the model is lost pretty quickly after 60 epochs, and
00:15:01 the quality of the generated samples also decreases a lot.
00:15:04 It also looks like it memorized the images instead of learning my face.
00:15:09 So 1x classification doesn't look very good.
00:15:13 We will see results.
00:15:14 Okay, this is 2x classification.
00:15:16 2x means that we have used 2 multiplied by 12, that is 24 classification images, just as a reminder.
00:15:23 It is looking decent for 40 epochs and after 60 epochs it becomes very bad actually.
00:15:30 It also looks like it memorized the images rather than learning the face, so we will see the results.
00:15:36 Okay, this is 5x classification images; after 50 or 60 epochs it looks
00:15:44 decent, and it also loses its styling pretty early, after 100 epochs.
00:15:50 Okay, these are the sample images.
00:15:52 By the way, these images are not styled or beautified.
00:15:56 They are just raw prompts.
00:15:58 Let me show you.
00:15:59 So you see, this is a raw prompt photo of ohwx man and this is the raw sanity prompt
00:16:04 photo of ohwx man by tomer hanuka.
00:16:06 We could add more beautifying tokens to these prompts and also negative prompts to improve
00:16:13 them.
00:16:14 Okay, let's see 10x classification.
00:16:16 In the 10x classification it keeps its styling ability much longer; as you can see, even at
00:16:23 190 epochs, it is still able to stylize my face with the tomer hanuka style.
00:16:28 So you see, as we increase the number of classification images, we are certainly preventing over training.
00:16:34 Okay, this is 25x classification images, and somehow the styling capability is once again
00:16:41 lost quickly, after 60 epochs.
00:16:44 Okay, 50x classification images.
00:16:46 When we use 50 classification images per instance, I see that it is the best one that keeps styling.
00:16:56 Also looks like learning the face.
00:16:58 So in this experiment I find that 50x is the sweet spot for my training data set.
00:17:06 I can't say it will be for you, but 50x looks like a good choice for now.
00:17:10 We will test and see it.
00:17:12 It started to learn the face more slowly than the others.
00:17:15 I think after 60 epochs it becomes somewhat decent, and the best one looks like the 180 epochs
00:17:23 checkpoint, like this.
00:17:24 This one also looks decent.
00:17:26 Let's look at the 100 classification images.
00:17:28 Okay, in the 100 classification images, the results are not fascinating.
00:17:33 Actually, the sanity prompt results become very, very irritating,
00:17:37 and the sample prompts as well.
00:17:39 I don't know why it is like this, but maybe there is a bug in the code that causes some
00:17:45 errors or problems.
00:17:47 It doesn't look good at all.
00:17:49 With 200 classification images it is the same.
00:17:52 After 150 epochs it becomes very bad.
00:17:55 Actually very very bad.
00:17:56 I don't know.
00:17:57 This is very weird.
00:17:59 The prompts are looking correct so this is weird.
00:18:02 I also checked the settings and verified the settings are correct and yes they are correct.
00:18:07 It certainly tried to learn the face, but the results are not very good.
00:18:11 So now, what we are going to do is this: I have prepared a prompt and found a seed that shows my face
00:18:19 in seven of the eight generated images.
00:18:23 They are all my face.
00:18:24 They are stylized as you can see.
00:18:27 So the aim here is comparing each checkpoint and seeing how each model performs.
00:18:34 And how am I going to do that?
00:18:35 I will copy the prompt into each model.
00:18:39 Then I will copy the seed of this prompt.
00:18:43 Like this.
00:18:44 I will set the batch size as eight.
00:18:46 This is important.
00:18:48 Are we done?
00:18:49 no.
00:18:50 We are also going to use x/y/z plot and in x/y/z plot we are going to test different
00:18:56 checkpoints.
00:18:57 You see I have checked checkpoint name here.
00:18:59 I click this.
00:19:00 It will fill the checkpoints like this.
00:19:02 I will delete the first one and I am going to test all of the checkpoints.
00:19:07 So how many images will this generate?
00:19:09 This will generate eight multiplied by eight, 64 images.
00:19:13 Because we have eight checkpoints: this is 25 epochs, this is 50 epochs, 75 epochs, 100 epochs,
00:19:21 125 epochs, 150 epochs, 175 epochs, and this is 200 epochs.
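The image-count arithmetic can be sketched as follows (the checkpoint names are placeholders; the real entries come from the x/y/z plot's checkpoint dropdown):

```python
# The x/y/z plot generates batch_size images for every value on the X axis.
# With 8 saved checkpoints (every 25 epochs up to 200) and batch size 8,
# that is 8 x 8 = 64 images per trained model.
batch_size = 8
checkpoints = [f"epoch-{e}" for e in range(25, 201, 25)]  # 25, 50, ..., 200
total_images = batch_size * len(checkpoints)
print(len(checkpoints), total_images)  # 8 64
```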
00:19:32 I am not going to test other values because I want to see what it will generate for the
00:19:38 same settings in the different trained models.
00:19:42 By the way, this is the first zero classification example.
00:19:45 So we don't need "man" in this one, but in the others we will use "man", the class prompt, and we
00:19:51 are ready to test.
00:19:52 Okay, all grids are generated.
00:19:54 However, they are not displayed in the gradio interface, unfortunately.
00:19:58 So what did I do?
00:19:59 I went to the text to image grids folder and downloaded all of the images one by one and
00:20:05 now they are ready and now time to compare them.
00:20:09 But before doing that, I will now close my pods, since we are done with them, and it is time
00:20:14 to evaluate the results.
00:20:16 You see, currently I am paying 3.37 dollars per hour; time to close them.
00:20:22 After closing all the pods I am paying 0.25 dollars per hour, because I am currently using
00:20:29 a 900 gigabyte exited volume; you see, it is exited.
00:20:35 Therefore, it is spending 0.25 dollars per hour from my credits.
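The hourly rates quoted are the author's figures at recording time, not current RunPod pricing; the savings arithmetic is simply:

```python
# Author's figures from the video: 9 running pods cost 3.37 $/hour in total,
# while keeping only the 900 GB exited volumes costs 0.25 $/hour.
running_rate = 3.37   # $/hour, all pods running
stopped_rate = 0.25   # $/hour, exited volume storage only
hourly_saving = running_rate - stopped_rate
print(f"saving {hourly_saving:.2f} $/hour, {hourly_saving * 24:.2f} $/day")
```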
00:20:39 Okay, let's begin with 0x classification results.
00:20:43 In the 0x classification there is one thing that you need to be careful about.
00:20:48 It starts at 300 steps, because we didn't use any classification images.
00:20:53 Therefore, each epoch equals 12 steps,
00:20:58 the number of training images I have.
00:21:00 When we use classification images, they are also included in the bucket.
00:21:04 Therefore, it will be double this size.
00:21:07 So this is 25 epoch.
00:21:09 These are the results of epoch 25, not very much like my face.
00:21:13 These are the results of epoch 50, with a very low similarity.
00:21:17 These are the results of 75 epoch and yes this one is becoming more similar.
00:21:24 Okay, this one is 100 epochs.
00:21:26 Yes, I see similarity in this one especially.
00:21:29 These are the results of 125 epochs.
00:21:33 As you can see the similarity increases.
00:21:35 However, I wouldn't call them very good results and after 150 epochs it starts to lose stylizing
00:21:42 and also my face.
00:21:44 And these are the results of 175 epochs and these are the results of 200 epochs.
00:21:51 So the best spot for 0x classification is 100 epochs,
00:21:56 which is 1200 steps.
00:21:58 Let's look at the 1x classification results.
00:22:01 You see as I said, the 1x classification starts from 600 steps because classification images
00:22:07 are also included in the bucket.
00:22:09 Therefore, one epoch is 24 steps.
00:22:12 So let me show you each one of them.
00:22:15 This is 50 epochs.
00:22:16 I see the similarity here, but it is not very similar either.
00:22:21 These are the 75 epochs, and the styling ability is decreasing, as you can see.
00:22:27 Okay, this is the 100 epoch.
00:22:29 As you can see the results are not very good.
00:22:33 These are the 125 epoch.
00:22:35 Even though I said close shot.
00:22:37 You see they are all distant shots.
00:22:39 These are 150 epochs.
00:22:41 These are 175 epochs.
00:22:43 And this is the 200 epochs.
00:22:45 You see it is very much over trained and the quality is very bad as you can see.
00:22:50 So what is the best epoch for 1x classification?
00:22:54 We can say epoch 25.
00:22:56 After epoch 25 the results do not get better.
00:23:01 So when you use 1x classification, 25 epochs is the sweet spot for my training data set.
00:23:07 It may change for you.
00:23:09 Okay now time to analyze 2x classification images.
00:23:13 In the 25 epoch these are the results as you can see.
00:23:16 This is not at all my face or the other one.
00:23:20 This is the 50 epoch and now it starts to resemble my face much better.
00:23:24 It is stylizing but not the best results.
00:23:27 So this is the 75 epoch and I can see my face in here, but the styling is lost pretty quickly.
00:23:35 This is 100 epochs as you can see.
00:23:38 It almost lost all of its stylizing capability and after 100 epochs you see it is just simply
00:23:45 printing my face without following our prompt.
00:23:49 Also it starts over training.
00:23:50 Yes, as you can see it is very much over trained at this point.
00:23:55 So the sweet spot for 2x classification looks like 50 epoch as you can see.
00:24:00 By generating a lot of images you may get what you want, but this is still not very good.
00:24:04 Okay now we are at the 5x classification.
00:24:07 In the first image none of the images are like me.
00:24:10 This is the 25 epoch.
00:24:12 This is the 50 epoch and some resemblance starts.
00:24:16 And this is the 75 epoch.
00:24:18 You see, as we increase the number of classification images, it takes more time to learn our face.
00:24:24 However, it is also able to stylize better.
00:24:28 This is the 75 epoch quality.
00:24:30 This is 100 epoch quality.
00:24:33 This is 125 epochs.
00:24:34 Okay, this is 150 epochs as you can see.
00:24:38 Only this one is actually in armor.
00:24:41 So it's already over trained a lot.
00:24:44 So for the 5x classification, the sweet spot we can say is 75 epochs.
00:24:50 It is almost fully stylized.
00:24:52 All of the images are stylized, and all of the images are similar to me.
00:24:57 So therefore this is the sweet spot.
00:25:00 Okay, now time to see 10x classification.
00:25:02 In the first image the resemblance is good.
00:25:05 This is only 25 epoch.
00:25:08 This is 50 epoch.
00:25:09 It is stylized but not very much following our prompt.
00:25:12 This is 75 epoch.
00:25:15 Still stylized but not very much like we are targeting.
00:25:21 This is 100 epochs.
00:25:23 This is 125 epochs.
00:25:25 I think it is starting to over train.
00:25:27 This is 150 epochs as you can see the quality is decreased.
00:25:31 There are some problems and errors.
00:25:34 This is 175 epoch and this is 200 epoch.
00:25:37 Very bad quality.
00:25:38 Okay, this is 25 classification images.
00:25:42 The first one is not at all like me.
00:25:44 This is 25 epoch.
00:25:46 This is 50 epoch.
00:25:47 It is stylized but not very similar to me.
00:25:49 This one looks like me.
00:25:51 But not very good.
00:25:52 This is 75 epoch.
00:25:53 I can say this is better than 50 epoch and this is 100 epoch.
00:25:59 It already looks like over trained.
00:26:02 Some major problems in the images and this is 125 epoch.
00:26:07 Already very much over trained, and so is the rest.
00:26:09 You see it has memorized it, even the elevator or the backgrounds.
00:26:14 Okay, this is 50x classification images.
00:26:17 In the first image I can see the resemblance.
00:26:19 Some very good styling as well.
00:26:21 This is the 50 epoch and the results are really good actually if you ask me, with 50 epoch
00:26:28 and 50 classification images probably I can get whatever I want in a stylized manner.
00:26:34 The distant shot is also decent.
00:26:36 When you want to have a distant shot, then you need to upscale this image, maybe with highres
00:26:43 fix, and you can then inpaint your face.
00:26:46 Then you can obtain very good images with that approach.
00:26:50 So this is 75 epochs, still very much stylized.
00:26:55 I can see a very decent quality.
00:26:58 You see, this one looks pretty good.
00:27:00 This is 100 epochs.
00:27:02 It is starting to lose styling capability.
00:27:05 This is 125 epochs, and I can see it has memorized and is producing bad quality images.
00:27:13 After 125 epochs it starts over training, so you can alternatively slow the training by
00:27:18 halving the learning rate, and that may help you obtain better results.
00:27:24 And this is very bad.
00:27:25 You see totally over trained.
00:27:26 Okay, now time to see 100 classification images.
00:27:30 In the first one, there is almost no resemblance.
00:27:33 This is the 25 epoch.
00:27:34 This is 50 epoch and I can see resemblance.
00:27:37 Actually, these results are also pretty decent for 50 epochs, but this is not like me.
00:27:42 In the 75 epoch, we see these are the generated images.
00:27:46 Yes, it is stylizing, but it is not very well and this is 100 epochs and the quality is
00:27:53 very bad and these are the rest.
00:27:56 It starts over training after 100 epochs, even for 100 classification images.
00:28:01 Okay, now we are at the 200 classification images.
00:28:05 In the 25 epoch version, there is some resemblance, but not very good.
00:28:10 In the 50 epoch, there is some more resemblance, but not very good either.
00:28:15 By the way, even though we are using the same seed, that doesn't mean the results are equal between
00:28:21 different trainings.
00:28:22 The same seed should produce the same results within the same training.
00:28:26 But between different trainings, we cannot say it will be the same.
00:28:30 Same seed basically means that it will start from the same noise and then generate the
00:28:35 image with denoising.
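This seed behavior can be illustrated with NumPy standing in for the actual latent noise sampler (a sketch, not the web UI's internal code):

```python
import numpy as np

def initial_noise(seed, shape=(4, 64, 64)):
    """'Same seed' means the diffusion process starts from the same
    initial noise tensor; NumPy stands in for the real latent sampler."""
    return np.random.default_rng(seed).standard_normal(shape)

a = initial_noise(seed=42)
b = initial_noise(seed=42)
c = initial_noise(seed=43)
print(np.array_equal(a, b))  # True: same seed, same starting noise
print(np.array_equal(a, c))  # False: different seed, different noise
# But the image denoised from that noise still depends on the model
# weights, so the same seed across differently trained checkpoints
# does not give directly comparable images.
```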
00:28:36 Okay, this is 75 epochs.
00:28:39 The results are decent
00:28:41 for 75 epochs.
00:28:42 Actually, I see stylizing and some good, decent results
00:28:46 for the 200x images. This is 100 epochs.
00:28:49 As you can see, uh, the styling ability is starting to be lost once again, and this is 125
00:28:55 epochs.
00:28:56 Okay, this is 150 epoch and it is already over trained.
00:29:00 So what is the summary of this experiment?
00:29:05 As you increase the number of classification images, it doesn't mean you will get better
00:29:10 results.
00:29:11 From this experiment I can say that 50 classification images yielded best results for me.
00:29:17 That is my sweet spot for 12 training images.
00:29:21 I can't say it will be same for you, but 50 images looks like a sweet spot.
00:29:26 Also, you should take more checkpoints and compare them as I did and find your best checkpoint.
00:29:32 This is really important, because the differences between checkpoints are huge.
00:29:38 You should find the best checkpoint that will work best for you.
00:29:42 Okay, this is all for today.
00:29:43 I hope you have enjoyed.
00:29:44 Please like, subscribe, leave a comment.
00:29:47 I will put the used prompt in the comment section.
00:29:50 Please also support us on patreon.
00:29:52 This is really important.
00:29:54 The patreon link will be in the description and also in the comment section.
00:29:58 You see so far we have 25 patrons.
00:30:00 I am hoping that you will also become our patron.
00:30:03 I am also hoping that you will support us by sharing, liking, making a comment and becoming our
00:30:09 patron.
00:30:10 Also, you can make a comment and tell me what you want to see next.
00:30:14 Hopefully see you in better more awesome videos.