Replies: 1 comment
I understand perfectly. I'm very cautious about dataset quality myself. I can't really help you much because I haven't found the perfect methodology / settings to produce good results either, especially for body movements. I use some tips and tricks found in a Civitai tutorial, and this is my (current) configuration. I'm not sure the Prodigy optimizer is very useful here, but it works well with the Cosine Annealing LR scheduler: the learning rate is set at 1 and decreases to 0 over the whole training run in a cosine fashion, while Prodigy manages the effective LR itself. Here is the …; the tutorial uses …; for my current training session, I use the …; the training starts at an average loss of …

The only advice I can give, even if it remains a feeling, is that mixing image and video training is a bad idea. It seems that a small image dataset is perfect for a character LoRA, but it loses certain physical attributes and character traits (mouth movement, overall bearing, ...).

I'd like to experiment with a method that involves training a multitude of LoRAs on limited datasets: portraits, full-body shots, single images in specific poses and attitudes, then training on short videos for specific body movements. Next, I'd like to try merging the LoRAs to see if it's worth the effort.

Obviously, large datasets are a bad idea. In my case, 27 images already seem like too many; 13 to 17 images seem reasonable, provided you don't overdo it with near-identical images. Front, profile, close-up, half-body, and full-body shots are enough to achieve good results. The same goes for video-based training: there's no need to produce long videos. Keeping a few seconds of a movement (1 to 3 seconds) is more than enough. I tried long, chunked videos, but it had no impact other than slowing down the training and wasting GPU power.

So I'm looking for a method to merge LoRAs trained on the Hunyuan model to experiment with this idea.
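For reference, here is a minimal sketch of the optimizer/scheduler pairing described above (Prodigy at LR 1 with cosine annealing down to 0) in plain PyTorch, assuming the `prodigyopt` package. The model, step count, and weight decay are illustrative assumptions, not the actual trainer configuration.

```python
# Minimal sketch: Prodigy + cosine annealing, as described above.
# Assumes the prodigyopt package (pip install prodigyopt); the model,
# step count, and weight decay are illustrative, not the real config.
import torch
from prodigyopt import Prodigy
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(128, 128)   # stand-in for the LoRA parameters
total_steps = 2400                  # illustrative step count

# Prodigy is given lr=1.0 and estimates the effective step size itself.
optimizer = Prodigy(model.parameters(), lr=1.0, weight_decay=0.01)

# Cosine annealing then scales that lr from 1 down to 0 over training.
scheduler = CosineAnnealingLR(optimizer, T_max=total_steps, eta_min=0.0)

for step in range(total_steps):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 128)).pow(2).mean()   # dummy loss
    loss.backward()
    optimizer.step()
    scheduler.step()
```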
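On the merging experiment: a simple starting point is a naive weighted sum of matching keys across two LoRA safetensors files, sketched below. The file names and the 0.5/0.5 ratio are hypothetical, and note that summing the low-rank factors is not equivalent to summing the effective weight deltas, so whether this behaves well on Hunyuan LoRAs is something to test rather than assume.

```python
# Naive weighted merge of two LoRA checkpoints in safetensors format.
# File names and the 0.5/0.5 ratio are hypothetical; summing the
# lora_down/lora_up factors is only an approximation of merging the
# effective deltas, so results need to be checked empirically.
from safetensors.torch import load_file, save_file

lora_a = load_file("character_portraits_lora.safetensors")
lora_b = load_file("body_motion_lora.safetensors")
w_a, w_b = 0.5, 0.5

merged = {}
for key in lora_a.keys() | lora_b.keys():
    if key in lora_a and key in lora_b:
        merged[key] = w_a * lora_a[key] + w_b * lora_b[key]
    else:
        # Keep keys that exist in only one of the two LoRAs unchanged.
        merged[key] = lora_a.get(key, lora_b.get(key))

save_file(merged, "merged_lora.safetensors")
```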
Hi all, I've recently returned to trying to train Hunyuan after having great success with Wan and... I just can't get decent results. I've tried all kinds of different combinations of settings, and what got me closest are the same settings I use for Wan: Dim 16, Alpha 16, LR 2e-5, constant_with_warmup for 2400 steps. That's closer to what I'm after, but still nothing like the stellar results I've been getting with Wan; there are a lot of issues with fine hand movements and physical interactions. I've been considering revisiting cosine scheduling at this point. I've tried LRs from 2e-5 to 2e-4, Dim 8-32, Alpha = 1, half of dim, or equal to dim, with and without LoRA+, etc.

My datasets are all very high quality (I'm kind of obsessive about that): 20-30 videos, hand-captioned in a descriptive style, cropped to remove watermarks and provide aspect-ratio variation, and normalized to 24 fps. I've read lots of guides and have even written my own at this point, but I'm just not getting the results I've seen others achieve. Any help is appreciated!
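For anyone comparing the two schedules mentioned here, the sketch below traces constant_with_warmup against cosine-with-warmup over 2400 steps using the `transformers` scheduler helpers. The 240-step warmup and 2e-5 base LR are illustrative assumptions, not values taken from the post.

```python
# Sketch: shape of constant_with_warmup vs. cosine over 2400 steps.
# The 240-step warmup and 2e-5 base LR are illustrative assumptions.
import torch
from transformers import (
    get_constant_schedule_with_warmup,
    get_cosine_schedule_with_warmup,
)

def lr_curve(make_scheduler, steps=2400, base_lr=2e-5):
    """Record the learning rate at every step for a given scheduler."""
    opt = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=base_lr)
    sched = make_scheduler(opt)
    lrs = []
    for _ in range(steps):
        lrs.append(opt.param_groups[0]["lr"])
        opt.step()
        sched.step()
    return lrs

constant = lr_curve(
    lambda o: get_constant_schedule_with_warmup(o, num_warmup_steps=240)
)
cosine = lr_curve(
    lambda o: get_cosine_schedule_with_warmup(
        o, num_warmup_steps=240, num_training_steps=2400
    )
)

print(constant[0], constant[-1])  # starts at 0 during warmup, ends flat at 2e-5
print(cosine[0], cosine[-1])      # starts at 0, decays back toward 0 by the end
```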