Dockerized way to fine-tune the FLUX.1-dev model using the Dreambooth method, with memory-efficient and configurable training.
Dockerhub: kopyl/train-flux-kohya-sd-scripts
Simplified training run guide.
Throughout my ML career I found that the best way to train diffusion models is with kohya's sd-scripts repo.
Some of the advantages I like most:
- Way fewer bugs than the training scripts in the huggingface/diffusers repo have;
- Latest and greatest models to train;
- Amazing speed optimizations which I could not achieve myself re-writing huggingface/diffusers' Flux training code;
- Not very difficult to understand in terms of architecture and code readability.
While the repo is really nice to use, you have to spend a while setting up even a simple Flux Dreambooth training environment, which basically means:
- Install everything the repo requires you to, sometimes even a newer version of Python. And I always had to install additional packages which are out of scope of the ones required by the sd-scripts repo, like `torchvision` and `opencv`;
- Copy all the models into your project environment (and find them on the internet if you don't happen to casually store 30 GB of data on your computer).
- Deploy a container (or run on your own machine);
- `docker exec -it {name} bash` into the container;
- Upload photos of your subject to the `dataset` directory and change the subject's identifier if needed (like `sks woman`) in `dataset-config.toml`;
- Run the training script: `bash run-training.sh`;
- When the training is finished, you will find the trained Flux model (transformer type) in the `/output` directory of the container.
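For reference, here is a minimal sketch of what a `dataset-config.toml` for this kind of run can look like. The keys follow kohya's sd-scripts dataset config schema, but the specific paths and values below are illustrative assumptions, not necessarily the file shipped in this image:

```toml
[general]
caption_extension = ".txt"

[[datasets]]
resolution = 1024
batch_size = 1

  [[datasets.subsets]]
  image_dir = "/dataset"       # where you uploaded your subject photos
  class_tokens = "sks woman"   # the subject identifier mentioned above
  num_repeats = 10
```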
Before running the training, you can also customize it:
- Remove the `--apply_t5_attn_mask` parameter if you want faster training. It slightly increases quality and slightly reduces training speed (in my measurements on a server with an NVIDIA H100: 2.53 s/it with the attention mask and 1.91 s/it without). So far I think the quality is worth the time sacrifice. VRAM usage is around the same;
- Change the `--save_every_n_epochs` parameter;
- Change the `--max_train_epochs` parameter. But for the most optimal training process I recommend keeping it at 300;
- Change the `--sample_every_n_epochs` parameter;
- Change the `--learning_rate` parameter;
- Adjust the contents of the `sample_prompts.txt` file to fit the subject token. I.e.: if the class token is `sks man`, prompts start with `a photo of sks man...`, and I'm training a model on a woman, then I'd need to change the `sks man` part in all prompts to `sks woman`. Also change the class token in the `dataset-config.toml` file. (prompt formatting syntax)
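For example, a `sample_prompts.txt` using the `sks woman` token could look like the lines below. The `--w`/`--h`/`--s`/`--d` options (width, height, sampling steps, seed) come from sd-scripts' sample-prompt syntax; the prompts themselves are just illustrative:

```text
a photo of sks woman --w 1024 --h 1024 --s 20 --d 42
a photo of sks woman smiling, studio lighting --w 1024 --h 1024 --s 20 --d 42
```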
- Make sure you have the latest version of diffusers from the official repo;
- Do a couple of imports (`torch` is needed for the dtype):
import torch
from diffusers import FluxTransformer2DModel, FluxPipeline
- Load the transformer model you just trained:
transformer = FluxTransformer2DModel.from_single_file("finetuned-model.safetensors", torch_dtype=torch.bfloat16)
- And finally load the default model from black-forest-labs/FLUX.1-dev, swapping in the main model component — the transformer — with code like this:
pipe = FluxPipeline.from_pretrained("models/flux-dev-model", transformer=transformer, torch_dtype=torch.bfloat16)