Description
Package
FluxGym, commit: main@284ddc7
When did the issue occur?
Running the Package
What GPU / hardware type are you using?
RTX 4070ti Super 16Gb
What happened?
When I press "Train", after all the preparation completes (including the VAE, SFT, and CLIP files), a few lines are printed in the console and then the program hangs forever, with no activity visible in Task Manager. (The Forge package works fine.)
Also, when launching FluxGym, the Stability Matrix console window says: "The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable."
(Manually installing bitsandbytes-windows did not fix the issue reported here.)
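As a sanity check (my assumption, prompted by the bitsandbytes warning, is that the PyTorch build inside the FluxGym venv may be CPU-only), a minimal sketch that can be run with the venv's python.exe (e.g. d:\StabilityMatrix\Packages\FluxGym\venv\Scripts\python.exe):

```python
# Diagnostic sketch, not part of FluxGym: check whether the venv's PyTorch
# was built with CUDA support and can see the RTX 4070 Ti Super.
import torch

print(torch.__version__)          # a "+cpu" suffix would indicate a CPU-only wheel
print(torch.version.cuda)         # None on CPU-only builds
print(torch.cuda.is_available())  # expected to be True on this machine
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```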
Console output
[2025-03-11 10:23:18] [INFO] Running d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\train.bat
[2025-03-11 10:23:18] [INFO]
[2025-03-11 10:23:18] [INFO] d:\StabilityMatrix\Packages\FluxGym>accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 sd-scripts/flux_train_network.py --pretrained_model_name_or_path "d:\StabilityMatrix\Packages\FluxGym\models\unet\flux1-dev.sft" --clip_l "d:\StabilityMatrix\Packages\FluxGym\models\clip\clip_l.safetensors" --t5xxl "d:\StabilityMatrix\Packages\FluxGym\models\clip\t5xxl_fp16.safetensors" --ae "d:\StabilityMatrix\Packages\FluxGym\models\vae\ae.sft" --cache_latents_to_disk --save_model_as safetensors --sdpa --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 42 --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 --network_module networks.lora_flux --network_dim 4 --optimizer_type adafactor --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" --lr_scheduler constant_with_warmup --max_grad_norm 0.0 --sample_prompts="d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\sample_prompts.txt" --sample_every_n_steps="300" --learning_rate 8e-4 --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --fp8_base --highvram --max_train_epochs 16 --save_every_n_epochs 4 --dataset_config "d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\dataset.toml" --output_dir "d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl" --output_name fdsf-jk-kljl --timestep_sampling shift --discrete_flow_shift 3.1582 --model_prediction_type raw --guidance_scale 1 --loss_type l2
[2025-03-11 10:23:24] [INFO] 2025-03-11 10:23:24 INFO highvram is enabled / highvramが有効です train_util.py:4292
[2025-03-11 10:23:24] [INFO] WARNING cache_latents_to_disk is enabled, so cache_latents is also enabled / cache_latents_to_diskが有効なため、cache_latentsを有効にします train_util.py:4309
[2025-03-11 10:23:24] [INFO] 2025-03-11 10:23:24 INFO Checking the state dict: Diffusers or BFL, dev or schnell flux_utils.py:43
[2025-03-11 10:23:24] [INFO] INFO t5xxl_max_token_length: 512 flux_train_network.py:157
[2025-03-11 10:23:24] [INFO] d:\StabilityMatrix\Packages\FluxGym\venv\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
[2025-03-11 10:23:24] [INFO] warnings.warn(
[2025-03-11 10:23:24] [INFO] You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
[2025-03-11 10:23:24] [INFO] INFO Loading dataset config from d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\dataset.toml train_network.py:488
[2025-03-11 10:23:24] [INFO] INFO prepare images. train_util.py:2049
[2025-03-11 10:23:24] [INFO] INFO get image size from name of cache files train_util.py:1942
[2025-03-11 10:23:24] [INFO] 0%| | 0/5 [00:00<?, ?it/s]
100%|██████████| 5/5 [00:00<?, ?it/s]
[2025-03-11 10:23:24] [INFO] INFO set image size from cache files: 0/5 train_util.py:1972
[2025-03-11 10:23:24] [INFO] INFO found directory d:\StabilityMatrix\Packages\FluxGym\datasets\fdsf-jk-kljl contains 5 image files train_util.py:1996
[2025-03-11 10:23:24] [INFO] read caption: 0%| | 0/5 [00:00<?, ?it/s]
read caption: 100%|██████████| 5/5 [00:00<00:00, 755.68it/s]
[2025-03-11 10:23:24] [INFO] INFO 50 train images with repeats. train_util.py:2092
[2025-03-11 10:23:24] [INFO] INFO 0 reg images with repeats. train_util.py:2096
[2025-03-11 10:23:24] [INFO] WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:2101
[2025-03-11 10:23:24] [INFO] INFO [Dataset 0] config_util.py:575
[2025-03-11 10:23:24] [INFO] batch_size: 1
[2025-03-11 10:23:24] [INFO] resolution: (512, 512)
[2025-03-11 10:23:24] [INFO] enable_bucket: False
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO] [Subset 0 of Dataset 0]
[2025-03-11 10:23:24] [INFO] image_dir: "d:\StabilityMatrix\Packages\FluxGym\datasets\fdsf-jk-kljl"
[2025-03-11 10:23:24] [INFO] image_count: 5
[2025-03-11 10:23:24] [INFO] num_repeats: 10
[2025-03-11 10:23:24] [INFO] shuffle_caption: False
[2025-03-11 10:23:24] [INFO] keep_tokens: 1
[2025-03-11 10:23:24] [INFO] caption_dropout_rate: 0.0
[2025-03-11 10:23:24] [INFO] caption_dropout_every_n_epochs: 0
[2025-03-11 10:23:24] [INFO] caption_tag_dropout_rate: 0.0
[2025-03-11 10:23:24] [INFO] caption_prefix: None
[2025-03-11 10:23:24] [INFO] caption_suffix: None
[2025-03-11 10:23:24] [INFO] color_aug: False
[2025-03-11 10:23:24] [INFO] flip_aug: False
[2025-03-11 10:23:24] [INFO] face_crop_aug_range: None
[2025-03-11 10:23:24] [INFO] random_crop: False
[2025-03-11 10:23:24] [INFO] token_warmup_min: 1,
[2025-03-11 10:23:24] [INFO] token_warmup_step: 0,
[2025-03-11 10:23:24] [INFO] alpha_mask: False
[2025-03-11 10:23:24] [INFO] custom_attributes: {}
[2025-03-11 10:23:24] [INFO] is_reg: False
[2025-03-11 10:23:24] [INFO] class_tokens: sdfjklsdkfjsdlkf
[2025-03-11 10:23:24] [INFO] caption_extension: .txt
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO] INFO [Prepare dataset 0] config_util.py:587
[2025-03-11 10:23:24] [INFO] INFO loading image sizes. train_util.py:970
[2025-03-11 10:23:24] [INFO] 0%| | 0/5 [00:00<?, ?it/s]
100%|██████████| 5/5 [00:00<00:00, 333.63it/s]
[2025-03-11 10:23:24] [INFO] INFO prepare dataset train_util.py:995
[2025-03-11 10:23:24] [INFO] INFO preparing accelerator train_network.py:562
[2025-03-11 10:23:24] [INFO] accelerator device: cpu
[2025-03-11 10:23:24] [INFO] INFO Checking the state dict: Diffusers or BFL, dev or schnell flux_utils.py:43
[2025-03-11 10:23:24] [INFO] INFO Building Flux model dev from BFL checkpoint flux_utils.py:101
[2025-03-11 10:23:24] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\unet\flux1-dev.sft flux_utils.py:118
[2025-03-11 10:23:24] [INFO] INFO Loaded Flux: <All keys matched successfully> flux_utils.py:137
[2025-03-11 10:23:24] [INFO] INFO Cast FLUX model to fp8. This may take a while. You can reduce the time by using fp8 checkpoint. / FLUXモデルをfp8に変換しています。これには時間がかかる場合があります。fp8チェックポイントを使用することで時間を短縮できます。 flux_train_network.py:108
[2025-03-11 10:24:05] [INFO] 2025-03-11 10:24:05 INFO Building CLIP-L flux_utils.py:179
[2025-03-11 10:24:05] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\clip\clip_l.safetensors flux_utils.py:275
[2025-03-11 10:24:06] [INFO] 2025-03-11 10:24:06 INFO Loaded CLIP-L: <All keys matched successfully> flux_utils.py:278
[2025-03-11 10:24:06] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\clip\t5xxl_fp16.safetensors flux_utils.py:330
[2025-03-11 10:24:06] [INFO] INFO Loaded T5xxl: <All keys matched successfully> flux_utils.py:333
[2025-03-11 10:24:06] [INFO] INFO Building AutoEncoder flux_utils.py:144
[2025-03-11 10:24:06] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\vae\ae.sft flux_utils.py:149
[2025-03-11 10:24:06] [INFO] INFO Loaded AE: <All keys matched successfully> flux_utils.py:152
[2025-03-11 10:24:06] [INFO] import network module: networks.lora_flux
[2025-03-11 10:24:06] [INFO] INFO [Dataset 0] train_util.py:2585
[2025-03-11 10:24:06] [INFO] INFO caching latents with caching strategy. train_util.py:1095
[2025-03-11 10:24:06] [INFO] INFO caching latents... train_util.py:1144
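The log stops at the line above and nothing further happens. Note that it also reports "accelerator device: cpu" even though an RTX 4070 Ti Super is installed, which matches the bitsandbytes warning. A minimal follow-up sketch (again my assumption, not an official FluxGym step) for confirming which device accelerate selects inside the venv:

```python
# Diagnostic sketch: confirm which device accelerate selects in the FluxGym venv.
from accelerate import Accelerator

accelerator = Accelerator()
print(accelerator.device)  # expected to match "accelerator device: cpu" from the log above
```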
Version
v.2.13.4
What Operating System are you using?
Windows