Description
Package
FluxGym, commit: main@284ddc7
When did the issue occur?
Running the Package
What GPU / hardware type are you using?
RTX 4070ti Super 16Gb
What happened?
When I press "Train", after all the preparation completes (including the VAE, SFT, and CLIP files), a few lines are printed in the console and then the program hangs forever, with no activity visible in Task Manager. (The Forge package works fine.)
Also, when launching FluxGym, the Stability Matrix console window says: "The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable."
(Manually installing bitsandbytes-windows did not fix the issue reported here.)
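As a sanity check (my assumption, prompted by the bitsandbytes warning, is that the PyTorch build inside the FluxGym venv may be CPU-only), a minimal sketch that can be run with the venv's python.exe (e.g. d:\StabilityMatrix\Packages\FluxGym\venv\Scripts\python.exe):

```python
# Diagnostic sketch, not part of FluxGym: check whether the venv's PyTorch
# was built with CUDA support and can see the RTX 4070 Ti Super.
import torch

print(torch.__version__)          # a "+cpu" suffix would indicate a CPU-only wheel
print(torch.version.cuda)         # None on CPU-only builds
print(torch.cuda.is_available())  # expected to be True on this machine
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```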
Console output
[2025-03-11 10:23:18] [INFO] Running d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\train.bat
[2025-03-11 10:23:18] [INFO]
[2025-03-11 10:23:18] [INFO] d:\StabilityMatrix\Packages\FluxGym>accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 sd-scripts/flux_train_network.py --pretrained_model_name_or_path "d:\StabilityMatrix\Packages\FluxGym\models\unet\flux1-dev.sft" --clip_l "d:\StabilityMatrix\Packages\FluxGym\models\clip\clip_l.safetensors" --t5xxl "d:\StabilityMatrix\Packages\FluxGym\models\clip\t5xxl_fp16.safetensors" --ae "d:\StabilityMatrix\Packages\FluxGym\models\vae\ae.sft" --cache_latents_to_disk --save_model_as safetensors --sdpa --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 42 --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 --network_module networks.lora_flux --network_dim 4 --optimizer_type adafactor --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" --lr_scheduler constant_with_warmup --max_grad_norm 0.0 --sample_prompts="d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\sample_prompts.txt" --sample_every_n_steps="300" --learning_rate 8e-4 --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --fp8_base --highvram --max_train_epochs 16 --save_every_n_epochs 4 --dataset_config "d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\dataset.toml" --output_dir "d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl" --output_name fdsf-jk-kljl --timestep_sampling shift --discrete_flow_shift 3.1582 --model_prediction_type raw --guidance_scale 1 --loss_type l2
[2025-03-11 10:23:24] [INFO] 2025-03-11 10:23:24 INFO highvram is enabled / highvramが有効です train_util.py:4292
[2025-03-11 10:23:24] [INFO] WARNING cache_latents_to_disk is enabled, so cache_latents is also enabled / cache_latents_to_diskが有効なため、cache_latentsを有効にします train_util.py:4309
[2025-03-11 10:23:24] [INFO] 2025-03-11 10:23:24 INFO Checking the state dict: Diffusers or BFL, dev or schnell flux_utils.py:43
[2025-03-11 10:23:24] [INFO] INFO t5xxl_max_token_length: 512 flux_train_network.py:157
[2025-03-11 10:23:24] [INFO] d:\StabilityMatrix\Packages\FluxGym\venv\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
[2025-03-11 10:23:24] [INFO] warnings.warn(
[2025-03-11 10:23:24] [INFO] You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
[2025-03-11 10:23:24] [INFO] INFO Loading dataset config from d:\StabilityMatrix\Packages\FluxGym\outputs\fdsf-jk-kljl\dataset.toml train_network.py:488
[2025-03-11 10:23:24] [INFO] INFO prepare images. train_util.py:2049
[2025-03-11 10:23:24] [INFO] INFO get image size from name of cache files train_util.py:1942
[2025-03-11 10:23:24] [INFO] 0%| | 0/5 [00:00<?, ?it/s]
100%|██████████| 5/5 [00:00<?, ?it/s]
[2025-03-11 10:23:24] [INFO] INFO set image size from cache files: 0/5 train_util.py:1972
[2025-03-11 10:23:24] [INFO] INFO found directory d:\StabilityMatrix\Packages\FluxGym\datasets\fdsf-jk-kljl contains 5 image files train_util.py:1996
[2025-03-11 10:23:24] [INFO] read caption: 0%| | 0/5 [00:00<?, ?it/s]
read caption: 100%|██████████| 5/5 [00:00<00:00, 755.68it/s]
[2025-03-11 10:23:24] [INFO] INFO 50 train images with repeats. train_util.py:2092
[2025-03-11 10:23:24] [INFO] INFO 0 reg images with repeats. train_util.py:2096
[2025-03-11 10:23:24] [INFO] WARNING no regularization images / 正則化画像が見つかりませんでした train_util.py:2101
[2025-03-11 10:23:24] [INFO] INFO [Dataset 0] config_util.py:575
[2025-03-11 10:23:24] [INFO] batch_size: 1
[2025-03-11 10:23:24] [INFO] resolution: (512, 512)
[2025-03-11 10:23:24] [INFO] enable_bucket: False
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO] [Subset 0 of Dataset 0]
[2025-03-11 10:23:24] [INFO] image_dir: "d:\StabilityMatrix\Packages\FluxGym\datasets\fdsf-jk-kljl"
[2025-03-11 10:23:24] [INFO] image_count: 5
[2025-03-11 10:23:24] [INFO] num_repeats: 10
[2025-03-11 10:23:24] [INFO] shuffle_caption: False
[2025-03-11 10:23:24] [INFO] keep_tokens: 1
[2025-03-11 10:23:24] [INFO] caption_dropout_rate: 0.0
[2025-03-11 10:23:24] [INFO] caption_dropout_every_n_epochs: 0
[2025-03-11 10:23:24] [INFO] caption_tag_dropout_rate: 0.0
[2025-03-11 10:23:24] [INFO] caption_prefix: None
[2025-03-11 10:23:24] [INFO] caption_suffix: None
[2025-03-11 10:23:24] [INFO] color_aug: False
[2025-03-11 10:23:24] [INFO] flip_aug: False
[2025-03-11 10:23:24] [INFO] face_crop_aug_range: None
[2025-03-11 10:23:24] [INFO] random_crop: False
[2025-03-11 10:23:24] [INFO] token_warmup_min: 1,
[2025-03-11 10:23:24] [INFO] token_warmup_step: 0,
[2025-03-11 10:23:24] [INFO] alpha_mask: False
[2025-03-11 10:23:24] [INFO] custom_attributes: {}
[2025-03-11 10:23:24] [INFO] is_reg: False
[2025-03-11 10:23:24] [INFO] class_tokens: sdfjklsdkfjsdlkf
[2025-03-11 10:23:24] [INFO] caption_extension: .txt
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO]
[2025-03-11 10:23:24] [INFO] INFO [Prepare dataset 0] config_util.py:587
[2025-03-11 10:23:24] [INFO] INFO loading image sizes. train_util.py:970
[2025-03-11 10:23:24] [INFO] 0%| | 0/5 [00:00<?, ?it/s]
100%|██████████| 5/5 [00:00<00:00, 333.63it/s]
[2025-03-11 10:23:24] [INFO] INFO prepare dataset train_util.py:995
[2025-03-11 10:23:24] [INFO] INFO preparing accelerator train_network.py:562
[2025-03-11 10:23:24] [INFO] accelerator device: cpu
[2025-03-11 10:23:24] [INFO] INFO Checking the state dict: Diffusers or BFL, dev or schnell flux_utils.py:43
[2025-03-11 10:23:24] [INFO] INFO Building Flux model dev from BFL checkpoint flux_utils.py:101
[2025-03-11 10:23:24] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\unet\flux1-dev.sft flux_utils.py:118
[2025-03-11 10:23:24] [INFO] INFO Loaded Flux: <All keys matched successfully> flux_utils.py:137
[2025-03-11 10:23:24] [INFO] INFO Cast FLUX model to fp8. This may take a while. You can reduce the time by using fp8 checkpoint. / FLUXモデルをfp8に変換しています。これには時間がかかる場合があります。fp8チェックポイントを使用することで時間を短縮できます。 flux_train_network.py:108
[2025-03-11 10:24:05] [INFO] 2025-03-11 10:24:05 INFO Building CLIP-L flux_utils.py:179
[2025-03-11 10:24:05] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\clip\clip_l.safetensors flux_utils.py:275
[2025-03-11 10:24:06] [INFO] 2025-03-11 10:24:06 INFO Loaded CLIP-L: <All keys matched successfully> flux_utils.py:278
[2025-03-11 10:24:06] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\clip\t5xxl_fp16.safetensors flux_utils.py:330
[2025-03-11 10:24:06] [INFO] INFO Loaded T5xxl: <All keys matched successfully> flux_utils.py:333
[2025-03-11 10:24:06] [INFO] INFO Building AutoEncoder flux_utils.py:144
[2025-03-11 10:24:06] [INFO] INFO Loading state dict from d:\StabilityMatrix\Packages\FluxGym\models\vae\ae.sft flux_utils.py:149
[2025-03-11 10:24:06] [INFO] INFO Loaded AE: <All keys matched successfully> flux_utils.py:152
[2025-03-11 10:24:06] [INFO] import network module: networks.lora_flux
[2025-03-11 10:24:06] [INFO] INFO [Dataset 0] train_util.py:2585
[2025-03-11 10:24:06] [INFO] INFO caching latents with caching strategy. train_util.py:1095
[2025-03-11 10:24:06] [INFO] INFO caching latents... train_util.py:1144
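The log stops at the line above and nothing further happens. Note that it also reports "accelerator device: cpu" even though an RTX 4070 Ti Super is installed, which matches the bitsandbytes warning. A minimal follow-up sketch (again my assumption, not an official FluxGym step) for confirming which device accelerate selects inside the venv:

```python
# Diagnostic sketch: confirm which device accelerate selects in the FluxGym venv.
from accelerate import Accelerator

accelerator = Accelerator()
print(accelerator.device)  # expected to match "accelerator device: cpu" from the log above
```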
Version
v.2.13.4
What Operating System are you using?
Windows