Replies: 2 comments
-
You probably need to update to the latest git main; I believe that problem has already been solved there. That said, a full fine-tune (FFT) across 48G GPUs likely needs DeepSpeed. This is an 8B parameter model! You're confusing it with the 2.5B parameter one.
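For a rough sense of why the parameter count matters here, the arithmetic below estimates the memory held by weights, gradients, and optimizer state alone. It is only a sketch under stated assumptions: every value is stored in bf16 (2 bytes), the optimizer keeps two states per parameter as AdamW does, and activations, text encoders, and the VAE are ignored. The function name is made up for illustration.

```python
GIB = 1024 ** 3

def full_finetune_state_gib(num_params, bytes_per_value=2, optimizer_states=2):
    """Rough memory for weights + gradients + optimizer states, in GiB.

    Assumes every copy uses the same precision (bf16 by default) and
    ignores activations and any auxiliary models.
    """
    copies = 1 + 1 + optimizer_states  # weights, gradients, optimizer states
    return num_params * bytes_per_value * copies / GIB

large = full_finetune_state_gib(8e9)     # the 8B parameter model
medium = full_finetune_state_gib(2.5e9)  # the 2.5B parameter model
print(f"8B: ~{large:.0f} GiB, 2.5B: ~{medium:.0f} GiB")
```

Under these assumptions the 8B model already needs more than a single 48G card can hold before any activations are counted, which is why sharding the state across GPUs (for example with DeepSpeed) comes up, while the 2.5B model fits with room to spare.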
-
SD3 uses its own 16-channel VAE. Please be sure to follow the SD3 quickstart.
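As a quick sanity check on cached latents, the expected tensor shape can be worked out by hand. The sketch below assumes 16 latent channels (as stated above) and an 8x spatial downscale in the VAE; the helper name is hypothetical and not part of SimpleTuner.

```python
def expected_latent_shape(height, width, latent_channels=16, downscale=8):
    """Shape (channels, height, width) of one cached latent.

    latent_channels=16 matches the SD3 VAE; downscale=8 is an assumption
    about its spatial compression factor.
    """
    return (latent_channels, height // downscale, width // downscale)

# A 256x256 training image, as in the config from the original post.
print(expected_latent_shape(256, 256))
```

If cached latents turn up with 4 channels instead of 16, they were most likely encoded with a different model's VAE and the cache needs to be rebuilt.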
-
Hi, I ran into some issues when training SD3.
1: SD3 can't cache VAE latents because self.transform_sample is None:
I tried adding a function named get_transforms to helpers.models.common.py, copied from the VideoModelFoundation class into the ImageModelFoundation class. I don't know if this is the right solution. It seems to solve the VAE cache problem, but what happens next doesn't seem right.
func:
Thanks a lot for your work!
config.json
{
  "--resume_from_checkpoint": "latest",
  "--data_backend_config": "config/multidatabackend.json",
  "--aspect_bucket_rounding": 2,
  "--seed": 42,
  "--minimum_image_size": 0,
  "--disable_benchmark": false,
  "--output_dir": "output/models",
  "--max_train_steps": 10000,
  "--num_train_epochs": 0,
  "--checkpointing_steps": 500,
  "--checkpoints_total_limit": 5,
  "--attention_mechanism": "diffusers",
  "--tracker_project_name": "full-training",
  "--tracker_run_name": "simpletuner-full",
  "--report_to": "tensorboard",
  "--model_type": "full",
  "--pretrained_model_name_or_path": "stabilityai/stable-diffusion-3.5-large",
  "--model_family": "sd3",
  "--train_batch_size": 1,
  "--gradient_checkpointing": "true",
  "--caption_dropout_probability": 0.1,
  "--resolution_type": "pixel_area",
  "--resolution": 256,
  "--validation_seed": 42,
  "--validation_steps": 500,
  "--validation_resolution": "256x256",
  "--validation_guidance": 5.0,
  "--validation_guidance_rescale": "0.0",
  "--validation_num_inference_steps": "20",
  "--validation_prompt": "A walking dog.",
  "--mixed_precision": "bf16",
  "--optimizer": "adamw_bf16",
  "--learning_rate": "1e-6",
  "--lr_scheduler": "polynomial",
  "--lr_warmup_steps": 100,
  "--base_model_precision": "no_change",
  "--validation_torch_compile": "false"
}
multidatabackend.json
[
  {
    "id": "text-embed-cache",
    "dataset_type": "text_embeds",
    "default": true,
    "type": "local",
    "cache_dir": "/vol/SimpleTuner/cache/sd3/text",
    "write_batch_size": 128
  },
  {
    "id": "fac-1024",
    "type": "local",
    "instance_data_dir": "/home/vol/dataset/fac/image_set",
    "crop": false,
    "resolution_type": "pixel_area",
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "/vol/SimpleTuner/cache/sd3/vae/1024",
    "resolution": 256,
    "minimum_image_size": 224,
    "repeats": 1
  },
  {
    "id": "fac-crop-1024",
    "type": "local",
    "instance_data_dir": "/home/vol/dataset/fac/image_set",
    "crop": true,
    "crop_aspect": "square",
    "crop_style": "center",
    "vae_cache_clear_each_epoch": false,
    "resolution_type": "pixel_area",
    "metadata_backend": "discovery",
    "caption_strategy": "textfile",
    "cache_dir_vae": "/vol/SimpleTuner/cache/sd3/vae-crop/1024",
    "resolution": 256,
    "minimum_image_size": 224,
    "repeats": 1
  }
]
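Since both image datasets above read from the same instance_data_dir, one thing worth ruling out when the VAE cache misbehaves is two entries sharing an id or a cache_dir_vae. The check below is a minimal sketch, not part of SimpleTuner: find_backend_problems is a hypothetical helper, and the assumption that each dataset needs its own id and VAE cache directory is mine. The example list is a trimmed copy of the config above.

```python
import json
from collections import Counter

def find_backend_problems(backends):
    """Return human-readable problems found in a list of backend entries."""
    problems = []
    ids = Counter(entry.get("id") for entry in backends)
    problems += [f"duplicate id: {name}" for name, n in ids.items() if n > 1]
    # Only image datasets carry a VAE cache; text-embed entries are skipped.
    vae_dirs = Counter(
        entry["cache_dir_vae"]
        for entry in backends
        if entry.get("dataset_type") != "text_embeds" and "cache_dir_vae" in entry
    )
    problems += [f"shared cache_dir_vae: {path}" for path, n in vae_dirs.items() if n > 1]
    return problems

# Trimmed copy of the multidatabackend.json entries shown above.
example = json.loads("""
[
  {"id": "text-embed-cache", "dataset_type": "text_embeds",
   "cache_dir": "/vol/SimpleTuner/cache/sd3/text"},
  {"id": "fac-1024", "cache_dir_vae": "/vol/SimpleTuner/cache/sd3/vae/1024"},
  {"id": "fac-crop-1024", "cache_dir_vae": "/vol/SimpleTuner/cache/sd3/vae-crop/1024"}
]
""")
print(find_backend_problems(example))
```

To run it against the real file, replace the inline example with `json.load(open("config/multidatabackend.json"))`. The config as posted passes both checks, so a clash between the two datasets is not the cause here.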