On master-10c6501, SDXL embeddings crash with an assertion failure, either on a CPU build or when passing --clip-on-cpu. The Vulkan backend works fine.
The test below uses CyberRealisticPony_v7 with its positive embedding CyberRealisticPony_POSV1, but every model+embedding combination I tried crashed the same way, at the assertion on ggml-cpu.c:9684:
./sd --model ./cyberrealisticPony_v7.safetensors --embd-dir . -p CyberRealisticPony_POSV1 --steps 1 --cfg-scale 1 --clip-on-cpu
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Vega 11 Graphics (RADV RAVEN) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | matrix cores: none
[INFO ] stable-diffusion.cpp:197 - loading model from './cyberrealisticPony_v7.safetensors'
[INFO ] model.cpp:908 - load ./cyberrealisticPony_v7.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:244 - Version: SDXL
[INFO ] stable-diffusion.cpp:277 - Weight type: f16
[INFO ] stable-diffusion.cpp:278 - Conditioner weight type: f16
[INFO ] stable-diffusion.cpp:279 - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:280 - VAE weight type: f32
[WARN ] stable-diffusion.cpp:287 - !!!It looks like you are using SDXL model. If you find that the generated images are completely black, try specifying SDXL VAE FP16 Fix with the --vae parameter. You can find it here: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
[INFO ] stable-diffusion.cpp:324 - CLIP: Using CPU backend
|==================================================| 2641/2641 - 500.00it/s
[INFO ] stable-diffusion.cpp:503 - total params memory size = 6751.89MB (VRAM 4994.54MB, RAM 1757.36MB): clip 1757.36MB(RAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
[INFO ] stable-diffusion.cpp:522 - loading model from './cyberrealisticPony_v7.safetensors' completed, taking 6.45s
[INFO ] stable-diffusion.cpp:556 - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:690 - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1246 - apply_loras completed, taking 0.00s
[INFO ] model.cpp:908 - load ./CyberRealisticPony_POSV1.safetensors using safetensors format
|==================================================| 2/2 - 0.00it/s
/tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: /tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: /tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed/tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: GGML_ASSERT(i01 >= 0 && i01 < ne01) failedGGML_ASSERT(i01 >= 0 && i01 < ne01) failed
GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
[New LWP 82838]
[New LWP 82845]
[New LWP 82846]
[New LWP 82847]
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f85d30f2c17 in __GI___wait4 (pid=82848, stat_loc=0x7ffed6595704, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007f85d30f2c17 in __GI___wait4 (pid=82848, stat_loc=0x7ffed6595704, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x0000562553811811 in ggml_abort ()
#2 0x000056255376434c in ggml_compute_forward_get_rows ()
#3 0x000056255378588d in ggml_graph_compute_thread.isra ()
#4 0x00007f85d35b60b6 in GOMP_parallel () from /lib/x86_64-linux-gnu/libgomp.so.1
#5 0x00005625537883af in ggml_graph_compute ()
#6 0x00005625537887e2 in ggml_backend_cpu_graph_compute(ggml_backend*, ggml_cgraph*) ()
#7 0x0000562553827edc in ggml_backend_graph_compute ()
#8 0x0000562553693469 in GGMLRunner::compute(std::function<ggml_cgraph* ()>, int, bool, ggml_tensor**, ggml_context*) ()
#9 0x0000562553697bab in FrozenCLIPEmbedderWithCustomWords::get_learned_condition_common(ggml_context*, int, std::vector<int, std::allocator<int> >&, std::vector<float, std::allocator<float> >&, int, int, int, int, bool) ()
#10 0x00005625537132dc in FrozenCLIPEmbedderWithCustomWords::get_learned_condition(ggml_context*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int, int, int, bool) ()
#11 0x0000562553686818 in generate_image(sd_ctx_t*, ggml_context*, ggml_tensor*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, float, float, float, int, int, sample_method_t, std::vector<float, std::allocator<float> > const&, long, int, sd_image_t const*, float, float, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<int, std::allocator<int> >, float, float, float, ggml_tensor*) ()
#12 0x0000562553689481 in txt2img ()
#13 0x00005625535f5173 in main ()
[Inferior 1 (process 82837) detached]
Aborted (core dumped)
Disabling the assertion avoids the crash, so I'm guessing the CPU backend is catching a non-critical error ignored by the Vulkan backend.
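For context, here is a minimal sketch of the kind of check that assertion appears to perform. It is not ggml's actual code: `gather_rows_checked` and its parameters are hypothetical names, and I'm assuming that `i01` is the row index being looked up and `ne01` is the number of rows in the source tensor, as the names and the `ggml_compute_forward_get_rows` frame in the backtrace suggest.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of ggml): copy the rows of `table` selected
 * by `ids` into `out`, refusing any index outside [0, n_rows).
 * `table` holds n_rows rows of row_len floats each, stored contiguously.
 * Returns 0 on success, -1 at the first out-of-range index. */
static int gather_rows_checked(const float *table, int64_t n_rows, int64_t row_len,
                               const int32_t *ids, int64_t n_ids, float *out) {
    for (int64_t i = 0; i < n_ids; ++i) {
        const int64_t row = ids[i];
        /* Same condition as the failing assertion, with row ~ i01 and n_rows ~ ne01. */
        if (row < 0 || row >= n_rows) {
            fprintf(stderr, "id %lld at position %lld is outside [0, %lld)\n",
                    (long long) row, (long long) i, (long long) n_rows);
            return -1;
        }
        memcpy(out + i * row_len, table + row * row_len,
               (size_t) row_len * sizeof(float));
    }
    return 0;
}
```

If that reading is right, an index that fails this check would make the copy read outside the table, so removing the assertion may hide the problem rather than make it harmless. The output could then contain whatever happens to sit in adjacent memory.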
@stduhpf, could this be related to your SDXL embeddings fix?