SDXL crash with embeddings and clip on CPU

On master-10c6501, using an embedding with an SDXL model crashes with an assertion failure, both on a CPU build and when passing --clip-on-cpu. With CLIP running on the Vulkan backend, the same setup works fine.

The following test uses CyberRealisticPony_v7 and its positive embedding CyberRealisticPony_POSV1, but every model+embedding combination I tried crashed in the same way, in ggml-cpu.c:

```text
./sd --model ./cyberrealisticPony_v7.safetensors --embd-dir . -p CyberRealisticPony_POSV1 --steps 1 --cfg-scale 1 --clip-on-cpu
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Vega 11 Graphics (RADV RAVEN) (radv) | uma: 1 | fp16: 1 | warp size: 64 | shared memory: 65536 | matrix cores: none
[INFO ] stable-diffusion.cpp:197  - loading model from './cyberrealisticPony_v7.safetensors'
[INFO ] model.cpp:908  - load ./cyberrealisticPony_v7.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:244  - Version: SDXL 
[INFO ] stable-diffusion.cpp:277  - Weight type:                 f16
[INFO ] stable-diffusion.cpp:278  - Conditioner weight type:     f16
[INFO ] stable-diffusion.cpp:279  - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:280  - VAE weight type:             f32
[WARN ] stable-diffusion.cpp:287  - !!!It looks like you are using SDXL model. If you find that the generated images are completely black, try specifying SDXL VAE FP16 Fix with the --vae parameter. You can find it here: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors
[INFO ] stable-diffusion.cpp:324  - CLIP: Using CPU backend
  |==================================================| 2641/2641 - 500.00it/s
[INFO ] stable-diffusion.cpp:503  - total params memory size = 6751.89MB (VRAM 4994.54MB, RAM 1757.36MB): clip 1757.36MB(RAM), unet 4900.07MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
[INFO ] stable-diffusion.cpp:522  - loading model from './cyberrealisticPony_v7.safetensors' completed, taking 6.45s
[INFO ] stable-diffusion.cpp:556  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:690  - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1246 - apply_loras completed, taking 0.00s
[INFO ] model.cpp:908  - load ./CyberRealisticPony_POSV1.safetensors using safetensors format
  |==================================================| 2/2 - 0.00it/s
/tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: /tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: /tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed/tmp/sdcpp/ggml/src/ggml-cpu/ggml-cpu.c:9684: GGML_ASSERT(i01 >= 0 && i01 < ne01) failedGGML_ASSERT(i01 >= 0 && i01 < ne01) failed

GGML_ASSERT(i01 >= 0 && i01 < ne01) failed

[New LWP 82838]
[New LWP 82845]
[New LWP 82846]
[New LWP 82847]
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
warning: process 82837 is already traced by process 82848
ptrace: Operation not permitted.
No stack.
The program is not being run.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f85d30f2c17 in __GI___wait4 (pid=82848, stat_loc=0x7ffed6595704, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007f85d30f2c17 in __GI___wait4 (pid=82848, stat_loc=0x7ffed6595704, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
30      in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x0000562553811811 in ggml_abort ()
#2  0x000056255376434c in ggml_compute_forward_get_rows ()
#3  0x000056255378588d in ggml_graph_compute_thread.isra ()
#4  0x00007f85d35b60b6 in GOMP_parallel () from /lib/x86_64-linux-gnu/libgomp.so.1
#5  0x00005625537883af in ggml_graph_compute ()
#6  0x00005625537887e2 in ggml_backend_cpu_graph_compute(ggml_backend*, ggml_cgraph*) ()
#7  0x0000562553827edc in ggml_backend_graph_compute ()
#8  0x0000562553693469 in GGMLRunner::compute(std::function<ggml_cgraph* ()>, int, bool, ggml_tensor**, ggml_context*) ()
#9  0x0000562553697bab in FrozenCLIPEmbedderWithCustomWords::get_learned_condition_common(ggml_context*, int, std::vector<int, std::allocator<int> >&, std::vector<float, std::allocator<float> >&, int, int, int, int, bool) ()
#10 0x00005625537132dc in FrozenCLIPEmbedderWithCustomWords::get_learned_condition(ggml_context*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int, int, int, bool) ()
#11 0x0000562553686818 in generate_image(sd_ctx_t*, ggml_context*, ggml_tensor*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, float, float, float, int, int, sample_method_t, std::vector<float, std::allocator<float> > const&, long, int, sd_image_t const*, float, float, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<int, std::allocator<int> >, float, float, float, ggml_tensor*) ()
#12 0x0000562553689481 in txt2img ()
#13 0x00005625535f5173 in main ()
[Inferior 1 (process 82837) detached]
Aborted (core dumped)
```

Disabling the assertion avoids the crash, so my guess is that the CPU backend is catching a non-critical error that the Vulkan backend ignores.
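
For reference, the failing check is the row bounds test in ggml_compute_forward_get_rows (`i01 >= 0 && i01 < ne01`). Below is a minimal sketch in plain C of what that kind of check guards against. `get_rows_checked` and its parameters are hypothetical names used only for illustration; this is not ggml's code. Treating the embedding's token ids as the out-of-range indices is only an assumption on my part.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, NOT the ggml implementation.
 * Copies the rows of `table` (n_rows x n_cols, row-major) selected by `ids`
 * into `out`. Returns -1 on success, or the position in `ids` of the first
 * index that falls outside [0, n_rows).
 */
static int64_t get_rows_checked(const float *table, int64_t n_rows, int64_t n_cols,
                                const int32_t *ids, int64_t n_ids, float *out) {
    for (int64_t i = 0; i < n_ids; i++) {
        const int64_t row = ids[i];
        /* Same condition as the failing assertion: row >= 0 && row < n_rows. */
        if (row < 0 || row >= n_rows) {
            return i; /* without this check, the copy below would read outside the table */
        }
        memcpy(out + i * n_cols, table + row * n_cols, (size_t)n_cols * sizeof(float));
    }
    return -1;
}
```

With a 2-row table, ids {1, 0} are copied normally, while ids {0, 2} stop at position 1, which is the kind of situation the assertion reports.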

@stduhpf, could this be related to your SDXL embeddings fix?
