Description
Currently, stable-diffusion.cpp built with OpenCL support crashes on startup when using non-Qualcomm hardware. The error indicates the backend is "not optimized" for the detected GPU type (e.g., Intel HD Graphics Family) and terminates. While the current OpenCL kernels may be highly tuned for Adreno, the hard-coded requirement prevents users with other OpenCL-compliant integrated GPUs from utilizing their hardware for acceleration.
Many users rely on integrated GPUs (Intel UHD/Iris, AMD Radeon Vega/780M) that support OpenCL 1.2, 2.1, or 3.0. Even if the current kernels are not tuned for these architectures, running them on the GPU is likely to be noticeably faster than CPU-only inference, and users should at least have the option to try it.
Suggested Changes:
Relax Device Validation: Change the initialization logic from a whitelist (Adreno-only) to a capability check (checking for required OpenCL extensions/versions).
Generic Fallback: Provide a "Generic GPU" path that allows the code to run on any device of type CL_DEVICE_TYPE_GPU, using conservative defaults where hardware-specific tuning parameters (such as work-group sizes) have not yet been profiled.
Informational Warning: Instead of crashing, display a warning: "Warning: GPU type not officially optimized; performance may vary."
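To make the three suggestions concrete, here is a minimal sketch of how the selection logic could look. It is not the project's actual code: `DeviceInfo`, `Selection`, `select_path`, the `cl_khr_fp16` requirement, the OpenCL 1.2 minimum, and the work-group sizes are all assumptions chosen for illustration. In a real backend the `DeviceInfo` fields would presumably be filled from `clGetDeviceInfo` (`CL_DEVICE_NAME`, `CL_DEVICE_VERSION`, `CL_DEVICE_EXTENSIONS`, `CL_DEVICE_TYPE`); the sketch takes plain strings so it compiles without OpenCL headers.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical device description (not a real stable-diffusion.cpp type).
struct DeviceInfo {
    std::string name;        // e.g. "Intel(R) HD Graphics"
    std::string version;     // e.g. "OpenCL 3.0 NEO"
    std::string extensions;  // space-separated extension names
    bool        is_gpu;      // device type includes CL_DEVICE_TYPE_GPU
};

enum class Path { Unsupported, Generic, Tuned };

// Hypothetical result: which code path to use and a work-group size.
struct Selection {
    Path path;
    int  work_group_size;  // placeholder values, not profiled
};

// Parse "OpenCL <major>.<minor> ..." into major * 100 + minor; -1 on failure.
static int parse_cl_version(const std::string & v) {
    int major = 0, minor = 0;
    if (std::sscanf(v.c_str(), "OpenCL %d.%d", &major, &minor) != 2) {
        return -1;
    }
    return major * 100 + minor;
}

// Whole-token match, so "cl_khr_fp16" does not match "cl_khr_fp16_extra".
static bool has_extension(const std::string & list, const std::string & ext) {
    std::istringstream in(list);
    std::string tok;
    while (in >> tok) {
        if (tok == ext) return true;
    }
    return false;
}

static Selection select_path(const DeviceInfo & d) {
    // 1. Capability check instead of a vendor whitelist.
    //    Assumed requirements: GPU device, OpenCL >= 1.2, cl_khr_fp16.
    if (!d.is_gpu ||
        parse_cl_version(d.version) < 102 ||
        !has_extension(d.extensions, "cl_khr_fp16")) {
        return Selection{Path::Unsupported, 0};
    }

    // 2. Keep the tuned path for hardware that has been profiled.
    if (d.name.find("Adreno") != std::string::npos) {
        return Selection{Path::Tuned, 128};
    }

    // 3. Generic fallback: warn and continue with conservative defaults.
    std::fprintf(stderr,
                 "Warning: GPU type not officially optimized; "
                 "performance may vary. (%s)\n", d.name.c_str());
    return Selection{Path::Generic, 64};
}
```

With this shape, a device that fails the capability check is still rejected (ideally with a message naming the missing capability), while an Intel or AMD GPU that passes it takes the generic path with a warning instead of terminating the program.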