Open
Labels
bug (Something isn't working)
Description
Package
Stable Diffusion WebUI Forge - Neo
When did the issue occur?
Running the Package
What GPU / hardware type are you using?
NVIDIA GeForce RTX 5070 Ti
What happened?
Forge Neo with SageAttention installed does not actually use it. When generating an image, Forge Neo throws an AssertionError ("SM89 kernel is not available") and the generation completes without SageAttention. A Forge Neo installation outside of Stability Matrix (SM) works fine. ComfyUI inside SM with SageAttention also works fine.
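The behavior described above looks like a try/fallback around the attention call: the SageAttention path raises, the error is logged, and generation continues on another backend. Below is a minimal sketch of that pattern, assuming nothing about Forge Neo's actual code. `attention_with_fallback`, `fake_sage`, and `fake_pytorch` are hypothetical names used only for illustration.

```python
import traceback
from typing import Any, Callable


def attention_with_fallback(
    primary: Callable[..., Any],
    fallback: Callable[..., Any],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Try the primary attention backend; on any error, log it and use the fallback."""
    try:
        return primary(*args, **kwargs)
    except Exception as exc:  # e.g. an AssertionError raised when a kernel is missing
        print(f"{primary.__name__}: {type(exc).__name__}")
        traceback.print_exc()
        return fallback(*args, **kwargs)


# Hypothetical stand-ins for the real backends (not Forge Neo or SageAttention code).
def fake_sage(q, k, v):
    raise AssertionError("SM89 kernel is not available.")


def fake_pytorch(q, k, v):
    return "pytorch-attention-result"


result = attention_with_fallback(fake_sage, fake_pytorch, 1, 2, 3)
```

If something like this is what happens, the run still finishes, but on the fallback backend rather than SageAttention, which would be consistent with the progress bar reaching 100% after the traceback.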
Console output
Python 3.11.13 (main, Jul 23 2025, 00:29:09) [MSC v.1944 64 bit (AMD64)]
Version: neo
Installing triton
Installing sageattention
Installing gradio
Installing requirements
Installing Legacy Preprocessor Requirement: handrefinerportable
Installing Legacy Preprocessor Requirement: depth_anything
Installing Legacy Preprocessor Requirement: depth_anything_v2
Launching Web UI with arguments: --sage --pin-shared-memory --cuda-malloc --cuda-stream --skip-python-version-check --gradio-allowed-path 'D:\AI\Data\Images'
Using cudaMallocAsync backend.
Total VRAM 16303 MB, total RAM 97849 MB
pytorch version: 2.8.0+cu128
Set vram state to: NORMAL_VRAM
Always pin shared GPU memory
Device: cuda:0 NVIDIA GeForce RTX 5070 Ti : cudaMallocAsync
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: True
Using SageAttention 2
Using PyTorch Attention for VAE
=======================================================================================
You are running torch 2.8.0+cu128, which is really outdated.
To install the latest version, run with commandline flag --reinstall-torch.
=======================================================================================
Use --skip-version-check commandline argument to disable the version check(s).
ControlNet preprocessor location: D:\AI\Data\Packages\forge-neo\models\ControlNetPreprocessor
[ControlNet] - INFO - ControlNet UI callback registered.
You do not have any model!
Model selected: {'checkpoint_info': None, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 58.3s (prepare environment: 37.6s, launcher: 0.6s, forge init: 11.6s, shared init: 0.4s, misc. imports: 4.7s, load scripts: 1.8s, create ui: 1.1s, gradio launch: 0.5s).
Environment vars changed: {'stream': False, 'inference_memory': 1619.2, 'pin_shared_memory': False}
Model selected: {'checkpoint_info': {'filename': 'D:\\AI\\Data\\Packages\\forge-neo\\models\\Stable-diffusion\\sd\\bananaSplitzXL_vee9PointOh.safetensors', 'hash': 'bab4cf56'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': 'D:\\AI\\Data\\Packages\\forge-neo\\models\\Stable-diffusion\\sd\\bananaSplitzXL_vee9PointOh.safetensors', 'hash': 'bab4cf56'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': 'D:\\AI\\Data\\Packages\\forge-neo\\models\\Stable-diffusion\\sd\\bananaSplitzXL_vee9PointOh.safetensors', 'hash': 'bab4cf56'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
[Unload] Trying to free all memory for cpu with 0 models keep loaded ... Done.
StateDict Keys: {'unet': 1680, 'vae': 248, 'text_encoder': 197, 'text_encoder_2': 518, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float16, 'computation_dtype': torch.float16}
Calculating sha256 for D:\AI\Data\Packages\forge-neo\models\Stable-diffusion\sd\bananaSplitzXL_vee9PointOh.safetensors: ad4793e2ef23f9d50ff64b5b18c79bb0e003026301715d1cdc1f2524b78bea6a
Model loaded in 4.9s (unload existing model: 0.2s, forge model load: 0.9s, calculate hash: 3.9s).
[Unload] Trying to free 4541.61 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 14997.00 MB, Model Require: 1752.68 MB, Previously Loaded: 0.00 MB, Inference Require: 2438.40 MB, Remaining: 10805.92 MB, Moving model(s) has taken 0.41 seconds
[Unload] Trying to free 2438.40 MB for cuda:0 with 1 models keep loaded ... Done.
[Unload] Trying to free 10470.39 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: KModel, Free GPU: 13177.42 MB, Model Require: 4897.06 MB, Previously Loaded: 0.00 MB, Inference Require: 2438.40 MB, Remaining: 5841.96 MB, Moving model(s) has taken 1.68 seconds
0%| | 0/24 [00:00<?, ?it/s]
attention_sage: AssertionError
Traceback (most recent call last):
  File "D:\AI\Data\Packages\forge-neo\backend\attention.py", line 377, in attention_sage
    out = sageattn(q, k, v, attn_mask=mask, is_causal=False, tensor_layout=tensor_layout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Data\Packages\forge-neo\venv\Lib\site-packages\sageattention\core.py", line 159, in sageattn
    return sageattn_qk_int8_pv_fp8_cuda(q, k, v, tensor_layout=tensor_layout, is_causal=is_causal, qk_quant_gran="per_warp", sm_scale=sm_scale, return_lse=return_lse, pv_accum_dtype=pv_accum_dtype)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Data\Packages\forge-neo\venv\Lib\site-packages\sageattention\core.py", line 705, in sageattn_qk_int8_pv_fp8_cuda
    assert SM89_ENABLED, "SM89 kernel is not available. Make sure you GPUs with compute capability 8.9."
           ^^^^^^^^^^^^
AssertionError: SM89 kernel is not available. Make sure you GPUs with compute capability 8.9.
100%|██████████| 24/24 [00:05<00:00, 4.37it/s]
[Unload] Trying to free 5194.33 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 8263.99 MB, Model Require: 159.56 MB, Previously Loaded: 0.00 MB, Inference Require: 2438.40 MB, Remaining: 5666.03 MB, Moving model(s) has taken 0.02 seconds
Total progress: 100%|██████████| 24/24 [00:05<00:00, 4.35it/s]
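To narrow this down, a small check run with the Python interpreter from the package's venv might help. The sketch below reports the PyTorch build, the GPU's compute capability, and the `SM89_ENABLED` flag that the traceback shows being asserted in `sageattention\core.py`. `collect_sage_diagnostics` is a hypothetical helper, not part of Forge Neo or SageAttention, and the flag is read with `getattr` on the assumption that other SageAttention builds may not define it.

```python
import importlib


def collect_sage_diagnostics() -> dict:
    """Gather facts relevant to the 'SM89 kernel is not available' assertion."""
    report = {"torch": None, "cuda": None, "capability": None, "SM89_ENABLED": None}

    try:
        torch = importlib.import_module("torch")
        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda
        if torch.cuda.is_available():
            # (major, minor) compute capability of the current CUDA device
            report["capability"] = torch.cuda.get_device_capability()
    except Exception as exc:
        report["torch"] = f"unavailable: {type(exc).__name__}"

    try:
        core = importlib.import_module("sageattention.core")
        # SM89_ENABLED is the module-level flag named in the traceback
        report["SM89_ENABLED"] = getattr(core, "SM89_ENABLED", None)
    except Exception as exc:
        report["SM89_ENABLED"] = f"unavailable: {type(exc).__name__}"

    return report


if __name__ == "__main__":
    for key, value in collect_sage_diagnostics().items():
        print(f"{key}: {value}")
```

Comparing this output between the SM-managed install and the standalone Forge Neo install (which works) should show whether the two venvs ended up with different SageAttention builds.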
Version
2.15.5
What Operating System are you using?
Windows