It looks like Kontext is properly layered, and the ComfyUI implementation has some smart offloading going on in the background.
I gave it a shot at upscaling with bf16 Kontext to 2048x2656 and it was surprisingly fast. Unfortunately ComfyUI has no Mixture of Diffusers implementation for Flux, and then I saw that you saved the day again and implemented it in Forge. Thanks, btw.
Yet on ComfyUI I get 3.77 s/it, while on Forge I get 8.97 s/it with my 4090. Sage + Triton are installed for both, swap is set to shared, and I tried various GPU memory quotas - this was the best I got. Any idea why there is a roughly 2× difference in inference speed?
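For what it's worth, here is a minimal sketch of how I compare s/it between backends when the UIs report them differently. The `seconds_per_iteration` helper and the dummy step are hypothetical, not part of either codebase; the idea is just to time the same sampler-step callable with warmup excluded:

```python
import time

def seconds_per_iteration(step_fn, n_iters=10, warmup=2):
    """Return mean wall-clock seconds per call of step_fn, skipping warmup runs."""
    for _ in range(warmup):
        step_fn()  # warm up caches / kernel compilation before timing
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    return (time.perf_counter() - start) / n_iters

# Dummy workload standing in for one denoising step (assumption, for illustration):
dummy_step = lambda: sum(range(10_000))
print(f"{seconds_per_iteration(dummy_step):.6f} s/it")
```

Run the same harness against both pipelines and the ratio of the two results gives the real slowdown factor, independent of how each UI rounds its progress-bar numbers.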