New Scheduler: add Euler Ancestral Scheduler to StableDiffusionPipeline #636
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
awesome!!!!
This is awesome! I tried running it and got some numpy errors like this
Do you know with which version of numpy this is working?
Hi @nielsrolf, Thank you for the feedback.
I fixed a bug in my code related to mixing a numpy array and a tensor in the same operation; the issue arises only when the device is set to cuda. See if that fixes it for you. I will refactor the code soon according to the new redesign in #637 and #719
Super nice PR 😍 @anton-l @patil-suraj does any of you have time to dive into this? Otherwise happy to take a look
Really cool @AbdullahAlfaraj, so nice to see someone working on this! I've also been in the process of adding Euler ancestral, regular Euler, and the other DPM solvers, and was curious about something in your implementation - hoping you can shed some light. I could be wrong, but it seems that you omit the CFGDenoiser for the preconditioning of the denoiser model, and instead substitute a different formulation of a discrete noise schedule that you use to precondition with. Wondering why you made this decision? And are there any resources/papers on this we could check out? It seems to deviate a bit from k-diffusion. Thanks!
Let us know if you need help @AbdullahAlfaraj :-)
Hi, @tonetechnician
I couldn't find the Euler Ancestral algorithm described in the https://arxiv.org/abs/2206.00364 paper, so I decided to adapt the implementation from the https://github.com/crowsonkb/k-diffusion repo. As for the CFGDenoiser, I checked the deforum stable diffusion implementation:

```python
class CFGDenoiser(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.inner_model = model

    def forward(self, x, sigma, uncond, cond, cond_scale):
        x_in = torch.cat([x] * 2)          # concat the latents
        sigma_in = torch.cat([sigma] * 2)  # concat the sigmas
        cond_in = torch.cat([uncond, cond])
        uncond, cond = self.inner_model(x_in, sigma_in, cond=cond_in).chunk(2)
        return uncond + (cond - uncond) * cond_scale  # classifier-free guidance
```

I'm not sure which part of this code I've missed. I'm mostly following the deforum/stable-diffusion repo; from what I understand, they wrap the diffusion model in a CFGDenoiser. I tried to follow the implementation very closely, though I could have messed something up. For what it's worth, when you run Stable Diffusion inference with the Euler Ancestral scheduler you get similar results in both Deforum and Diffusers.
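For reference, the core of the ancestral update in k-diffusion's formulation looks roughly like this (a numpy sketch of the idea, not the exact code in this PR; the real code operates on torch tensors):

```python
import numpy as np

def euler_ancestral_step(sample, denoised, sigma, sigma_next, rng=None):
    """One Euler-ancestral update in sigma space (sketch of k-diffusion's formulation)."""
    if rng is None:
        rng = np.random.default_rng()
    # split the transition into a deterministic part (sigma_down) and fresh noise (sigma_up)
    sigma_up = (sigma_next**2 * (sigma**2 - sigma_next**2) / sigma**2) ** 0.5
    sigma_down = (sigma_next**2 - sigma_up**2) ** 0.5
    # Euler step toward the model's denoised prediction
    d = (sample - denoised) / sigma          # derivative dx/dsigma
    sample = sample + d * (sigma_down - sigma)
    # "ancestral" part: re-inject Gaussian noise at level sigma_up
    return sample + rng.standard_normal(sample.shape) * sigma_up
```

The split into `sigma_down`/`sigma_up` is what makes the sampler ancestral: part of each step is deterministic and part is freshly sampled noise.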
that's awesome, if you make a PR I won't mind closing this one and continuing the development on your branch
Hi @patrickvonplaten
Thanks for the feedback @AbdullahAlfaraj! I think we're on the same page here now. I wasn't sure why the CFGDenoiser was commented out in your code, but after searching through crowson's and deforum's repos I saw where you were getting those values. I think we're on a similar path; I've just been integrating more closely with the k_diffusion library by wrapping their functions rather than pulling them out and adding them to the scheduler. I found that k_diffusion seemed to default to DPM solvers + the Karras paper's noise schedule. I went through a process last night trying out the different noise schedules in the paper you referenced, to no avail, which is likely due to my bad CFGDenoiser implementation. I do think there is room to provide an option for different noise schedule generation in diffusers down the line, though. I haven't reached a point where I'm happy to make a PR yet, so I will continue on my side and report back if I make some progress. Nice to be working in tandem with you on this, though, because we can discuss it a bit more
See Katherine's note on #277 (comment)
diffusers implements DDPM's method in https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_ddpm.py. What distinguishes the scheduler in this PR from that one?
FYI to the thread, my colleague just showed me that there is already a 0.3.0 (incorrectly labelled 0.5.1) implementation of diffusers that incorporates these k schedulers, done by @hlky. Check it out. Going to test it now
There are seemingly a few changes to make with the current diffusers scheduler mixins and new API, but I have started this process. After I get somewhere I'd be happy to make a PR with these changes.
@tonetechnician @hafriedlander had been using my implementation of k_euler and k_euler_a in stable-diffusion-grpcserver. These 2 were already producing 1:1 results. k_dpm_2, k_dpm_2_a and k_heun were not quite right because in diffusers the call to the unet is done in the pipeline, whereas these samplers do a second pass (this is why they take twice as long). k_dpm_2_a and k_heun are now producing 1:1 results; we still need to check what's going on with k_dpm_2. As soon as there is a complete implementation that produces 1:1 results we will submit a pull request. If anyone has any questions or wants to help, feel free to reach out to me or @hafriedlander. (Below is just some info on what is equivalent between crowsonkb's implementation and diffusers'.)
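One correspondence worth spelling out is how k-diffusion's sigmas relate to diffusers' beta schedule. A sketch, assuming Stable Diffusion's usual "scaled_linear" schedule endpoints (not code lifted from either repo):

```python
import numpy as np

# Sketch: deriving k-diffusion-style sigmas from a diffusers-style beta schedule.
# The endpoints below are the ones Stable Diffusion configs typically use.
num_train_timesteps = 1000
betas = np.linspace(0.00085**0.5, 0.012**0.5, num_train_timesteps) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)

# k-diffusion parameterizes the model in sigma space; the correspondence is:
sigmas = ((1.0 - alphas_cumprod) / alphas_cumprod) ** 0.5
```

The resulting sigmas run from roughly 0.03 up to the mid-teens, which matches the noise levels the k samplers iterate over.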
Thanks so much for the response @hlky!
This is just in the README; currently it's referencing the huggingface diffusers version badge: https://github.com/hlky/diffusers/blob/main/README.md?plain=1#L11

Really promising to hear the 1:1 output from your implementation and the original. Going to be playing with it a bit this evening. I'm happy to help port it to 0.5.1.
I think then I'll fork your repo and start making changes there. Happy to help where I can. Will also have a look at dpm_2 and was also considering implementing the Karras noise schedule for completeness.
@hafriedlander finished off the implementation earlier: k_euler, k_euler_a, k_dpm_2, k_dpm_2_a and k_heun are all producing 1:1 results
Amazing! Defs gonna try this. I've just merged 0.5.1 with your fork and testing it out to make sure everything works. Will incorporate these here and if all goes well, can make a PR to your repo
@tonetechnician @hlky's fork is pretty old now. My repo is based on Diffusers 0.4.2, so upgrading shouldn't be too painful. However I've backported 0.3.0-style scheduler support into my repo to continue to support these schedulers. It's probably the main blocker to raising a PR here - they're all still NumPy based and need specific handling. I'll probably raise a PR for discussion anyway, but updating the schedulers to be more 0.4.2-style would be helpful. I'm not quite sure how to handle scale_model_input yet.
Thanks for the info @hafriedlander.
Indeed, it seems alright I think. I've started porting your 0.4.2 versions to 0.5.1 but notice the output isn't quite correct. I need to dig a bit more into why exactly. I get output like this (prompt is "A dog smiling"). Might be to do with precision types or something like this.
I see! Yes, I've now converted the schedulers you've implemented to be torch specific and this seems to be working fine (minus the strange image output)
Awesome, I think this would be great. As for scale_model_input, it's not too bad. I've adopted the method used by other schedulers that don't scale input:
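The pattern I mean is just an identity pass-through, roughly like this (a sketch of the convention, not the exact diffusers code; the class name is illustrative):

```python
class MyScheduler:
    # Sketch: schedulers whose inputs need no scaling still expose scale_model_input
    # so pipelines can call it uniformly; here it simply returns the sample unchanged.
    # (Schedulers that do scale, e.g. the Euler ones, divide by (sigma**2 + 1) ** 0.5.)
    def scale_model_input(self, sample, timestep=None):
        return sample
```

That way the pipeline can unconditionally call `scheduler.scale_model_input(latents, t)` before each unet call, whether or not the scheduler actually scales.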
Please do link the PR here once you've got it. I'm currently working from @hlky's fork, but I've copied your implementation of the schedulers over there already. We will also need to work on the Img2Img pipeline too, as that is currently not working correctly. Busy on that at the moment. After that is sorted I'm planning to complete the upgrade to 0.5.1
If you're using my update of the schedulers, but not my unified_pipeline, it probably won't work. (That's my guess for why camo-dog.) There are quite a few special handling cases in the pipeline (the schedulers need to re-call the UNet, and pass custom sigma values)
Great to know, thanks for pointing it out. I was reading through both your and @hlky's implementations and see there are notable changes to accommodate the schedulers. But so awesome that you guys have got it going! I think porting to 0.5.1 isn't going to be difficult. It's quite late where I am now, so I'll pick it up first thing in the morning with a fresh mind and get it working.
Just a quick update, I have got the new schedulers incorporated into the other pipelines from @hlky's repo and have been testing, and so far so good! Now going to complete the update to 0.5.1. Once that's done I'll make a PR to the diffusers base repo and link it here. But looking good. Next thing will be getting tests going.
That's awesome @tonetechnician & @AbdullahAlfaraj
Also, feel free to already open a PR if you want :) Pretty excited to try it out!!
Thanks @patil-suraj! Had a small hiccup with some of the implementation, so I've made an issue #901 and am speaking with @patrickvonplaten. But I'll make a PR later today with where I've got to so far. I'm getting output in 0.5.1 that looks reasonable, but it needs more testing as there do seem to be some differences. As both @hlky and @hafriedlander have mentioned in other posts, I do see that for the DPM schedulers we need to pipe the model back into the scheduler, so for them to work with the base diffusers pipelines we'll probably need to make some adjustments to how this works, which can be discussed in the PR. Don't think it should be too difficult to add that in a non-destructive way
Hey @tonetechnician,
This is a great PR and we would very much like to help/unblock you if you have any questions!
@patrickvonplaten it's tricky to work around that constraint. The scheduler calculates a predicted state, and then re-calculates the noise prediction for the next step based on that intermediate state. The other way to achieve that could be the scheduler returning some sort of flag to the model that it's not done, and needs re-calling with some (opaque) state object, plus the result of a second call to the unet? Pseudo-code:
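A hypothetical sketch of that flag-based protocol (none of these names are the real diffusers API, and the half-step coefficients are placeholders just to make it runnable):

```python
from dataclasses import dataclass

@dataclass
class StepResult:
    sample: float          # latent to feed back into the unet
    done: bool             # False => the pipeline must call the unet and step() again
    state: object = None   # opaque scheduler state carried between the two calls

class TwoEvalScheduler:
    def step(self, model_output, t, sample, state=None):
        if state is None:
            # first unet evaluation: hand back an intermediate sample plus a flag
            intermediate = sample - 0.5 * model_output   # placeholder half-step
            return StepResult(intermediate, done=False, state=(sample, model_output))
        base, d1 = state
        # second unet evaluation: combine both noise predictions for the real update
        return StepResult(base - 0.5 * (d1 + model_output), done=True)

def pipeline_loop(unet, scheduler, sample, timesteps):
    # pipeline side: keep re-calling the unet until the scheduler says it's done
    for t in timesteps:
        result = scheduler.step(unet(sample, t), t, sample)
        while not result.done:
            result = scheduler.step(unet(result.sample, t), t, result.sample,
                                    state=result.state)
        sample = result.sample
    return sample
```

The pipeline stays scheduler-agnostic: single-evaluation schedulers would just always return `done=True`.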
Or I guess the scheduler could lie about the timesteps, and double them up to just cause itself to be called twice, remembering its own state internally. That might even be better? There's no branching logic in the scheduler (except for handling the last step, where there is no next step to re-call on).
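That doubled-timesteps variant could look roughly like this (illustrative names, a Heun-style two-evaluation update, and a toy ODE standing in for the unet; it omits the last-step special case mentioned above):

```python
class TwoPassScheduler:
    """Sketch: hide a two-model-evaluation (Heun-style) step behind doubled timesteps."""
    def __init__(self, timesteps, h):
        self.h = h
        # each timestep appears twice, so the pipeline calls model + step() twice per step
        self.timesteps = [t for t in timesteps for _ in (0, 1)]
        self._pending = None  # (base_sample, first_derivative) remembered internally

    def step(self, model_output, t, sample):
        if self._pending is None:
            # first call: Euler predictor; remember state, return the midpoint sample
            self._pending = (sample, model_output)
            return sample + self.h * model_output
        base, d1 = self._pending
        self._pending = None
        # second call: Heun corrector averaging the two derivatives
        return base + self.h * 0.5 * (d1 + model_output)

# toy check on dx/dt = -x, with f standing in for the unet
f = lambda x: -x
sched = TwoPassScheduler(timesteps=[0], h=0.1)
x = 1.0
for t in sched.timesteps:
    x = sched.step(f(x), t, x)
```

The appeal is that the existing pipeline loop needs no changes at all; the cost is the scheduler carrying hidden mutable state between calls.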
@patil-suraj do you want to take a look here?
On my todo-list, will get back to this as soon as possible.
Maybe cc @anton-l in case you find some time
Linking this to #944 -> we still need to put some thought into how to add the 2-model-eval-per-step schedulers. Hope to have updates on this soon.
Euler ancestral is now merged. Closing this one :-)
Description:
I've added the Euler Ancestral Scheduler to the Stable Diffusion Pipeline.
The code has been adapted from: https://github.com/crowsonkb/k-diffusion
The scheduler currently works; however, I need to clean it up and document it, which I will do soon.
Checklist: