-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Implement Gpt oss #3129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement Gpt oss #3129
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this! 👏
Generally this looks good to me, but there are a couple things I'd like to address.
I'll also ask @EricLBuehler to have a look, since he has been working on an implementation which includes mxfp4 support 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this binary 👍
| self.sinks | ||
| .reshape((1, self.num_heads, 1, 1))? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this reshape could be done just once in Attention::new, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really agree that this proves correctness. Best to remove this as well.
If you want to add some tests you could add something to gpt_oss.rs where you manually create a config and varbuilder and verify (for example) that the correct expert is chosen.
|
I wanted to try this but couldn't even compile the example, and on the surface it seems to be just syntax errors, but trying to address the syntax errors myself, it seems like fundamentally it hasn't been completely implemented? Doesn't seem to be a hardware compatibility issue either, Rust straight up cannot compile this. |
The dangers of vibe coding. Haven't tested it myself so thank you for taking initiative. If you want to try and complete the implementation you are very welcome to do so @maximizemaxwell :) |
Judging by the commits, that seems to be the case indeed :) I initially found this as I've started my own GPT-OSS implementation with Candle too and I've already ended up with way more code than this and got curious how there is so little here! |
Implement GPT OSS on candle