-
Notifications
You must be signed in to change notification settings - Fork 419
Migrate Gpt-OSS to NNX #2441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate Gpt-OSS to NNX #2441
Conversation
b92fe79 to
cd60498
Compare
cd60498 to
4192792
Compare
|
I see you added gemini-review flag, and it didn't work. We have a rule to check if this branch is forked version here |
I wasn’t aware of that rule, thank you for pointing it out! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hsuan-lun-chiang. Could you please run train (you already have this), decode, and then maxengine/jetstream (with profiles collected for maxengine/jetstream)? Similar to #2088
cddd713 to
d200e51
Compare
Additional TestsEnvironmentMachine Type: TPU TrainExecuted Command: Logs: DecodeExecuted Command: Logs: MaxEngine / JetStreamStep 1: Launch MaxEngine:Step 2: Launch JetStreamStep 3: Output CollectionLogs: |
Thanks @ecnal-cienet for running these tests. The Maxengine/Jetstream profiles looks a bit off. Could you try starting the profiles after the Jetstream warmup requests have finished and the real requests have started? That is the portion we want to profile. Let me know if you have already done this and we can discuss further |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM. Left a comment on the profile collection
|
Hi @bvandermoon, I re-tested JetStream profiling and already updated the xprof links of my previous comment. Thanks. If there is still issues, please let me know. Thanks. |
Awesome, thank you @ecnal-cienet. The profiles look good to me now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you @hsuan-lun-chiang and @ecnal-cienet. @cgarciae could you please take a look as well?
Description
Migrate Gpt-OSS implementation from Linen to NNX.
Tests
Ran train command to train gpt-oss for 20 steps:
Logs:
Logs - Before Migration
Logs - After Migration
Checklist
Before submitting this PR, please make sure (put X in square brackets):
gemini-reviewlabel.