Change training step to a scalar tensor so it works with CUDA graphs #842
Replies: 1 comment
Hi @jasooney23 - sorry for the delayed reply, I didn't realize there were discussions being opened! We would welcome this as a PR. In fact, small, targeted PRs like this are easier to review and easier to get accepted. There is no minimum revision size :). Good idea to use CUDA graphs. Yes, your idea of using a `torch.Tensor` can work. Feel free to open a PR, even in draft form, if/when you're ready. Tag me, I'd be happy to take a look. Thanks!
I was experimenting with the custom aggregator in the Turbulent Channel example and wanted to enable CUDA graphs for faster execution. However, `step` currently gets passed as a generic `int` from `Trainer._cuda_graph_training_step`, which means that when the CUDA graph gets captured, the step it was captured at is the step the graph will always execute with.

I.e., if my aggregator's `forward` takes `step` as an argument and the CUDA graph is captured at `step = 20`, then the aggregator will continue to execute with `step = 20`.

My simple fix is just to pass `step` as a Tensor, but I'm not sure if I should submit the change myself or just let someone bundle it as part of a bigger revision? (Sorry, it's my first time participating in open source stuff!)

Thanks 😎
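
For anyone following along, here is a rough pure-Python sketch of why a plain `int` gets frozen while an in-place-updated buffer does not. It is only a toy model of capture-and-replay, not the PyTorch or CUDA graph API: `ToyGraph`, `aggregator`, and `step_buf` are hypothetical names made up for illustration. In real code, the buffer would presumably be a tensor allocated before capture and updated in place (e.g. with `Tensor.fill_`) before each replay.

```python
# Toy model of "capture once, replay many times". NOT the PyTorch API;
# every name here is hypothetical and exists only for illustration.

class ToyGraph:
    """Records zero-argument operations once, then replays them verbatim."""

    def __init__(self):
        self.ops = []

    def capture(self, op):
        self.ops.append(op)

    def replay(self):
        return [op() for op in self.ops]


def aggregator(step_value):
    # Hypothetical stand-in for an aggregator whose output depends on step.
    return step_value * 2


# Case 1: step is a plain int, so its value is copied at capture time.
step = 20
baked = ToyGraph()
baked.capture(lambda v=step: aggregator(v))  # v is fixed to 20 from now on

# Case 2: step lives in a mutable buffer that the recorded op reads on replay.
step_buf = [20]
live = ToyGraph()
live.capture(lambda: aggregator(step_buf[0]))

baked_results = []
live_results = []
for step in (20, 21, 22):
    step_buf[0] = step  # in-place update, analogous to filling a static tensor
    baked_results.append(baked.replay()[0])
    live_results.append(live.replay()[0])

print(baked_results)  # the int version never sees the new step
print(live_results)   # the buffer version follows the current step
```

The design point is that the replayed work only ever reads from fixed storage, so the storage has to be updated in place rather than rebound to a new object.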