Description & Motivation
I'm using Lightning for a project based on the PPO algorithm.
Each optimisation step requires both inference and training, so it is slow with LLMs.
Recently, DeepSpeed introduced a HybridEngine for this scenario, with the benefits quoted below:
The Hybrid-Engine is capable of seamlessly transitioning between inference and training modes within RLHF, allowing it to leverage various optimizations from DeepSpeed-Inference such as tensor-parallelism and high-performance transformer kernels for generation...
Pitch
Do you have any plans to add support for this new hybrid engine?
Alternatives
Pointers to tutorials on how to add custom engines/strategies would also be much appreciated.
I can create a PR afterwards if my project allows.
Additional context
Link to DeepSpeedHybridEngine
cc @Borda @awaelchli