Support [vLLM-FSDP off-policy importance sampling correction](https://fengyao.notion.site/off-policy-rl).

<img width="1718" height="764" alt="Image" src="https://github.com/user-attachments/assets/3c5381ca-2d31-4fd2-9fc3-0807540d0228" />
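
For context, here is a minimal sketch of what such a correction can look like, assuming the truncated importance sampling form. The sampler (vLLM) and the trainer (FSDP) can assign slightly different probabilities to the same token, so each token's loss term is reweighted by the ratio of trainer probability to sampler probability, capped at a constant. The function names, the `clip_max` parameter, and its default of 2.0 are hypothetical and are not taken from this change. A real implementation would likely operate on tensors and treat the weights as constants with respect to gradients.

```python
import math


def truncated_is_weights(train_logprobs, rollout_logprobs, clip_max=2.0):
    """Per-token weights min(exp(train - rollout), clip_max).

    train_logprobs: log-probs of the sampled tokens under the training
        (FSDP) policy.
    rollout_logprobs: log-probs of the same tokens as reported by the
        rollout (vLLM) engine.
    clip_max: hypothetical cap on the ratio, used to bound variance.
    """
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return [
        min(math.exp(t - r), clip_max)
        for t, r in zip(train_logprobs, rollout_logprobs)
    ]


def corrected_pg_loss(train_logprobs, rollout_logprobs, advantages, clip_max=2.0):
    """Mean policy-gradient loss with each token term scaled by its weight.

    In an autodiff framework the weights would be detached so that only
    the log-prob term carries gradient; plain floats have no gradient,
    so this sketch only shows the arithmetic.
    """
    if len(advantages) != len(train_logprobs):
        raise ValueError("advantages must match the number of tokens")
    weights = truncated_is_weights(train_logprobs, rollout_logprobs, clip_max)
    terms = [
        -w * a * lp
        for w, a, lp in zip(weights, advantages, train_logprobs)
    ]
    return sum(terms) / len(terms)
```

When the two engines agree exactly, every weight is 1 and the loss reduces to the uncorrected policy-gradient loss. The cap only takes effect on tokens where the trainer assigns much higher probability than the sampler did.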