Hi there,
First of all, thank you for the excellent project!
LLaVA-OneVision-1.5 is a multimodal model that combines the RICE-ViT vision encoder with the Qwen3 LLM. The project has open-sourced all of its training code, datasets, and model weights, which makes it a great resource. Its reinforcement learning (RL) code is built on AReaL (see here), but on an early version of it.
Would it be possible to integrate LLaVA-OneVision-1.5 into the official AReaL implementation? That would greatly benefit the community by providing an up-to-date and unified RL training pipeline for this model.
Thanks for your consideration!