I've fixed several bugs and have been testing the archon LoRA backend; please refer to this PR: https://github.com/inclusionAI/AReaL/pull/1015
For a test case with qwen 1.5b distill on the dapo-math-17k dataset, see: https://github.com/inclusionAI/AReaL/pull/1015#issuecomment-4063437358