The OpenThoughts-Agent project has been running RL training with SkyRL-train and Harbor for a while.
The Harbor + SkyRL-train integration lets users run RL training on terminal-use-style tasks while focusing only on the data.
See the initial release: https://www.openthoughts.ai/blog/agent
For that project, all of the code lived in a fork, so that we could make project-specific hot fixes: https://github.com/mlfoundations/SkyRL
This issue tracks upstreaming those changes to the main branch of SkyRL. Once upstreamed, the integration should be much more robust than what is currently on the main branch.