PhD at UChicago, RL for language models, especially Computer Use Agents
Pinned
- Gen-Verse/OpenClaw-RL (Public)
  OpenClaw-RL: Personalize openclaw simply by talking to it
- Gen-Verse/dLLM-RL (Public)
  [ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
- Gen-Verse/Open-AgentRL (Public)
  An open-source RL framework (DemyAgent & RLAnything) for training LLM-based agents, supporting GRPO, PPO, RLHF, multi-turn reasoning, tool use, and distributed training.
- Gen-Verse/CURE (Public)
  [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
