OpenKimi is a research project that implements the reinforcement learning (RL) algorithms and efficient rollout system used in Kimi-K2 and Kimi-K1.5 by @MoonshotAI. Motivated by the strong performance of the Kimi-series models, the project offers a theoretical treatment of Policy Mirror Descent (PMD), which differs from other policy-gradient methods such as GRPO, along with practical training recipes that deliver strong performance and time efficiency across a range of downstream tasks. OpenKimi supports both algorithmic exploration for RL fine-tuning and systems research on accelerating rollout in asynchronous RL.
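For background, a PMD step maximizes the expected value under a KL proximity term to the current policy and admits a closed-form solution whose normalizer is a log-partition function; the paper cited at the end of this README studies how approximating that log-partition term induces implicit regularization. The following is a textbook-style sketch in our own notation (η is the step size; this is not necessarily the exact objective implemented in OpenKimi):

```latex
% Standard PMD step (background sketch; notation ours, not necessarily
% OpenKimi's exact variant): at iteration k, solve a KL-regularized
% policy improvement problem,
\[
  \pi_{k+1}(\cdot \mid s)
  = \arg\max_{\pi}\;
    \mathbb{E}_{a \sim \pi}\!\bigl[\, Q^{\pi_k}(s, a) \,\bigr]
    \;-\; \tfrac{1}{\eta}\,\mathrm{KL}\!\bigl(\pi \,\big\|\, \pi_k(\cdot \mid s)\bigr),
\]
% whose closed-form solution exponentially tilts the current policy and
% normalizes by the partition term Z_k(s):
\[
  \pi_{k+1}(a \mid s)
  = \frac{\pi_k(a \mid s)\,\exp\!\bigl(\eta\, Q^{\pi_k}(s, a)\bigr)}{Z_k(s)},
  \qquad
  Z_k(s) = \sum_{a'} \pi_k(a' \mid s)\,\exp\!\bigl(\eta\, Q^{\pi_k}(s, a')\bigr).
\]
```

In LLM post-training, the sum defining $Z_k(s)$ ranges over the response space and cannot be computed exactly, so it is approximated from samples; the implicit-regularization analysis in the cited paper concerns precisely this approximation.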
To get started, clone the repository with submodules and install verl in editable mode:

```bash
git clone --recurse-submodules https://github.com/horizon-rl/OpenKimi.git
cd OpenKimi/verl
pip install -e .
```

We provide example training scripts for both the FSDP backend (for smaller models) and the Megatron backend (for larger models and MoE). More details are available in openkimi/pmd/README.md.
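As an optional sanity check (this assumes nothing beyond a successful editable install), verify that verl imports:

```bash
# Optional: confirm the editable verl install is importable and where it resolves from.
python -c "import verl; print('verl imported from', verl.__file__)"
```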
For example, to train Qwen2.5-7B with PMD on the DAPO-17k math data:

```bash
bash examples/math/run_pmd_dapo17k_qwen25-7b.sh
```
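The script name above comes from the repository; to pin a run to specific GPUs, the standard CUDA environment variable can be set at launch (a usage sketch; the GPU indices below are placeholders):

```bash
# Hypothetical usage: restrict the example run to the first four GPUs.
CUDA_VISIBLE_DEVICES=0,1,2,3 bash examples/math/run_pmd_dapo17k_qwen25-7b.sh
```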
We are actively developing additional features and training recipes:

System Enhancements
- Hybrid Partial Rollout
- Muon Optimizer
Recipes
- Advanced Mismatch Correction
- Tool-Integrated Reasoning
- Kimi-K2 Agentic RL Training
Contributions are welcome! If you have suggestions or feedback on new features and recipes, feel free to submit an issue or pull request. Before submitting, please install the pre-commit hooks and run them across the codebase:

```bash
pre-commit install
pre-commit run --all-files --show-diff-on-failure --color=always
```

If you find OpenKimi useful in your research, please cite:

```bibtex
@misc{xu2026pmdkimi,
      title={Approximation of Log-Partition Function in Policy Mirror Descent Induces Implicit Regularization for LLM Post-Training},
      author={Zhenghao Xu and Qin Lu and Changlong Yu and Tuo Zhao},
      year={2026},
      eprint={2602.05933},
      archivePrefix={arXiv},
      primaryClass={cs.ML},
}
```