# 🚀 OptimRL: Group Relative Policy Optimization
OptimRL is a **high-performance reinforcement learning library** that introduces a groundbreaking algorithm, **Group Relative Policy Optimization (GRPO)**. Designed to streamline the training of RL agents, GRPO eliminates the need for a critic network while ensuring robust performance with **group-based advantage estimation** and **KL regularization**. Whether you're building an AI to play games, optimize logistics, or manage resources, OptimRL provides **state-of-the-art efficiency and stability**.
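The core ideas named above (critic-free, group-based advantage estimation plus a KL penalty against a reference policy) can be sketched in a few lines. This is an illustrative sketch only; `group_relative_advantages` and `kl_penalty` are hypothetical names, not OptimRL's actual API:

```python
# Illustrative sketch of GRPO's central trick (hypothetical helpers,
# not OptimRL's real API): instead of a learned critic, each reward is
# scored relative to the other rewards sampled in its group, and a KL
# term keeps the updated policy close to a reference policy.
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and std."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def kl_penalty(logp_new, logp_ref):
    """Crude per-sample KL estimate between new and reference policies."""
    return float(np.mean(np.asarray(logp_new) - np.asarray(logp_ref)))

# A group of 4 sampled completions and their rewards:
adv = group_relative_advantages([1.0, 2.0, 3.0, 4.0])
# Advantages are centered within the group, so they sum to ~0;
# higher-reward samples get positive advantage, lower get negative.
```

Because the baseline is the group's own mean reward, no value network has to be trained or stored, which is where the memory and stability gains come from.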

---
## 🏅 Badges

|
## 🌟 Features

|
If you use OptimRL in your research, please cite:

```bibtex
@misc{optimrl,
  title={OptimRL: Group Relative Policy Optimization},
  author={Your Name},
  year={2024},
  url={https://github.com/subaashnair/optimrl}
}
```

---