+Please refer to our [installation guide](https://microsoft.github.io/agent-lightning/stable/tutorials/installation/) for more details.
+
+To start using Agent-lightning, check out our [documentation](https://microsoft.github.io/agent-lightning/) and [examples](./examples).
+
 ## ⚡ Articles

 - 8/11/2025 [Training AI Agents to Write and Self-correct SQL with Reinforcement Learning](https://medium.com/@yugez/training-ai-agents-to-write-and-self-correct-sql-with-reinforcement-learning-571ed31281ad) Medium.
@@ -39,14 +49,6 @@ Read more on our [documentation website](https://microsoft.github.io/agent-light
 - [DeepWerewolf](https://github.com/af-74413592/DeepWerewolf) — A case study of agent RL training for the Chinese Werewolf game built with AgentScope and Agent Lightning.
 - [AgentFlow](https://agentflow.stanford.edu/) — A modular multi-agent framework that combines planner, executor, verifier, and generator agents with the Flow-GRPO algorithm to tackle long-horizon, sparse-reward tasks.

-## ⚡ Installation
-
-```bash
-pip install agentlightning
-```
-
-Please refer to our [installation guide](https://microsoft.github.io/agent-lightning/stable/tutorials/installation/) for more details.
-
 ## ⚡ Architecture

 Agent Lightning keeps the moving parts to a minimum so you can focus on your idea, not the plumbing. Your agent continues to run as usual; you can still use any agent framework you like; you drop in the lightweight `agl.emit_xxx()` helper, or let the tracer collect every prompt, tool call, and reward. Those events become structured spans that flow into the LightningStore, a central hub that keeps tasks, resources, and traces in sync.
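The flow described in the Architecture paragraph above (an unchanged agent, a few emit calls, spans collected in a central store) can be pictured with a small, self-contained sketch. Everything here is an assumption made for illustration: `Span`, `ToyStore`, `emit`, and `run_agent` are hypothetical names, and none of them is the actual Agent Lightning API or the real `LightningStore` interface.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Span:
    """One structured event from a rollout (hypothetical shape)."""
    rollout_id: str
    kind: str  # e.g. "prompt", "tool_call", "reward"
    payload: Any
    timestamp: float = field(default_factory=time.time)


class ToyStore:
    """Illustrative stand-in for a central store; NOT the real LightningStore."""

    def __init__(self) -> None:
        self._spans: dict[str, list[Span]] = {}

    def add_span(self, span: Span) -> None:
        # Group spans by rollout so a trainer could later read them back in order.
        self._spans.setdefault(span.rollout_id, []).append(span)

    def spans_for(self, rollout_id: str) -> list[Span]:
        return list(self._spans.get(rollout_id, []))


def emit(store: ToyStore, rollout_id: str, kind: str, payload: Any) -> None:
    """Hypothetical helper playing the role of the `agl.emit_xxx()` family."""
    store.add_span(Span(rollout_id, kind, payload))


def run_agent(store: ToyStore, rollout_id: str, a: int, b: int, expected: int) -> int:
    """An ordinary agent function; the only additions are the emit calls."""
    emit(store, rollout_id, "prompt", f"What is {a} + {b}?")
    result = a + b  # stands in for a calculator tool call
    emit(store, rollout_id, "tool_call", {"tool": "calculator", "result": result})
    emit(store, rollout_id, "reward", 1.0 if result == expected else 0.0)
    return result
```

Calling `run_agent(ToyStore(), "r1", 2, 3, expected=5)` records three spans for rollout `r1`, in the order prompt, tool call, reward. The point of the sketch is only that the agent logic stays untouched while the emitted events accumulate in one place for later training.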
@@ -59,6 +61,16 @@ No rewrites, no lock-in, just a clear path from first rollout to steady improvem
+|[calc_x](./calc_x)| VERL-powered math reasoning agent training that uses AutoGen with an MCP calculator tool. |[](https://github.com/microsoft/agent-lightning/actions/workflows/badge-calc-x.yml)|
+|[rag](./rag)| Retrieval-Augmented Generation pipeline targeting the MuSiQue dataset with Wikipedia retrieval. |**Unmaintained** — last verified with Agent-lightning v0.1.1 |
+|[search_r1](./search_r1)| Framework-free Search-R1 reinforcement learning training workflow with a retrieval backend. |**Unmaintained** — last verified with Agent-lightning v0.1.2 |
+|[spider](./spider)| Text-to-SQL reinforcement learning training on the Spider dataset using LangGraph. |[](https://github.com/microsoft/agent-lightning/actions/workflows/badge-spider.yml)|
+|[unsloth](./unsloth)| Supervised fine-tuning example powered by Unsloth with 4-bit quantization and LoRA. |[](https://github.com/microsoft/agent-lightning/actions/workflows/badge-unsloth.yml)|

-1. [calc_x](examples/calc_x): An agent built with AutoGen with calculator tool use, trained on Calc-X dataset with Reinforcement Learning.
-2. [spider](examples/spider): A write-check-rewrite looped agent with LangGraph with SQL execution; selectively optimize write and rewrite on Spider dataset with Reinforcement Learning.
-3. [apo](examples/apo): An example to customize an optimization algorithm: Automatic Prompt Optimization.
+*NOTE: The CI status shown here does not take workflows running with the latest dependencies into account, which is why we reference the corresponding `badge-*` workflows instead. Each example's own README also displays its `examples-*` workflow status whenever the project is maintained by CI.*
examples/apo/README.md (2 additions & 0 deletions)
@@ -1,5 +1,7 @@
 # APO Example

+[](https://github.com/microsoft/agent-lightning/actions/workflows/examples-apo.yml)
+
 This example folder contains three complementary tutorials that demonstrate different aspects of Agent-Lightning. It's compatible with Agent-lightning v0.2 or later.
examples/calc_x/README.md (2 additions & 0 deletions)
@@ -1,5 +1,7 @@
 # Calc-X Example

+[](https://github.com/microsoft/agent-lightning/actions/workflows/examples-calc-x.yml)
+
 This example demonstrates training a mathematical reasoning agent using Agent-Lightning with the VERL algorithm and AutoGen framework. The agent solves math problems using a calculator tool through the Model Context Protocol (MCP). It's compatible with Agent-lightning v0.2 or later.
examples/spider/README.md (2 additions & 0 deletions)
@@ -1,5 +1,7 @@
 # Spider Example

+[](https://github.com/microsoft/agent-lightning/actions/workflows/examples-spider.yml)
+
 This example demonstrates how to train a text-to-SQL agent on the Spider dataset using Agent-Lightning with reinforcement learning. It's compatible with Agent-lightning v0.2 or later.
examples/unsloth/README.md (2 additions & 0 deletions)
@@ -1,5 +1,7 @@
 # Unsloth SFT Example

+[](https://github.com/microsoft/agent-lightning/actions/workflows/examples-unsloth.yml)
+
 This example demonstrates Supervised Fine-Tuning (SFT) using the Unsloth library for efficient training with 4-bit quantization and LoRA. The example trains a math-solving agent on the GSM-hard dataset. It's compatible with Agent-lightning v0.2 or later.