You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+68-5Lines changed: 68 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -194,23 +194,70 @@ The generated artifacts flow into the `DynamicDatabase`, which keeps repositorie
194
194
195
195
## Working with Agents and Trainers
196
196
197
-
### Supervised Fine-Tuning (`SFTTrainer`)
197
+
### Agents
198
+
199
+
Agents orchestrate the full workflow of repository setup, training, and theorem proving. Each agent pairs a trainer with a compatible prover.
200
+
201
+
#### `HFAgent`
202
+
203
+
Uses Hugging Face models fine-tuned with `SFTTrainer` or `GRPOTrainer` for theorem proving. Loads checkpoints locally and uses `HFProver` for proof search. Ideal for training custom models on your traced repositories. Does not build Lean dependencies by default.
204
+
205
+
```python
206
+
from lean_dojo_v2.agent.hf_agent import HFAgent
207
+
from lean_dojo_v2.trainer.sft_trainer import SFTTrainer
Uses the Hugging Face Inference API to access large models like DeepSeek-Prover-V2-671B without local model loading. Pairs with `ExternalProver` for whole-proof generation or proof search. Best for quick experiments or when you don't have GPU resources for local inference.
219
+
220
+
```python
221
+
from lean_dojo_v2.agent.external_agent import ExternalAgent
222
+
223
+
agent = ExternalAgent()
224
+
agent.setup_github_repository(url, commit)
225
+
agent.prove()
226
+
```
227
+
228
+
#### `LeanAgent`
229
+
230
+
Implements the lifelong learning pipeline with retrieval-augmented generation. Uses `RetrievalTrainer` to train premise retrievers, then pairs with `RetrievalProver` for retrieval-augmented tactic generation. Maintains repository curricula and builds Lean dependencies by default.
231
+
232
+
```python
233
+
from lean_dojo_v2.agent.lean_agent import LeanAgent
- Produces checkpoints under `output_dir` that the `HFProver` consumes.
203
249
204
-
### GRPO Trainer (`GRPOTrainer`)
250
+
####GRPO Trainer (`GRPOTrainer`)
205
251
206
252
- Implements Group Relative Policy Optimization for reinforcement-style refinement.
207
253
- Accepts `reference_model`, `reward_weights`, and `kl_beta` settings.
208
254
- Useful for improving search policies on curated theorem batches.
209
255
210
-
### Retrieval Trainer & LeanAgent
256
+
####Retrieval Trainer (`RetrievalTrainer`)
211
257
212
-
-`RetrievalTrainer` trains the dense retriever that scores prior proofs.
213
-
-`LeanAgent` wraps the trainer, maintains repository curricula, and couples it with `RetrievalProver`.
258
+
- Trains the dense retriever that scores prior proofs from the corpus.
259
+
- Used by `LeanAgent` to build retrieval-augmented generation models.
260
+
- Requires indexed corpus and generator checkpoints.
214
261
215
262
Each agent inherits `BaseAgent`, so you can implement your own by overriding `_get_build_deps()` and `_setup_prover()` to register new trainer/prover pairs.
216
263
@@ -242,6 +289,22 @@ Each agent inherits `BaseAgent`, so you can implement your own by overriding `_g
242
289
243
290
## Proving Theorems
244
291
292
+
LeanDojo-v2 provides three prover implementations, each for different use cases:
293
+
294
+
### `HFProver`
295
+
296
+
Loads a fine-tuned Hugging Face model from a local checkpoint (supports full models and LoRA adapters) and generates tactics directly, used for locally trained Hugging Face model (e.g. with `SFTTrainer` and `GRPOTrainer`).
297
+
298
+
### `ExternalProver`
299
+
300
+
Performs inference with the Hugging Face Inference API to access large models without local GPU resources. Defaults to DeepSeek-Prover-V2-671B. Supports both proof search and whole-proof generation.
301
+
302
+
### `RetrievalProver`
303
+
304
+
Used directly with LeanAgent.
305
+
306
+
### Proof Methods
307
+
245
308
LeanDojo-v2 supports two methods for theorem proving:
246
309
247
310
-**Whole-proof generation**: generate complete proof in one forward pass of the prover.
0 commit comments