Commit 1ec90a6

committed
update README
1 parent 96eba1d commit 1ec90a6

1 file changed: +68 -5 lines changed

README.md

Lines changed: 68 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -194,23 +194,70 @@ The generated artifacts flow into the `DynamicDatabase`, which keeps repositorie
194194

195195
## Working with Agents and Trainers
196196

197-
### Supervised Fine-Tuning (`SFTTrainer`)
197+
### Agents
198+
199+
Agents orchestrate the full workflow of repository setup, training, and theorem proving. Each agent pairs a trainer with a compatible prover.
200+
201+
#### `HFAgent`
202+
203+
Uses Hugging Face models fine-tuned with `SFTTrainer` or `GRPOTrainer` for theorem proving. Loads checkpoints locally and uses `HFProver` for proof search. Ideal for training custom models on your traced repositories. Does not build Lean dependencies by default.
204+
205+
```python
from lean_dojo_v2.agent.hf_agent import HFAgent
from lean_dojo_v2.trainer.sft_trainer import SFTTrainer

trainer = SFTTrainer(model_name="deepseek-ai/DeepSeek-Prover-V2-7B", ...)
agent = HFAgent(trainer=trainer)
agent.setup_github_repository(url, commit)
agent.train()
agent.prove()
```
215+
216+
#### `ExternalAgent`
217+
218+
Uses the Hugging Face Inference API to access large models like DeepSeek-Prover-V2-671B without local model loading. Pairs with `ExternalProver` for whole-proof generation or proof search. Best for quick experiments or when you don't have GPU resources for local inference.
219+
220+
```python
from lean_dojo_v2.agent.external_agent import ExternalAgent

agent = ExternalAgent()
agent.setup_github_repository(url, commit)
agent.prove()
```
227+
228+
#### `LeanAgent`
229+
230+
Implements the lifelong learning pipeline with retrieval-augmented generation. Uses `RetrievalTrainer` to train premise retrievers, then pairs with `RetrievalProver` for retrieval-augmented tactic generation. Maintains repository curricula and builds Lean dependencies by default.
231+
232+
```python
from lean_dojo_v2.agent.lean_agent import LeanAgent

agent = LeanAgent()
agent.setup_github_repository(url, commit)
agent.train()
agent.prove()
```
240+
241+
### Trainers
242+
243+
#### Supervised Fine-Tuning (`SFTTrainer`)
198244

199245
- Accepts any Hugging Face causal LM identifier.
200246
- Supports LoRA by passing a `peft.LoraConfig`.
201247
- Key arguments: `epochs_per_repo`, `batch_size`, `max_seq_len`, `lr`, `warmup_steps`, `gradient_checkpointing`.
202248
- Produces checkpoints under `output_dir` that the `HFProver` consumes.
203249

204-
### GRPO Trainer (`GRPOTrainer`)
250+
#### GRPO Trainer (`GRPOTrainer`)
205251

206252
- Implements Group Relative Policy Optimization for reinforcement-style refinement.
207253
- Accepts `reference_model`, `reward_weights`, and `kl_beta` settings.
208254
- Useful for improving search policies on curated theorem batches.
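
To make the "group relative" part concrete: GRPO, as it is commonly described, scores several sampled completions for the same prompt and normalizes each reward against its own group's mean and standard deviation instead of learning a separate value model. The sketch below illustrates only that normalization step in plain Python. The function name `group_relative_advantages` and the `eps` constant are illustrative assumptions and are not part of the `GRPOTrainer` API.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples drawn for the same prompt.

    Hypothetical helper for illustration only; not taken from lean_dojo_v2.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)  # population std; some implementations use the sample std
    # eps keeps the division defined when every reward in the group is identical
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled proofs for one theorem: two verified (reward 1.0), two failed (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Samples that beat their group's average receive a positive advantage, and a group whose rewards are all identical contributes no learning signal.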
209255

210-
### Retrieval Trainer & LeanAgent
256+
#### Retrieval Trainer (`RetrievalTrainer`)
211257

212-
- `RetrievalTrainer` trains the dense retriever that scores prior proofs.
213-
- `LeanAgent` wraps the trainer, maintains repository curricula, and couples it with `RetrievalProver`.
258+
- Trains the dense retriever that scores prior proofs from the corpus.
259+
- Used by `LeanAgent` to build retrieval-augmented generation models.
260+
- Requires an indexed corpus and generator checkpoints.
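
As a rough illustration of what "scoring" means for a dense retriever: the query (for example, a proof state) and every corpus entry are embedded as vectors, and entries are ranked by similarity to the query. The sketch below uses cosine similarity over hand-written toy vectors. The `top_k` function, the entry names, and the embeddings are made up for illustration and do not reflect the `RetrievalTrainer` interface.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, corpus, k=2):
    """Return the k corpus entries most similar to the query, best first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; a trained retriever would produce these with an encoder model.
corpus = {
    "Nat.add_comm": [0.9, 0.1, 0.0],
    "Nat.mul_comm": [0.1, 0.9, 0.0],
    "List.length_append": [0.0, 0.2, 0.9],
}
ranked = top_k([1.0, 0.2, 0.0], corpus, k=2)
```

Training the retriever amounts to adjusting the encoder so that useful entries end up closer to the queries that need them.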
214261

215262
Each agent inherits `BaseAgent`, so you can implement your own by overriding `_get_build_deps()` and `_setup_prover()` to register new trainer/prover pairs.
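
A minimal sketch of that extension point, under stated assumptions: because the real `BaseAgent` is not shown here, the snippet defines a simplified stand-in with the same two hook names. The constructor, the `bool` return value of `_get_build_deps()`, the `prover` attribute, and the `EchoProver` class are all assumptions made so the example runs on its own.

```python
class BaseAgent:
    """Simplified stand-in for lean_dojo_v2's BaseAgent (assumed shape, not the real class)."""

    def __init__(self):
        self.build_deps = self._get_build_deps()
        self.prover = None
        self._setup_prover()

    def _get_build_deps(self) -> bool:
        raise NotImplementedError

    def _setup_prover(self) -> None:
        raise NotImplementedError


class EchoProver:
    """Hypothetical prover that always suggests the same tactic."""

    def next_tactic(self, goal: str) -> str:
        return "simp"


class MyAgent(BaseAgent):
    def _get_build_deps(self) -> bool:
        # Skip building Lean dependencies, as HFAgent is described as doing by default.
        return False

    def _setup_prover(self) -> None:
        # Register the prover this agent pairs with its trainer.
        self.prover = EchoProver()


agent = MyAgent()
```

With the real library you would subclass the imported `BaseAgent` instead, and check its source for the exact hook signatures.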
216263

@@ -242,6 +289,22 @@ Each agent inherits `BaseAgent`, so you can implement your own by overriding `_g
242289

243290
## Proving Theorems
244291

292+
LeanDojo-v2 provides three prover implementations, each suited to a different use case:
293+
294+
### `HFProver`
295+
296+
Loads a fine-tuned Hugging Face model from a local checkpoint (both full models and LoRA adapters are supported) and generates tactics directly. Use it with locally trained Hugging Face models (e.g. those trained with `SFTTrainer` or `GRPOTrainer`).
297+
298+
### `ExternalProver`
299+
300+
Performs inference with the Hugging Face Inference API to access large models without local GPU resources. Defaults to DeepSeek-Prover-V2-671B. Supports both proof search and whole-proof generation.
301+
302+
### `RetrievalProver`
303+
304+
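Performs retrieval-augmented tactic generation using the premise retriever trained by `RetrievalTrainer`. Used directly with `LeanAgent`.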
305+
306+
### Proof Methods
307+
245308
LeanDojo-v2 supports two methods for theorem proving:
246309

247310
- **Whole-proof generation**: generate a complete proof in one forward pass of the prover.
