fix: Fix database summary when patch_name metadata is missing
Fix packaging so pip install ships full shinka module tree
… of expected 2 (end and start) markers
fix `apply_full.py` when the patch has incomplete markers
Doc explaining how to add suport for a local LLM and embedding model
…rust Add rust to supported languages
docs: change repo name on the onboarding doc
add google gemini embeding model
Enhance docs, robustify wrap_eval, Visualization w/o API key
update gemini embedding price
add gemini-3-flash-preview
Add GPT 5.2
… various LLM families
Hi @Racemuis, Thank you so much for this! Really, really exciting. I am personally not a fan of 'stuffing' too much programming-language-specific support into the core ShinkaEvolve codebase. I would rather have this handled by the users themselves (or within an example sub-directory). What I mean by that is not the necessary markers/general syntax utilities but more the evaluation framework itself, e.g., main.lean --> evaluate.py --> eval with utils_lean [external to shinka] --> return metrics.json and correct.json. So the entry point would always be the python Cheers,
Hi @RobertTLange, Thank you for your kind reply and helpful comments! I refactored the PR by moving the supporting files to the I aimed to reduce the changes made to ShinkaEvolve's core: I removed the dependency on the All changes were tested using the OpenAI API, and the evolution and evaluation completed without errors. Let me know whether there are any other changes I can make! I am happy to incorporate anything.
Summary
This pull request extends the ShinkaEvolve framework with automatic formalization and validation in Lean. The implementation is based on the LeanInteract Python package, which interacts with Lean 4 through the Lean REPL.
The generated formalizations are validated by checking the resulting Lean proofs for correctness and completeness.
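As a small illustration of the difference between an incomplete and a complete proof, consider the two theorems below. This is only a sketch: the theorem names are hypothetical, and reading "completeness" as "no `sorry` placeholder remains" is an assumption, since the PR does not define the term.

```lean
-- Incomplete: the statement type-checks, but the proof is a placeholder.
-- Lean accepts this with a warning that the declaration uses `sorry`.
theorem add_zero_right (n : Nat) : n + 0 = n := by
  sorry

-- Complete: the goal holds by definitional unfolding of `Nat.add`.
theorem add_zero_right' (n : Nat) : n + 0 = n := by
  rfl
```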
Motivation
As both a functional programming language and an (interactive) theorem prover, Lean makes it possible to express scientific ideas in an explicit and verifiable way without requiring higher-level programming languages like Python.
Key Changes
Lean 4 edit support
- `/shinka/edit/apply_diff.py`.
- `/shinka/edit/apply_full.py`.
- `/shinka/edit/async_apply.py`.
Core
- `/shinka/core/runner.py`.
- `/examples/autoformalization/initial.lean` - an example of an initial Lean program.
- `lean-interact` to the project's dependencies in `pyproject.toml`.
Theorem proving and validation
- `/utils/utils_lean.py` - containing all utilities for proof generation and validation through the Lean REPL.
- `--prover_model` flag to `eval_hydra.py`.
- `prover_model` flag in `/shinka/launch/scheduler.py`.
- `/examples/autoformalization/evaluate.py` - a simple example of evaluating a Lean program.
- `.lean` option and automated proof generation to `/shinka/core/wrap_eval.py`.
Usage examples
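The following is a minimal sketch of what a Python entry point for evaluating a Lean program could look like. It is not the `evaluate.py` shipped in this PR. The function names, the `combined_score` key, and the `{"correct", "error"}` layout of `correct.json` are assumptions made for illustration. The default check only looks for leftover `sorry` placeholders, as a stand-in for real validation through the Lean REPL.

```python
import json
import re
from pathlib import Path
from typing import Callable, Optional


def has_no_sorry(lean_source: str) -> bool:
    """Stand-in check: True if no `sorry` token remains outside `--` line comments."""
    code_lines = [line.split("--", 1)[0] for line in lean_source.splitlines()]
    return re.search(r"\bsorry\b", "\n".join(code_lines)) is None


def evaluate(
    program_path: str,
    results_dir: str,
    check: Callable[[str], bool] = has_no_sorry,  # swap in a Lean REPL based check
) -> dict:
    """Validate a Lean file and write metrics.json and correct.json."""
    source = Path(program_path).read_text(encoding="utf-8")
    error: Optional[str] = None
    try:
        correct = bool(check(source))
    except Exception as exc:  # a failing checker counts as an incorrect program
        correct, error = False, str(exc)

    # Hypothetical metric layout: a single score derived from the check result.
    metrics = {"combined_score": 1.0 if correct else 0.0}

    out = Path(results_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    (out / "correct.json").write_text(
        json.dumps({"correct": correct, "error": error}, indent=2)
    )
    return metrics
```

Calling `evaluate("main.lean", "results")` writes both JSON files into `results/`. Passing a different `check` callable is where a REPL-based validator would plug in.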
Installation & Quick start
The evolution of Lean programs follows the principles of the existing ShinkaEvolve framework. However, a Lean 4 installation is required to automatically validate the generated programs.
Tip
You can install Lean using the `install-lean` command from LeanInteract after cloning the ShinkaEvolve repository and installing all dependencies.
Configuration
The changes include an LLM-based `prover_model` for the automated completion of Lean proofs from the generated formalizations. This model is specified in `/configs/evolution/*.yaml`. Although the default model is `gpt-5-nano`, a dedicated prover model such as `deepseek-ai/DeepSeek-Prover-V2-7B` is recommended.
`/configs/evolution/*_budget.yaml`:
All other configuration options remain the same.
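To make the default and override behaviour concrete, here is a small sketch of how a `--prover_model` flag could be parsed. It uses `argparse` purely for illustration. The actual `eval_hydra.py` may handle the option differently (for example through Hydra overrides), and the function names below are hypothetical.

```python
import argparse
from typing import List, Optional

# Default taken from the description above; override it for dedicated provers.
DEFAULT_PROVER_MODEL = "gpt-5-nano"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing only the prover model option."""
    parser = argparse.ArgumentParser(description="Evaluate a Lean program.")
    parser.add_argument(
        "--prover_model",
        type=str,
        default=DEFAULT_PROVER_MODEL,
        help="LLM used to complete Lean proofs for generated formalizations.",
    )
    return parser


def parse_prover_model(argv: Optional[List[str]] = None) -> str:
    """Return the selected prover model name from a list of CLI arguments."""
    return build_parser().parse_args(argv).prover_model
```

With no arguments, `parse_prover_model([])` falls back to the default. Passing `["--prover_model", "deepseek-ai/DeepSeek-Prover-V2-7B"]` selects the dedicated prover instead.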
Note
Some prover models require local hosting. See the official guide on setting up local LLM support for directions.
Testing
Scope
This PR is limited to changes that support the evolution of programs in Lean 4. It does not include changes for unrelated functionality, such as hosting local models.