hammer with given lemmas by Rw1nd · Pull Request #212 · lukaszcz/coqhammer

Rw1nd · 2026-02-09T14:23:19Z

Hello, CoqHammer is an excellent automated theorem proving tool that I have been using on Rocq. However, I notice that the current implementation only selects lemmas through clustering algorithms (by using predict command). This modification extends hammer to support user-specified lemmas.

I add support for an optional syntax to provide specific lemmas to the hammer tactic:

From Hammer Require Import Hammer.
From Stdlib Require Import Lists.List.
Import ListNotations.

Theorem demo: forall (A: Type) (l1 l2 : list A), 
  rev (l1 ++ l2) = rev l2 ++ rev l1.
Proof.
  intros.
  induction l1.
  - hammer [app_nil_r app_assoc_reverse_deprecated].
  - hammer [app_nil_r app_assoc_reverse_deprecated].
Qed.

The new syntax, hammer [lemma1 lemma2 ...]., allows users to explicitly specify which lemmas to use. I think this makes hammer more flexible.

Summary by cubic

Add optional lemma selection to hammer so users can guide premise choice. Default hammer behavior stays the same.

New Features
- New syntax: hammer [lemma1 ...].
- Resolves names from the env; errors if a lemma is missing.
- Extracts deps for the goal and given lemmas, then runs ATPs (incl. greedy sequence). Exposed as Features.choose_given_lemmas.
- Hyps are always passed; minimization still applies. Plain hammer uses prediction.

^{Written for commit 03f915e. Summary will update on new commits.}

cubic-dev-ai

4 issues found across 4 files

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/plugin/features.mli">

<violation number="1" location="src/plugin/features.mli:14">
P3: The extract_given_lemmas return comment says it is a temporary file name, but the function actually returns an hhdef list of predicted dependencies. The interface contract is misleading.</violation>
</file>

<file name="src/plugin/features.ml">

<violation number="1" location="src/plugin/features.ml:230">
P2: extract_given_lemmas can throw a misleading "Predictor did not return advice." error when ths_deps is empty because it reads from an empty temp file. This is likely for empty user-provided lemmas and should return an empty selection instead of crashing.</violation>
</file>

<file name="src/plugin/hammer_main.ml">

<violation number="1" location="src/plugin/hammer_main.ml:263">
P2: Catch-all exception handler in get_argus will swallow Sys.Break interrupts (Ctrl+C), preventing users from interrupting a hanging tactic; re-raise Sys.Break or catch specific exceptions.</violation>

<violation number="2" location="src/plugin/hammer_main.ml:734">
P2: do_extraction duplicates the bulk of do_predict’s parallel execution, cleanup, and minimization logic. This makes maintenance error-prone because changes to prover orchestration must be kept in sync across two functions.</violation>
</file>

Since this is your first cubic review, here's how it works:

cubic automatically reviews your code and comments on bugs and improvements
Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
Ask questions if you need clarification on any suggestion

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

lukaszcz · 2026-02-09T20:34:07Z

+| [ "hammer" "[" constr_list(l) "]" ] -> { hammer_tac l }
 END

+


Avoid whitespace-only changes

I will remove the whitespace-only changes

lukaszcz · 2026-02-09T20:34:19Z

    deps

+
+


Avoid whitespace-only changes

I will remove the whitespace-only changes

lukaszcz · 2026-02-09T20:37:30Z

+        (Str.split (Str.regexp " ")
+           (try input_line ic with End_of_file ->
+             close_in ic; Sys.remove oname;
+             raise (HammerError "Predictor did not return advice.")))


The error message is misleading because there is no prediction going on here (we supply the lemmas, not predict them)

This part of the code is redundant. I will remove it.

lukaszcz · 2026-02-09T20:47:05Z

Thank you for the contribution, but I don't quite see the point. If we know which lemmas we want to use, we can just use the best/sauto tactic with these lemmas. The hammer tactic first runs best (essentially), and if that fails it runs ATPs to select relevant lemmas which it then uses with best/sauto (essentially) to prove the theorem inside Rocq. If the user provides the lemmas they want to use up-front, there is not much point in passing them over to the ATPs, so one could as well use best. The only use-case seems to be if you want to minimize the number of dependencies from a small user-provided set, but then one could just write a function to run best for all possible subsets if the number of supplied lemmas is small.

Rw1nd · 2026-02-10T06:52:00Z

Hello, my idea is to make hammer easier to integrate with other automated tools. For example, if a tool implements a new lemma retrieval algorithm, this syntax would allow integration with hammer without requiring modifications to hammer's internal algorithms. This new syntax could serve as a general-purpose interface between Rocq and ATPs and is better support for non-interactive, programmatic usage.

lukaszcz · 2026-02-10T13:44:43Z

Hello, my idea is to make hammer easier to integrate with other automated tools. For example, if a tool implements a new lemma retrieval algorithm, this syntax would allow integration with hammer without requiring modifications to hammer's internal algorithms. This new syntax could serve as a general-purpose interface between Rocq and ATPs and is better support for non-interactive, programmatic usage.

Yes, this makes sense if by lemma retrieval you mean quickly preselecting a large number of possibly relevant lemmas. The whole ATP backend of Hammer is nothing more than a lemma retrieval algorithm itself. So by providing lemmas to the hammer tactic all you're doing is feeding them into another (possibly more precise) lemma retrieval tool.

The only thing that ATPs do in a hammer is relevant lemma selection. The proof found by the ATPs is discarded.

If you don't need further relevant lemma selection, you should feed your retrieved lemmas to best which is the actual in-Rocq prover that hammer uses after selecting relevant lemmas.

If you have a tool that pre-selects a large number of lemmas (essentially a replacement for hammer's predict), then it may make sense to provide the lemmas to hammer. A preferred way to do this would be to replace the predict program, but this indeed may not always be easy or feasible.

If your lemma retrieval tool is precise enough to select e.g. around 10 relevant lemmas, then there is little point in feeding them further to the ATPs - you should feed them to best instead. If your lemma retrieval tool returns e.g. around 100 possibly relevant lemmas, then it should be implemented as a replacement for predict whenever possible - the in-Rocq interface from this PR makes sense only when this is too hard.

I think it makes sense to include this PR, but the use-cases seem to be rare. Also, the issues with the PR code need to be addressed.

lukaszcz · 2026-02-10T13:54:15Z


 TACTIC EXTEND Hammer_tac
-| [ "hammer" ] -> { hammer_tac () }
+| [ "hammer" ] -> { hammer_tac [] }


Don't use empty list to indicate no extraction. Create a datatype with constructors for extraction and prediction.

Also, the "lemma extraction" should be called something else to avoid confusion with feature extraction. Perhaps "lemma choice"?

Yes, "lemma choice" is better.

I create a datatype hammer_mode for prediction and choice:

type hammer_mode = | Prediction | Choice of EConstr.t list

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

lukaszcz · 2026-02-10T21:15:25Z

+  let goal_deps = List.filter (fun a -> Hhlib.StringSet.mem a names) goal_deps in
+  let ths_deps = List.sort_uniq Stdlib.compare (goal_deps @ (List.concat (List.map write_def lems))) in 
+
+  let oname = Filename.temp_file ("coqhammer_out" ^ atpname) "" in


I don't understand why we're writing to files here at all. We should just convert the lemmas to hhdefs without going through the type system. The only reason extract creates files is to interface with the separate predict program.

This is intended to reuse part of the previous code. Indeed, writing to a file is unnecessary here. I will remove this redundant code.

lukaszcz · 2026-02-10T21:19:18Z

+  if !Opt.debug_mode then
+    Msg.info ("After filtering: " ^ string_of_int (List.length defss) ^ " Coq objects.");
+  let names = Hhlib.strset_from_lst (List.map get_hhdef_name defss) in
+  let write_def def =


write_def is a confusing name for a function that doesn't do any IO

Yes, maybe choose_def is better.

lukaszcz · 2026-02-10T21:22:16Z

  fname

+let extract_given_lemmas (hyps : hhdef list) (defs : hhdef list) (lems : hhdef list) (goal : hhdef) (atpname : string) =
+  Msg.info "Extracting definitions...";


"extract" and "extracting" is poor naming, because it suggest connection to feature extraction and we're not doing that here. Perhaps "choice" and "choosing"?. Maybe this function should not be in the features.ml file, but a separate one?

Yes, choose_given_lemmas might be a better name. I think we don't need a separate file, as this functionality is quite similar to the extract function in features.ml. Although we don't need to use features for prediction, the functions used are very similar. Having a separate file just for a single function seems a bit redundant.

Rw1nd · 2026-02-11T03:13:56Z

Thank you for your suggestion. I will address the issues in the code.

cubic-dev-ai

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/plugin/features.ml">

<violation number="1" location="src/plugin/features.ml:200">
P2: Empty lemma lists are now treated as an error, but review feedback requires graceful handling of empty lists. This rejects valid inputs like `hammer []` and violates the stated requirement.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

cubic-dev-ai · 2026-02-11T06:54:34Z

+  if lems = [] then
+     raise (HammerError ("No lemmas given for choice."));


P2: Empty lemma lists are now treated as an error, but review feedback requires graceful handling of empty lists. This rejects valid inputs like hammer [] and violates the stated requirement.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/plugin/features.ml, line 200: <comment>Empty lemma lists are now treated as an error, but review feedback requires graceful handling of empty lists. This rejects valid inputs like `hammer []` and violates the stated requirement.</comment> <file context> @@ -196,46 +194,27 @@ let extract (hyps : hhdef list) (defs : hhdef list) (goal : hhdef) : string = +let choose_given_lemmas (hyps : hhdef list) (defs : hhdef list) (lems : hhdef list) (goal : hhdef) (atpname : string) = + Msg.info "Choosing definitions..."; let defss = List.filter is_nontrivial defs in + if lems = [] then + raise (HammerError ("No lemmas given for choice.")); if !Opt.debug_mode then </file context>

Suggested change

if lems = [] then

raise (HammerError ("No lemmas given for choice."));

if lems = [] then

Msg.info "No lemmas given for choice.";

hammer with given lemmas

0fe2a62

cubic-dev-ai Bot reviewed Feb 9, 2026

View reviewed changes

Comment thread src/plugin/features.ml Outdated

Comment thread src/plugin/hammer_main.ml Outdated

Comment thread src/plugin/hammer_main.ml Outdated

Comment thread src/plugin/features.mli Outdated

Update features.mli

6359797

lukaszcz reviewed Feb 9, 2026

View reviewed changes

lukaszcz reviewed Feb 10, 2026

View reviewed changes

Apply suggestion from @cubic-dev-ai[bot]

91bd38e

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>

lukaszcz reviewed Feb 10, 2026

View reviewed changes

Update

03f915e

cubic-dev-ai Bot reviewed Feb 11, 2026

View reviewed changes

lukaszcz self-requested a review February 17, 2026 22:36

		\| [ "hammer" "[" constr_list(l) "]" ] -> { hammer_tac l }
		END

		if lems = [] then
		raise (HammerError ("No lemmas given for choice."));

Conversation

Rw1nd commented Feb 9, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by cubic

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lukaszcz commented Feb 9, 2026

Uh oh!

Rw1nd commented Feb 10, 2026

Uh oh!

lukaszcz commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Rw1nd commented Feb 11, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Rw1nd commented Feb 9, 2026 •

edited by cubic-dev-ai Bot

Loading

lukaszcz commented Feb 10, 2026 •

edited

Loading

cubic-dev-ai Bot Feb 11, 2026 •

edited

Loading