Fix evolution limits were not enforced by arximboldi · Pull Request #392 · darkmatter/nixmac

Juanpe Bolívar (arximboldi) · 2026-06-11T01:31:11Z

Summary

I was running some experiments with the eval suite and at some point I noticed Nixmac had gotten stuck in a loop, burning tokens like there's no tomorrow. That's because, while run_evals.py tries to set the max iterations, this check had been removed in #325, which had landed on develop unreviewed.

This reverts that commit, and also makes sure that the alternative suggested in #325 (max_token_budget) is enforced properly, which it wasn't either.

I'd kindly ask coopmoney and Scott McMaster (@scottmcmaster) to review carefully in case I made some wrong assumptions here.

Test Plan

No test plan needed

Docs

Docs updated (companion PR in darkmatter/nixmac-web: #___)
No docs update needed

…325)" This reverts commit 9a017e9.

Adds the TokenBudget variant to EvolutionLimitKind (with matching attempts_label/prompt/stop_summary arms and a small format_token_count helper) so the next commit can wire up the loop guard.

Also consolidates the max_token_budget read: the evolve loop now destructures it from EvolutionLimits::load alongside max_build_attempts and (post-#325-revert) max_iterations, instead of calling store::get_max_token_budget separately. To keep UI writes via store::set_max_token_budget routed through the same source of truth, the store getter/setter are made Slice-aware (mirroring the existing get_max_iterations / get_max_build_attempts pattern), with a fallback to the legacy tauri-plugin-store path when the Slice isn't registered.

No behavior change in the loop yet; enforcement lands in the next commit.
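For readers without the diff at hand, here is a minimal sketch of what a TokenBudget variant with a stop_summary arm and a format_token_count helper could look like. Only the names EvolutionLimitKind, TokenBudget, stop_summary and format_token_count come from the commit message; the signatures, the message wording and the "k"/"M" formatting are assumptions, not the actual nixmac code.

```rust
/// Hypothetical shape of the limit kinds; the real enum may carry data
/// or list more variants than shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvolutionLimitKind {
    MaxIterations,
    BuildAttempts,
    TokenBudget,
}

/// Assumed helper: renders a token count compactly (1_500 -> "1.5k").
/// The real helper's output format is not shown in the PR text.
pub fn format_token_count(tokens: u64) -> String {
    if tokens >= 1_000_000 {
        format!("{:.1}M", tokens as f64 / 1_000_000.0)
    } else if tokens >= 1_000 {
        format!("{:.1}k", tokens as f64 / 1_000.0)
    } else {
        tokens.to_string()
    }
}

impl EvolutionLimitKind {
    /// Assumed signature: one summary line per limit kind.
    /// The wording of each message is illustrative only.
    pub fn stop_summary(&self, limit: u64) -> String {
        match self {
            Self::MaxIterations => format!("Stopped after {limit} iterations"),
            Self::BuildAttempts => format!("Stopped after {limit} build attempts"),
            Self::TokenBudget => {
                format!("Stopped after using {} tokens", format_token_count(limit))
            }
        }
    }
}
```

Keeping the formatting in one helper means every arm that mentions a token count can render it the same way.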

Adds the missing comparison: after each API response, if total_tokens has reached max_token_budget, ask the user whether to continue (interactive) or stop (non-interactive). On continue, extend the budget by the original amount, mirroring the BuildAttempts UX. On stop, hand off to finish_after_limit_stop.

PR #325 removed max_iterations on the premise that max_token_budget already enforced a session bound. It didn't: the value was loaded, logged, and emitted to the UI progress bar, but never compared against total_tokens to terminate the loop. This closes that gap.

Providers that don't return usage (Ollama, some CLI providers) sidestep this guard entirely; the restored MaxIterations check covers those.
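The guard described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in evolve/mod.rs: BudgetDecision, check_token_budget and the ask_user callback are hypothetical names, and only total_tokens and max_token_budget are taken from the message.

```rust
/// Hypothetical outcome of the per-response budget check.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetDecision {
    /// Budget not reached yet: keep looping.
    WithinBudget,
    /// User chose to continue: loop on with a larger budget.
    Extended { new_budget: u64 },
    /// Stop the run (the real code hands off to finish_after_limit_stop).
    Stop,
}

/// Assumed helper, called after each API response.
/// `ask_user` is only consulted in interactive mode.
pub fn check_token_budget(
    total_tokens: u64,
    max_token_budget: u64,
    original_budget: u64,
    interactive: bool,
    ask_user: impl FnOnce() -> bool,
) -> BudgetDecision {
    // "Has reached" is read here as >=, not strictly greater than.
    if total_tokens < max_token_budget {
        return BudgetDecision::WithinBudget;
    }
    if interactive && ask_user() {
        // Extend by the original amount, not by the current (possibly
        // already extended) budget.
        BudgetDecision::Extended {
            new_budget: max_token_budget.saturating_add(original_budget),
        }
    } else {
        BudgetDecision::Stop
    }
}
```

A provider that reports no usage would leave total_tokens at zero in a sketch like this, so the check would never fire, which is why a separate iteration cap remains useful.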

…eached

Adds a new EvolutionState variant for runs that finish_after_limit_stop terminates. Before this, hitting any safety guard (NoProgress / MaxIterations / BuildAttempts / TokenBudget) ended the run as Conversational or Generated depending on whether edits had been made, making it impossible for downstream consumers (notably the eval harness) to tell "the agent decided it was done" from "we cut it off". finish_after_limit_stop now sets state to LimitReached unconditionally.

TypeScript binding regenerated via specta. Eval harness scoring will need a companion update on the nixmac-web side to grade LimitReached separately.
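The distinction this commit introduces can be illustrated with a small sketch. The EvolutionState variants and the guard names are taken from the commit message; StopReason and final_state are hypothetical and only model the mapping, not how finish_after_limit_stop is actually structured.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvolutionState {
    Conversational,
    Generated,
    LimitReached,
}

/// Hypothetical: why the loop ended. AgentDone means the agent stopped on
/// its own; the other variants are the safety guards named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    AgentDone,
    NoProgress,
    MaxIterations,
    BuildAttempts,
    TokenBudget,
}

/// Assumed mapping after this change: a guard stop always yields
/// LimitReached, whether or not edits were made.
pub fn final_state(reason: StopReason, made_edits: bool) -> EvolutionState {
    match reason {
        StopReason::AgentDone if made_edits => EvolutionState::Generated,
        StopReason::AgentDone => EvolutionState::Conversational,
        StopReason::NoProgress
        | StopReason::MaxIterations
        | StopReason::BuildAttempts
        | StopReason::TokenBudget => EvolutionState::LimitReached,
    }
}
```

With a mapping like this, a consumer such as the eval harness can grade on the state alone instead of inferring a cutoff from telemetry.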

Two cosmetic gaps left behind by the structural LimitReached work:

- cli.rs printed "Evolution completed successfully" on any non-conversational success, including runs the loop cut off. Now matches against the state and prints a stopped-for-safety message in the LimitReached arm.
- use-evolve.ts toasted ✓ "Evolution complete" with the success variant for every non-error result. Now branches on the new state: a ⏸ "Evolution stopped (safety limit reached)" toast.info fires for LimitReached so the user can tell their run was cut off without reading the telemetry. The partial change map is still mirrored, since limit-reached runs can contain useful edits the user may want to review or follow up on.

Adds a use-evolve.test.ts case for the new path.
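As a rough sketch of the cli.rs side of this change: the function name cli_status_line is hypothetical, the LimitReached wording is invented for illustration, and treating Conversational as printing nothing is an assumption. Only the "Evolution completed successfully" string and the state names come from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvolutionState {
    Conversational,
    Generated,
    LimitReached,
}

/// Hypothetical helper: picks the final CLI status line from the state.
pub fn cli_status_line(state: EvolutionState) -> Option<&'static str> {
    match state {
        // Assumed: conversational runs print no completion line.
        EvolutionState::Conversational => None,
        EvolutionState::Generated => Some("Evolution completed successfully"),
        // Illustrative wording; the actual message is not quoted in the PR.
        EvolutionState::LimitReached => {
            Some("Evolution stopped: a safety limit was reached")
        }
    }
}
```

Matching exhaustively on the state, rather than testing for "not conversational", means a future variant has to be given its own message before the code compiles.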

github-actions · 2026-06-11T01:33:24Z

	Messages
📖	No docs update needed — acknowledged.

📋 PR Overview


Lines changed	479 (+441 / -38)
Files	0 added, 22 modified, 0 deleted
Draft / WIP	no
Has Test Plan	no
No Test Plan Needed	yes
New UI components	no
New Storybook stories	no
New Rust modules	no
New TS source files	no
New tests	no
package.json touched	no
Cargo.toml touched	no
Infra / CI touched	no

🔬 Coverage

Report	Lines	Statements	Functions	Branches
`apps/native/coverage/coverage-summary.json`	25.6%	25.8%	24.3%	18.4%

Generated by 🚫 dangerJS against 0e8b9fc

github-actions · 2026-06-11T01:33:42Z

🎨 Storybook preview

Open Storybook preview

Updated for 0e8b9fc

🧭 Story changes

Compared to develop (snapshot diff at story level):

✏️ Changed stories (2)

❌ Failed snapshots (3)

These stories' HTML snapshots changed. Current renderings (run bun run test:update-snapshots and commit if intended):

Settings/AutoConfigField › Controls

Settings/AutoTuningSection › Evolution Settings

Widget/Steps/SetupStep › Default Config Required

cooper (czxtm) · 2026-06-12T22:22:46Z

/oc fix conflicts

# Conflicts: # apps/native/src-tauri/src/evolve/mod.rs # apps/native/src/components/widget/settings/auto-config-field.stories.tsx # apps/native/src/components/widget/settings/auto-tuning-section.stories.tsx

darkmatter · 2026-06-12T22:28:13Z

Conflicts fixed. develop merged into jp/fix-max-iterations.

Changes:

apps/native/src-tauri/src/evolve/mod.rs: Resolved state load to keep max_iterations & max_token_budget.
apps/native/src/components/widget/settings/auto-config-field.stories.tsx: Updated schema format. Added both fields.
apps/native/src/components/widget/settings/auto-tuning-section.stories.tsx: Added both fields to UI & default values.

Merge commit created. Ready.


Juanpe Bolívar (arximboldi) added 5 commits June 11, 2026 02:58

Revert "refactor: remove max_iterations from evolution config and UI (#…

1688b48

…325)" This reverts commit 9a017e9.

Juanpe Bolívar (arximboldi) requested review from cooper (czxtm) and Scott McMaster (scottmcmaster) June 11, 2026 01:31

github-actions Bot deployed to storybook-preview June 11, 2026 01:33 View deployment

github-actions Bot deployed to storybook-preview June 11, 2026 05:59 View deployment

Juanpe Bolívar (arximboldi) force-pushed the jp/fix-max-iterations branch from b758a30 to 522b7f9 Compare June 11, 2026 21:25

github-actions Bot deployed to storybook-preview June 11, 2026 21:28 View deployment

Merge remote-tracking branch 'origin/develop' into jp/fix-max-iterations

0e8b9fc

# Conflicts: # apps/native/src-tauri/src/evolve/mod.rs # apps/native/src/components/widget/settings/auto-config-field.stories.tsx # apps/native/src/components/widget/settings/auto-tuning-section.stories.tsx

github-actions Bot deployed to storybook-preview June 12, 2026 22:30 View deployment

cooper (czxtm) approved these changes Jun 12, 2026


cooper (czxtm) enabled auto-merge June 12, 2026 22:55

Juanpe Bolívar (arximboldi) wants to merge 6 commits into develop from jp/fix-max-iterations


2 participants
