@russell-rozenbaum (Contributor) commented Nov 29, 2025

Project Mode with Coding Agent

  • Resolve merge conflicts

Main Tasks for Coding Agent

Make this branch merge-and-test-ready (Russ)

  • Add ability for agent to write tests for its own code
  • Write backend unit tests in hazel repo itself (for testing the implementations here)
  • Allow for agent to write probes
  • Fix variable renaming
  • Hitting an exception when the agent tries to rename a variable: "variable 'x' not found in node map". This might mean something is stale
  • Add ability to scroll back and forth through agent edits
  • Tab to view actions/documentation for actions
  • Add check for concave grout after edit actions
  • Improve header readability and descriptiveness of tool calls
  • General prompt improvement

Complete Evaluation Framework (Cyrus)

  • Flesh out metadata tracking in agent responses, tool calls, chats, etc.
  • Write Evaluation Test Cases. This mainly includes three components, plus an optional fourth:
      1. Starting/Initial Code
      2. Tasks to Complete Given the Starting Code
      3. Unit Tests for Evaluating these Tasks
      4. (Optional?) Grading Rubric to Grade the Generated Code Itself (via grading agent)

Misc.

  • Make file system functional via connecting file names/headers
  • Add a step through feature for the agent chat/edit history

codecov bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 32.53012% with 616 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.17%. Comparing base (d601b04) to head (aaf42e3).
⚠️ Report is 722 commits behind head on dev.

| Files with missing lines | Patch % | Lines |
| src/haz3lcore/CompositionCore/OpenRouter.re | 0.00% | 200 Missing ⚠️ |
| src/haz3lcore/CompositionCore/HighLevelNodeMap.re | 46.64% | 151 Missing ⚠️ |
| src/haz3lcore/CompositionCore/CompositionGo.re | 55.19% | 82 Missing ⚠️ |
| src/haz3lcore/CompositionCore/GeneralTreeUtils.re | 44.59% | 41 Missing ⚠️ |
| src/util/API.re | 0.00% | 30 Missing ⚠️ |
| src/util/StringUtil.re | 0.00% | 30 Missing ⚠️ |
| ...web/view/ProjectModeCore/AgentCore/AgentGlobals.re | 3.57% | 27 Missing ⚠️ |
| src/haz3lcore/zipper/action/Action.re | 0.00% | 25 Missing ⚠️ |
| ...haz3lcore/CompositionCore/HighLevelNodeMapModel.re | 0.00% | 9 Missing ⚠️ |
| src/haz3lcore/tiles/Base.re | 46.15% | 7 Missing ⚠️ |
... and 5 more
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #2048      +/-   ##
==========================================
- Coverage   51.43%   50.17%   -1.26%     
==========================================
  Files         205      212       +7     
  Lines       21754    22832    +1078     
==========================================
+ Hits        11189    11457     +268     
- Misses      10565    11375     +810     
| Files with missing lines | Coverage Δ |
| src/haz3lcore/zipper/Printer.re | 95.34% <100.00%> (+0.11%) ⬆️ |
| src/haz3lcore/zipper/action/Perform.re | 34.37% <100.00%> (+4.37%) ⬆️ |
| src/language/proof/ProofCtx.re | 11.11% <ø> (ø) |
| src/language/statics/StaticsBase.re | 85.71% <ø> (ø) |
| src/language/term/Exp.re | 100.00% <ø> (ø) |
| src/web/app/common/Icons.re | 100.00% <100.00%> (ø) |
| src/web/debug/DebugConsole.re | 0.00% <ø> (ø) |
| src/web/init/Init.re | 57.14% <ø> (ø) |
| src/web/Store.re | 9.37% <0.00%> (-0.31%) ⬇️ |
| src/web/app/editors/code/CodeEditable.re | 0.00% <0.00%> (ø) |
... and 13 more

... and 23 files with indirect coverage changes

@cyrus- cyrus- mentioned this pull request Jan 12, 2026
8 tasks
@CyrusD123 CyrusD123 self-assigned this Feb 6, 2026
@russell-rozenbaum russell-rozenbaum marked this pull request as ready for review February 7, 2026 02:34
@disconcision (Member) left a comment

Here's a basic review of the integration layer. I think we should make some changes, but probably nothing too substantial. Short notice for this afternoon's meeting; if you have a chance to glance over it we can discuss then, otherwise we can discuss later in the week or next week if you are available. Some comments are just questions about things I'm not totally sure about, mostly about functionality that has moved.

A few things to break out:

  • I think the internal printer changes you made are probably both unnecessary and may have broken the probe tests; or at least something did. lmk if you don't know what my in-file comments about this mean.
  • I didn't look at the agent-specific stuff in depth but Agent.re is 1723 lines, effectively a god module containing Model + Update + Store for the agent, chat system, and messages all in one file. Should be split soon if not now
  • OLD_ASSISTANT/ directory — 2048 lines of fully commented-out dead code
  • 5 accidentally committed JSON chat logs (NewChat_openrouter_*.json) scattered in source directories — debug artifacts, not test fixtures (maybe add to .gitignore if these are getting generated automatically?)
  • HighLevelNodeMap is 786 lines with complex diff logic — could use documentation (not necessary immediately tho)
  • CSS files are large (~1272 lines for chat messages) — likely fine for now but suggest component-scoped styles later
  • empty files committed (Test_Info.re, Test_MatchExp.re, Test_Stepper.re, DebugConsole.re), just mode changes, no content

Member

what is this folder 'Desktop/Programmierung'? there are a few files from it

(
~holes=" ",
~concave_holes=" ",
~special_folds=false,
Member

are these for the code maps? can we see if it's possible to add these by putting folds directly on the segment structure before sending it to the printer to avoid a special case at this level? if it's easier to put them on the term instead this should now be possible, as maketerm and exp_to_segment now retain whitespace/comments with appropriate settings. lmk if there's complexity here.

(also nbd but ideally avoid re-ordering existing params like projector_to_segment to simplify merge conflicts)

~refractor_seg_to_seg,
~projector_to_segment,
~special_folds,
~refractors: list((Id.t, _))=[],
Member

can you say why you added a default here? piece_to_string is an internal function that we should be cautious about calling independently; probably better to be explicit about its arguments. ideally we avoid having to call it at all but sometimes special cases must be made. open to discussion here.

~projector_to_segment,
projector_to_segment(p),
)
if (special_folds) {
Member

ah okay i get this now... never mind what i said earlier about putting folds on through seg/term. instead, i think what you're doing here can be done by wrapping projector_to_segment and special-casing what happens with folds, then just calling the regular printer with that special projector_to_segment; i don't think any modification to the printer here is necessary. the special projector_to_segment should basically replace projectors with single convex tiles whose label is ["⋱"]; lmk if you don't get what i mean
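
A rough sketch of the wrapping idea in the comment above, for illustration only. The types here (tile, piece, projector, segment), the "Fold" kind string, and the toy print_segment are simplified, hypothetical stand-ins, not Hazel's actual definitions or printer; the only point is that the fold special case can live in a wrapper around projector_to_segment while the printer stays unchanged.

```reason
/* Hypothetical, simplified stand-ins for the real segment types. */
type tile = {label: list(string)};
type piece =
  | Tile(tile)
  | Projector(projector)
and projector = {
  kind: string, /* assumption: folds are identified by kind "Fold" */
  body: list(piece),
};
type segment = list(piece);

/* The single convex tile that stands in for a folded region. */
let ellipsis_tile: piece = Tile({label: ["⋱"]});

/* Wrap an existing projector_to_segment: folds collapse to one "⋱" tile,
   every other projector is delegated to the wrapped function. */
let with_special_folds =
    (projector_to_segment: projector => segment, p: projector): segment =>
  p.kind == "Fold" ? [ellipsis_tile] : projector_to_segment(p);

/* Toy printer that knows nothing about folds; it only calls whatever
   projector_to_segment it is given. */
let rec print_segment = (~projector_to_segment, seg: segment): string =>
  seg
  |> List.map(
       fun
       | Tile(t) => String.concat("", t.label)
       | Projector(p) =>
         print_segment(~projector_to_segment, projector_to_segment(p)),
     )
  |> String.concat(" ");
```

With something like this, a caller that wants folds elided would pass with_special_folds(projector_to_segment) to the regular printer, and no special_folds flag would need to be threaded through the printer itself.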


// AddToolLabel_1.0: Make the action types (above) and add their cases to the funs (below)
[@deriving (show({with_path: false}), sexp, yojson, eq)]
type agent_editor_action =
Member

we can probably call these something like structural_actions instead as in principle they are not agent specific

Member

what is this suspiciously named file?

@@ -286,7 +286,7 @@
justify-content: center;
}

#top-bar #editor-mode .icon:hover {
Member

why was this removed? not necessarily bad, but I wonder if it might have adverse consequences, i.e. this now adds a hover effect to icons not within the editor-mode element, which is not obviously desirable

NinjaKeys.initialize(Shortcut.options(schedule_action));
JsUtil.focus_clipboard_shim();
/* Setup scroll listener for floating elements (backpack) */
FloatingElement.setup_scroll_listener();
Member

i don't think this should be removed; this will break the backpack z-index i think

}
| AgentGlobals(agent_globals_action) =>
// AgentGlobals updates are handled at Page level with proper async scheduling
// This case should not be reached, but we include it for completeness
Member

given that this mandates a significant change to the settings API (including the scheduling callback), we probably shouldn't have this if it isn't reached?

Member

more mode changes

4 participants