fix(core): restore evaluator messageToUser precedence, opt-in canonical tool text#7897
fix(core): restore evaluator messageToUser precedence, opt-in canonical tool text#7897NubsCarson wants to merge 1 commit into
Conversation
…al tool text
The Server Tests upstream regression (planner-loop-user-facing-text →
"does not regress evaluator's explicit messageToUser path") fails on
develop because preferredFinalMessageFromToolOrModel preferred a single
successful tool's userFacingText OVER the evaluator's explicit
messageToUser. Shaw's regression test asserts the opposite: when the
evaluator emits an explicit messageToUser, it wins.
Reconciling both intents without picking one over the other: add an
opt-in flag verifiedUserFacing on ActionResult / PlannerToolResult.
Tools that emit structured outputs where evaluator paraphrase risks
hallucinating values (paths, ids, counts, numeric metrics) set
verifiedUserFacing: true to mark their userFacingText canonical. The
planner-loop then echoes the tool verbatim instead of letting the
evaluator paraphrase it. Without the flag, the evaluator's explicit
messageToUser wins (Shaw's invariant).
Precedence in preferredFinalMessageFromToolOrModel is now:
1. Single successful tool with verifiedUserFacing === true
2. Evaluator/model messageToUser
3. Most recent tool userFacingText (fallback)
4. Caller-provided fallback
Changes:
- packages/core/src/types/components.ts: add verifiedUserFacing to
ActionResult with JSDoc explaining when to opt in.
- packages/core/src/runtime/planner-types.ts: add verifiedUserFacing to
PlannerToolResult with matching contract.
- packages/core/src/runtime/execute-planned-tool-call.ts and
packages/core/src/runtime/planner-loop.ts
(actionResultToPlannerToolResult): propagate the field through both
ActionResult → PlannerToolResult conversion paths.
- packages/core/src/runtime/planner-loop.ts:
- Rename singleSuccessfulUserFacingToolResultText →
singleVerifiedUserFacingToolResultText and require
verifiedUserFacing === true.
- Reorder preferredFinalMessageFromToolOrModel to put verified-tool
first, then evaluator, then fallback chain.
- packages/core/src/__tests__/planner-happy-path.test.ts: the
"prefers a single tool's verified user-facing text over evaluator
paraphrase" test now sets verifiedUserFacing: true (its semantic
intent — "this is canonical structured data the evaluator could
hallucinate") so the canonical-output guarantee still holds.
Verified:
- 1362 tests pass, 11 skipped (full packages/core suite, 165 files)
- bun run lint:check: 12 warnings before == 12 after (no new flags)
- bun run typecheck: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
LifeOps Benchmark —
|
| @@ -2187,6 +2196,7 @@ function singleSuccessfulUserFacingToolResultText( | |||
| if (toolResultSteps.length !== 1) return undefined; | |||
| const result = toolResultSteps[0]?.result; | |||
| if (result?.success !== true) return undefined; | |||
| if (result.verifiedUserFacing !== true) return undefined; | |||
| const text = result.userFacingText?.trim(); | |||
| return text || undefined; | |||
There was a problem hiding this comment.
Silent no-op when
verifiedUserFacing meets multi-tool trajectories
singleVerifiedUserFacingToolResultText gates on toolResultSteps.length !== 1, where toolResultSteps includes every step that has both toolCall and result — failed steps included. A plan that calls two tools (even if the first failed and only the second succeeded with verifiedUserFacing: true) will silently fall through to the evaluator's messageToUser. Tool authors who set verifiedUserFacing: true expecting their canonical output to survive multi-step plans will see the flag silently ignored with no diagnostic path to discover why. The JSDoc does say "exactly one successful tool", but "exactly one result step" and "exactly one successful result step" differ and the comment doesn't surface that distinction.
LifeOps Benchmark —
|
What
The
planner-loop-user-facing-text → "does not regress evaluator's explicit messageToUser path"test fails ondevelopbecausepreferredFinalMessageFromToolOrModelprefers a single successful tool'suserFacingTextover the evaluator's explicitmessageToUser. The original Shaw test asserts the opposite: when the evaluator emits an explicitmessageToUser, it wins.A later commit (
4ba5130529 fix(core): preserve single-tool user-facing output) inverted the precedence to prevent the evaluator from paraphrasing a tool's structured output and hallucinating values (paths/ids/numeric metrics). That intent is real and worth preserving — its test (planner-happy-path.test.ts → "prefers a single tool's verified user-facing text over evaluator paraphrase") catches a genuine class of bug. So the two tests are direct semantic opposites, both with valid intent.How
Reconciles both by adding a one-field opt-in:
verifiedUserFacing?: booleanonActionResult/PlannerToolResult. Tools whose output is canonical (do-not-paraphrase) — typically structured data the evaluator could hallucinate — set the flag. Without it, the evaluator wins (Shaw's invariant).Precedence in
preferredFinalMessageFromToolOrModelis now:verifiedUserFacing === truemessageToUseruserFacingText(fallback)Changes
packages/core/src/types/components.ts— addverifiedUserFacingtoActionResultwith JSDoc explaining when to opt in.packages/core/src/runtime/planner-types.ts— add matchingverifiedUserFacingtoPlannerToolResult.packages/core/src/runtime/execute-planned-tool-call.tsandpackages/core/src/runtime/planner-loop.ts(actionResultToPlannerToolResult) — propagate the field through bothActionResult → PlannerToolResultconversion paths.packages/core/src/runtime/planner-loop.ts:singleSuccessfulUserFacingToolResultText→singleVerifiedUserFacingToolResultTextand requireverifiedUserFacing === true.preferredFinalMessageFromToolOrModelto put verified-tool first.packages/core/src/__tests__/planner-happy-path.test.ts— the conflicting test now setsverifiedUserFacing: true(matching its stated semantic intent), so the canonical-output guarantee still holds.Verified
bun run test(fullpackages/core): 1362 passed, 11 skipped across 165 test files.bun run lint:check: 12 warnings before == 12 after (zero new flags).bun run typecheck: clean.Impact
Any tool currently relying on the post-
4ba5130529"tool always wins" precedence without settingverifiedUserFacing: truewill now see the evaluator's explicitmessageToUsertake precedence. To restore the old behavior, setverifiedUserFacing: trueon the action handler'sActionResult. The default change matches the long-standing contract inuserFacingText's JSDoc ("the planner-loop's terminal-FINISH fallback may use this") and Shaw's pre-existing regression coverage.🤖 Generated with Claude Code
Greptile Summary
This PR resolves a test conflict between two valid but opposing invariants: Shaw's rule that the evaluator's explicit
messageToUserwins, and the anti-hallucination rule that a tool's structured output should not be paraphrased. The fix introducesverifiedUserFacing?: booleanas an opt-in field onActionResult/PlannerToolResultthat tool authors set when their output is canonical.preferredFinalMessageFromToolOrModelnow uses a three-tier precedence: (1) single verified tool, (2) evaluatormessageToUser, (3) latest tool fallback — restoring the evaluator-wins default while preserving the anti-paraphrase guarantee for opted-in tools.planner-happy-pathtest is fixed by addingverifiedUserFacing: trueto the CHECK_RUNTIME mock, making its intent match its mechanism.actionResultToPlannerToolResultandactionResultToStreamingResultconversion paths.Confidence Score: 4/5
Safe to merge; the behavioral change is intentional and well-documented, and both regression tests pass.
The change correctly restores evaluator precedence as the default and the opt-in
verifiedUserFacingflag works as designed. The single-step constraint insingleVerifiedUserFacingToolResultTextmeans a tool that setsverifiedUserFacing: truein a multi-call trajectory will have its flag silently ignored — a confusing footgun for tool authors, though it is documented in the JSDoc.The precedence logic in
planner-loop.tsaroundsingleVerifiedUserFacingToolResultTextis the most sensitive area, specifically the interaction between the "exactly one result step" constraint and multi-step plans whereverifiedUserFacingis set.Important Files Changed
singleSuccessfulUserFacingToolResultTexttosingleVerifiedUserFacingToolResultText, addsverifiedUserFacinggate, reorderspreferredFinalMessageFromToolOrModelprecedence so evaluator wins by default unless the tool opts-in; both semantics are covered by existing testsverifiedUserFacing?: booleantoPlannerToolResultwith clear JSDoc; no logic changesverifiedUserFacing?: booleantoActionResultwith JSDoc; mirrors the PlannerToolResult addition cleanlyverifiedUserFacingthroughactionResultToStreamingResultto the streaming hook; straightforward one-line additionverifiedUserFacing: trueto the CHECK_RUNTIME mock action to restore the "tool wins" semantic; the test's stated intent now matches its mechanismFlowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[preferredFinalMessageFromToolOrModel] --> B{Single tool step\nwith result?} B -- No --> D B -- Yes --> C{verifiedUserFacing\n=== true?} C -- No --> D{evaluator\nmessageToUser set?} C -- Yes --> G{success === true\nAND userFacingText set?} G -- Yes --> H[Return tool userFacingText\ncanonical / do-not-paraphrase] G -- No --> D D -- Yes --> E[Return evaluator messageToUser\nevaluator wins by default] D -- No --> F{Any tool step has\nuserFacingText?} F -- Yes --> I[Return latest tool\nuserFacingText fallback] F -- No --> J[Return caller fallback]Reviews (1): Last reviewed commit: "fix(core): restore evaluator messageToUs..." | Re-trigger Greptile