fix(prompt): Extract display value from prompt result options for grading by mattbrailsford · Pull Request #148 · umbraco/Umbraco.AI

mattbrailsford · 2026-05-05T09:29:05Z

Summary

Fixes #142 ([Tests] Prompt test graders evaluate raw JSON output instead of plain text value). Prompt test graders evaluated the raw structured-output JSON envelope ({"value":"..."}) instead of the unwrapped text, which caused the Regex grader to fail length checks and the LLM Judge to throw "The JSON value could not be converted to System.Double" when scoring.
PromptTestFeature now overrides ExtractOutputValue to prefer resultOptions[].displayValue from the transcript's FinalOutput. Multi-option responses are joined with newlines, and extraction falls back to content when no options are present, which preserves the existing behaviour for OptionCount == 0 prompts and error transcripts.
Adds PromptTestFeatureTests covering single option, multiple options, no options, missing resultOptions, and error-shape transcripts.
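
The extraction rule described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the class and method names (DisplayValueExtractionSketch, Extract) are hypothetical, and the FinalOutput shape is assumed to be a JSON object with a content string and an optional resultOptions array whose items carry a displayValue string.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration; not part of Umbraco.AI.
public static class DisplayValueExtractionSketch
{
    // Prefers resultOptions[].displayValue, joined with newlines,
    // and falls back to content when no display values are found.
    public static string? Extract(string finalOutputJson)
    {
        using JsonDocument doc = JsonDocument.Parse(finalOutputJson);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object)
        {
            return null;
        }

        var displayValues = new List<string>();
        if (root.TryGetProperty("resultOptions", out JsonElement options)
            && options.ValueKind == JsonValueKind.Array)
        {
            foreach (JsonElement option in options.EnumerateArray())
            {
                if (option.ValueKind == JsonValueKind.Object
                    && option.TryGetProperty("displayValue", out JsonElement display)
                    && display.ValueKind == JsonValueKind.String)
                {
                    displayValues.Add(display.GetString()!);
                }
            }
        }

        if (displayValues.Count > 0)
        {
            return string.Join("\n", displayValues);
        }

        // Assumed fallback path: missing or empty resultOptions, including error-shape output.
        return root.TryGetProperty("content", out JsonElement content)
               && content.ValueKind == JsonValueKind.String
            ? content.GetString()
            : null;
    }
}
```

Under these assumptions, a single option yields its display value, several options yield one value per line, and an empty or missing resultOptions array returns content unchanged.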

Test plan

dotnet test Umbraco.AI.Prompt/Umbraco.AI.Prompt.slnx — 60/60 pass (5 new)
dotnet test Umbraco.AI/Umbraco.AI.slnx — 714 unit + 25 integration pass
Manual: create a Single Option prompt, run a test with the regex ^[\s\S]{1,160}$ and an LLM Judge, confirm graders evaluate the unwrapped text
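
To see why the length regex from the manual step is sensitive to the envelope, here is a small sketch. The class and method names are hypothetical, and the envelope is assumed to be exactly the {"value":"..."} shape quoted in the summary.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical demo for illustration; not part of Umbraco.AI.
public static class EnvelopeLengthSketch
{
    // Regex from the manual test step: 1 to 160 characters of any kind.
    private static readonly Regex LengthCheck = new Regex(@"^[\s\S]{1,160}$");

    // Wraps text in the assumed structured-output envelope: {"value":"..."}.
    public static string Wrap(string text) =>
        JsonSerializer.Serialize(new { value = text });

    // True when the input satisfies the 1 to 160 character limit.
    public static bool Passes(string input) => LengthCheck.IsMatch(input);
}
```

For a 155-character text of plain letters, Passes(text) is true, while Passes(Wrap(text)) is false: the wrapper adds 12 characters and brings the graded string to 167.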

🤖 Generated with Claude Code

…ding

Prompts with Single Option / Multiple Options return a structured-output JSON envelope on result.Content (e.g. {"value":"..."}). The unwrapped text lives on ResultOptions[].DisplayValue. The default ExtractOutputValue read the raw JSON back out of FinalOutput.content, so graders evaluated the envelope instead of the actual generated text - the regex grader failed on the JSON wrapper, and the LLM judge embedded the JSON in its prompt and produced unparseable scores.

PromptTestFeature now overrides ExtractOutputValue to prefer resultOptions[].displayValue (joined with newlines for multi-option), falling back to content when no options are present.

Fixes #142

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the anonymous-object writes and the property-name-based reads in PromptTestFeature with a single FinalOutputEnvelope type. Now both ExecuteAsync (write) and ExtractOutputValue (read) reference the same properties, so a future rename can't desynchronise serialisation from extraction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
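
The idea of sharing one type between the write and read paths might look something like the sketch below. The PR names FinalOutputEnvelope but does not show its members, so the record shapes, the "Sketch" type names, and the use of the web serializer defaults (camelCase property names) are all assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical shapes; the real FinalOutputEnvelope members are not shown in the PR.
public sealed record ResultOptionSketch
{
    public string? DisplayValue { get; init; }
}

public sealed record FinalOutputEnvelopeSketch
{
    public string? Content { get; init; }
    public List<ResultOptionSketch>? ResultOptions { get; init; }
}

public static class EnvelopeRoundTripSketch
{
    // Web defaults serialize property names as camelCase (content, resultOptions, displayValue).
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    // Write path: serialize the shared type instead of an anonymous object.
    public static string Write(FinalOutputEnvelopeSketch envelope) =>
        JsonSerializer.Serialize(envelope, Options);

    // Read path: deserialize the same type, so a renamed property changes both sides together.
    public static string? Read(string json)
    {
        FinalOutputEnvelopeSketch? envelope =
            JsonSerializer.Deserialize<FinalOutputEnvelopeSketch>(json, Options);
        if (envelope is null)
        {
            return null;
        }

        List<string> displayValues = (envelope.ResultOptions ?? new List<ResultOptionSketch>())
            .Select(option => option?.DisplayValue)
            .Where(value => value is not null)
            .Select(value => value!)
            .ToList();

        return displayValues.Count > 0
            ? string.Join("\n", displayValues)
            : envelope.Content;
    }
}
```

Because both methods go through the same record, there are no string property names to keep in step by hand, which is the desynchronisation risk the commit message describes.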

mattbrailsford and others added 2 commits May 5, 2026 10:28

mattbrailsford merged commit 81e5250 into dev May 5, 2026
0 of 4 checks passed

mattbrailsford deleted the feature/fix-prompt-test-grader-output branch May 5, 2026 11:19

mattbrailsford merged 2 commits into dev from feature/fix-prompt-test-grader-output
