Schema-aware property tooling for AI agents (CMS 17.4.0) by mattbrailsford · Pull Request #153 · umbraco/Umbraco.AI

mattbrailsford · 2026-05-06T09:37:49Z

Summary

Bumps the Umbraco CMS dependency floor to 17.4.0-rc2 (.NET + npm) and consumes the new IValueSchemaProvider / IPropertyEditorSchemaService APIs so AI agents have a programmatic source of truth for the value shape every property editor accepts on write.
Threads the resulting JSON Schema through the agent tooling that LLMs use to plan and execute property writes — embedded per-property in get_content_type_schema, available standalone via get_property_value_schema, and inlined into the Entity Context block of the system prompt so the LLM sees the write-shape next to the rendered current value without an extra tool call.
Adds a shallow x-allowedElementTypes enrichment to block-list / block-grid / rich-text-with-blocks schemas (the CMS only emits element-type GUIDs, no aliases) plus a guidance note pointing the LLM at get_content_type_schema for follow-up element-type schemas. This stops the LLM from hallucinating block names.
Tightens the set_value frontend tool description to be explicit about replace semantics, the read-merge-write requirement for collections, and the schema-first workflow for non-string editors.
Hardens the AG-UI streaming layer so failures actually surface: tool exceptions become structured ToolInvocationError results instead of opaque [unknown:ErrorContent], MEAI ErrorContent (provider rate-limits, content filters) is logged and rendered inline, and unhandled AIContent subtypes log at debug level so nothing else can disappear silently.
Adds Information-level diagnostic logging at every hop in the tool-call streaming pipeline (AIToolReorderingChatClient, AGUIStreamingService, AIFrontendToolFunction) so the next time a tool call goes missing the cause is one log filter away.

Out of scope / known limitations

The schema-discovery and write-friendly tooling makes the LLM able to author block-list / block-grid values, but not always willing — Anthropic Sonnet 4.6 in particular sometimes substitutes narration ("I'll add the block...") for a tool_use emission when the structured argument is large. The pipeline routes the call correctly when it does emit one (verified by the new diagnostics on a Sonnet vs. GPT cross-check). This is a tool-granularity issue, not a pipeline bug: set_value asks the model to author the full layout/contentData/expose envelope with cross-referenced GUIDs, which models are genuinely bad at. Follow-up issue: a family of frontend domain-operation tools (add_block, remove_block, set_block_property, etc.) that take simpler inputs and do the envelope assembly client-side. Not in this PR.

Detail by area

Dependency bump

Directory.Packages.props: every Umbraco.Cms.* floor bumped from [17.3.0, 17.999.999) to [17.4.0-rc2, 17.999.999). RC version pinning is required because npm/NuGet semver excludes prereleases from stable lower bounds. Microsoft.Extensions.Caching.Memory and Microsoft.Extensions.Options floors raised to 10.0.6 to satisfy the new CMS transitive requirement.
package.json files (root + 5 client workspaces): @umbraco-cms/backoffice peer bumped to ^17.4.0-rc2. diff bumped to ^9.0.0 to satisfy the new backoffice peer; @types/diff removed (diff@9 ships its own types).
Provider packages.lock.json files updated to reflect transitive bumps.
When 17.4.0 stable is released, the floor should be relaxed back to [17.4.0, 17.999.999) / ^17.4.0.
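
The reason the floor has to name the RC explicitly is that, under semver precedence, a prerelease sorts below its own stable version, so a lower bound of 17.4.0 does not admit 17.4.0-rc2. A minimal TypeScript sketch of that ordering follows. The `parseVersion`, `compareVersions` and `satisfiesFloor` helpers are hypothetical and simplified (they only handle `major.minor.patch[-tag]`); they are not the resolver NuGet or npm actually use.

```typescript
// Simplified semver precedence for "major.minor.patch" with an optional
// "-prerelease" tag. Illustrative only; not NuGet's or npm's resolver.
interface Version {
  major: number;
  minor: number;
  patch: number;
  pre: string | null;
}

function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$/.exec(text);
  if (match === null) {
    throw new Error(`Not a version: ${text}`);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
    pre: match[4] ?? null,
  };
}

// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a: string, b: string): number {
  const left = parseVersion(a);
  const right = parseVersion(b);
  const coreDiff =
    left.major - right.major || left.minor - right.minor || left.patch - right.patch;
  if (coreDiff !== 0) return coreDiff;
  // Same core version: the one WITH a prerelease tag sorts lower.
  if (left.pre === right.pre) return 0;
  if (left.pre === null) return 1;
  if (right.pre === null) return -1;
  // Simplification: prerelease tags are compared as plain strings.
  return left.pre < right.pre ? -1 : 1;
}

function satisfiesFloor(version: string, floor: string): boolean {
  return compareVersions(version, floor) >= 0;
}

const rcAgainstStableFloor = satisfiesFloor("17.4.0-rc2", "17.4.0"); // false
const rcAgainstRcFloor = satisfiesFloor("17.4.0-rc2", "17.4.0-rc2"); // true
const stableAgainstRcFloor = satisfiesFloor("17.4.0", "17.4.0-rc2"); // true
```

The last line is why an RC floor keeps working once stable ships: 17.4.0 still satisfies it, so relaxing the floor afterwards is tidy-up rather than a fix.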

Schema-aware tools (`Umbraco.AI/src/Umbraco.AI.Core/Tools/Umbraco/`)

GetContentTypeSchemaTool accepts both alias and GUID, falls through Content → Element → Media → Member buckets, and survives the cache's not-registered throws. Each property in the response now carries its DataTypeKey (Guid) and ValueSchema (JsonObject) — schemas resolved in parallel via IPropertyEditorSchemaService.GetSchemaAsync.
GetPropertyValueSchemaTool — new tool for focused single-data-type schema lookups. Friendly messages for DataTypeNotFound and SchemaNotSupported operation statuses.
BlockSchemaEnricher — walks any schema the schema service produced and attaches x-allowedElementTypes ({ key, alias } per allowed element type) and x-allowedElementTypesNote next to every contentTypeKey.enum. Shallow / lazy by design — the LLM follows up via get_content_type_schema(elementTypeKey) when it actually needs property schemas, mirroring the CMS's own $ref lazy-resolution philosophy. Defensive deep-clone before mutating prevents JsonNode parent-tracking serialisation failures.
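
The shallow enrichment described above can be sketched roughly as follows. This is a TypeScript illustration of the idea, not the C# BlockSchemaEnricher itself: the `ElementTypeLookup` callback is a hypothetical stand-in for the published-content-type cache, the note text is invented, and placing the two `x-` keys as siblings of `enum` inside the `contentTypeKey` object is an assumption about what "next to" means. The JSON round-trip stands in for the defensive clone.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface ElementTypeInfo {
  key: string;
  alias: string;
}

// Hypothetical lookup standing in for the published-content-type cache.
type ElementTypeLookup = (key: string) => ElementTypeInfo | undefined;

// Invented wording; the real note text is not shown in this PR description.
const NOTE =
  "Call get_content_type_schema with an element type's key to get its property schemas.";

function isObject(node: Json): node is { [key: string]: Json } {
  return typeof node === "object" && node !== null && !Array.isArray(node);
}

function visit(node: Json, lookup: ElementTypeLookup): void {
  if (Array.isArray(node)) {
    for (const child of node) visit(child, lookup);
    return;
  }
  if (!isObject(node)) return;

  const target = node["contentTypeKey"];
  if (target !== undefined && isObject(target)) {
    const enumValues = target["enum"];
    if (Array.isArray(enumValues)) {
      const allowed: Json[] = [];
      for (const key of enumValues) {
        const info = typeof key === "string" ? lookup(key) : undefined;
        // Unknown GUIDs are skipped so one miss does not break its siblings.
        if (info !== undefined) allowed.push({ key: info.key, alias: info.alias });
      }
      target["x-allowedElementTypes"] = allowed;
      target["x-allowedElementTypesNote"] = NOTE;
    }
  }
  for (const name of Object.keys(node)) {
    const child = node[name];
    if (child !== undefined) visit(child, lookup);
  }
}

function enrichBlockSchema(schema: Json, lookup: ElementTypeLookup): Json {
  // Clone first so a shared or cached schema is never mutated in place.
  const clone = JSON.parse(JSON.stringify(schema)) as Json;
  visit(clone, lookup);
  return clone;
}

const heroKey = "11111111-1111-1111-1111-111111111111";
const unknownKey = "22222222-2222-2222-2222-222222222222";
const original: Json = {
  properties: { contentTypeKey: { type: "string", enum: [heroKey, unknownKey] } },
};
const enriched = enrichBlockSchema(original, (key) =>
  key === heroKey ? { key: heroKey, alias: "hero" } : undefined,
);
```

Only `{ key, alias }` is attached, so the payload stays small; property schemas for an element type are fetched on demand through get_content_type_schema.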

Entity Context schema embedding

CmsEntityFormatHelper.FormatCmsEntity now accepts the published-content-type cache + schema service and inlines per-property editor, current value, and input shape (JSON Schema) into the Entity Context system prompt. Document/Media/MemberEntityAdapter inject the services from DI. Existing fall-back rendering preserved when services are absent.

`set_value` description

Drops the "Only supports TextBox and TextArea" gate. Calls out the schema-first workflow for non-string editors and the read-merge-write workflow for collections (block list, block grid, multi-node tree picker, multi-url picker, multi-image media picker, tags). Tighter shape after experimentation showed Sonnet treats long prescriptive text as workflow documentation.
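
The read-merge-write requirement for collections can be illustrated with a small TypeScript sketch. Everything here is hypothetical: `CollectionItem`, `appendItem`, `removeItem` and the in-memory `setValue` only model a property whose write replaces the whole value. Real block-list and block-grid values are envelopes rather than flat arrays, so this covers the principle only.

```typescript
// A keyed collection item; real editors carry more fields than this.
interface CollectionItem {
  key: string;
  [field: string]: unknown;
}

// Merge step for "add": keep every existing item (and its key) and append.
function appendItem(
  current: readonly CollectionItem[],
  added: CollectionItem,
): CollectionItem[] {
  if (current.some((item) => item.key === added.key)) {
    // Same key already present: update in place instead of duplicating.
    return current.map((item) => (item.key === added.key ? added : item));
  }
  return [...current, added];
}

// Merge step for "remove": drop one item, keep the rest untouched.
function removeItem(current: readonly CollectionItem[], key: string): CollectionItem[] {
  return current.filter((item) => item.key !== key);
}

// In-memory stand-in for a property whose write REPLACES the stored value.
let stored: CollectionItem[] = [
  { key: "a", title: "Intro" },
  { key: "b", title: "Gallery" },
];
function setValue(next: CollectionItem[]): void {
  stored = next.slice();
}

const hero: CollectionItem = { key: "hero", title: "Hero" };

// Wrong: setValue([hero]) would wipe "a" and "b".
// Right: read the current value, merge, then write the FULL array.
setValue(appendItem(stored, hero));

const afterRemoval = removeItem(stored, "a");
```

The same pattern applies to remove and reorder requests: the value sent is always the complete collection after the change.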

Failure visibility

AIToolFunction<TArgs>.InvokeCoreAsync wraps tool execution in try/catch that logs (Umbraco.AI.Tools.<toolName>) and returns a ToolInvocationError { success, toolName, errorType, message } record instead of bubbling. Replaces the opaque MEAI default of [unknown:ErrorContent] with no body.
AGUIStreamingService.StreamCoreAsync handles MEAI ErrorContent (logs + emits inline as a [Provider error <code>: <message>] text chunk so the run continues per MEAI docs), explicitly no-ops TextContent to avoid double-emit with update.Text, and logs unhandled content types at debug.
New diagnostic logs at AIToolReorderingChatClient (buffered tool names + registered frontend tools), AGUIStreamingService (every FunctionCallContent received), and AIFrontendToolFunction (when the frontend-tool function fires + a warning when FunctionInvokingChatClient.CurrentContext is null).
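
The exception-to-result wrapping can be sketched in TypeScript as below. This is an illustration only: the real code is C# inside AIToolFunction<TArgs>.InvokeCoreAsync and is asynchronous. The `invokeTool` helper and `Logger` type are hypothetical, and treating an error named "AbortError" as cancellation is an assumed analogue of letting OperationCanceledException propagate. The record fields follow the ToolInvocationError shape listed above.

```typescript
interface ToolInvocationError {
  success: false;
  toolName: string;
  errorType: string;
  message: string;
}

// Hypothetical logger: category first, then the message.
type Logger = (category: string, message: string) => void;

function invokeTool<TArgs, TResult>(
  toolName: string,
  tool: (args: TArgs) => TResult,
  args: TArgs,
  log: Logger,
): TResult | ToolInvocationError {
  try {
    return tool(args);
  } catch (error) {
    const asError = error instanceof Error ? error : new Error(String(error));
    // Cancellation must still propagate instead of becoming a tool result.
    if (asError.name === "AbortError") throw asError;
    // Always log under a per-tool category so the cause is findable later.
    log(
      `Umbraco.AI.Tools.${toolName}`,
      `${asError.name}: ${asError.message} args=${JSON.stringify(args)}`,
    );
    return {
      success: false,
      toolName,
      errorType: asError.name,
      message: asError.message,
    };
  }
}

const logLines: string[] = [];
const collect: Logger = (category, message) => {
  logLines.push(`${category} | ${message}`);
};
const failingTool = (args: { alias: string }): string => {
  throw new TypeError(`No content type with alias '${args.alias}'`);
};
const toolResult = invokeTool(
  "get_content_type_schema",
  failingTool,
  { alias: "missing" },
  collect,
);
```

Returning a record rather than rethrowing means the model receives a payload it can retry from or report, and the log line exists regardless of how the result is consumed.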

Test plan

CI green
Manual smoke on the demo site against a doc type with a block list / block grid property:
- get_content_type_schema returns each property's ValueSchema and embeds x-allowedElementTypes on block properties
- Entity Context system prompt shows editor / current value / input shape per property
- LLM (GPT) authors and stages a block-list write through set_value without manually reading the schema
- Negative path: an editor without IValueSchemaProvider still renders correctly with ValueSchema: null
When stable 17.4.0 ships, relax floors per the comment in Directory.Packages.props

Commits

feat(core,copilot,deps): schema-aware property tooling
fix(core): GUID input + throwing-cache resilience on get_content_type_schema
feat(core): block list/grid x-allowedElementTypes enrichment (initially full)
fix(core): deep-clone inner element-type schemas before attaching
refactor(core): shallow + lazy enrichment (drops recursive property schemas)
fix(copilot): replace + read-merge requirements on set_value
fix(core): surface tool exceptions instead of [unknown:ErrorContent]
fix(agent): surface MEAI ErrorContent in AG-UI stream
fix(copilot): tighten set_value description for Sonnet
chore(agent): diagnostic logging at every hop in the tool-call streaming pipeline

Bumps the CMS dependency floor to 17.4.0-rc2 and uses the new IValueSchemaProvider / IPropertyEditorSchemaService APIs so AI agents can produce correctly-shaped values for any property editor (including block list, block grid, and media picker) without guessing.

Backend (Umbraco.AI core):
- Bump every Umbraco.Cms.* PackageVersion floor from [17.3.0, ...) to [17.4.0-rc2, ...). Bump Microsoft.Extensions.Caching.Memory and Microsoft.Extensions.Options floors to 10.0.6 to match the new CMS transitive requirement.
- Enhance get_content_type_schema: each property now exposes its DataTypeKey (Guid) and a JSON Schema (draft 2020-12) describing the exact value shape the editor accepts on write. Schemas are resolved in parallel via IPropertyEditorSchemaService.GetSchemaAsync.
- Add get_property_value_schema tool for focused, single-data-type schema lookups (useful for nested element-type data types reached through a block editor).
- Embed JSON schemas inline in the Entity Context system prompt via CmsEntityFormatHelper. The Document/Media/Member adapters now inject IPublishedContentTypeCache + IPropertyEditorSchemaService and emit 'editor', 'current value', and 'input shape (JSON Schema)' for every property, so the LLM sees the write-shape next to the read-value and can no longer fall back to its prior knowledge of 'standard' shapes.

Frontend (copilot):
- Bump @umbraco-cms/backoffice peer to ^17.4.0-rc2 across all five client workspaces. Bump diff to ^9.0.0 to satisfy the new backoffice peer; drop the now-redundant @types/diff (diff@9 ships its own types).
- Tighten the set_value tool description: it now leads with a REQUIRED workflow that mandates a schema lookup for any non-string property and warns explicitly that the Entity Context's rendered values are not the input shape.

Provider packages.lock.json files updated to reflect transitive bumps from the CMS bump.

Tests: 741 unit + 25 integration green. Adds coverage for the new schema-embedded responses, the per-property-no-schema fallback, the parallel schema resolution path, the standalone schema-lookup tool, and the Entity Context schema-embedding rendering (including the unknown-content-type fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ache throws

Two bugs surfaced in live use:
1. The Entity Context block surfaces the content type as a GUID, so the LLM naturally passes the GUID into get_content_type_schema. The tool only handled aliases, so it called the cache's Get(itemType, alias) overload with a GUID string and the lookup blew up.
2. IPublishedContentTypeCache.Get throws (rather than returning null) when the alias/key isn't registered for the requested item type. The previous '?? Get(Element) ?? Get(Media)' fall-through chain short-circuited on the first throw and bubbled an unhandled exception out of the tool.

Renames the arg to ContentTypeAliasOrKey, parses GUIDs first and uses the Guid overload when matched, and wraps every cache lookup in try/catch so misses fall through cleanly to a 'not found' result. Description updated to make the dual-input expectation explicit.

Adds regression tests for the GUID path and the throwing-cache fall-through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
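
A minimal TypeScript sketch of the GUID-first lookup with per-bucket try/catch described in this commit. The `ThrowingCache` interface and its `getByKey` / `getByAlias` methods are hypothetical stand-ins for the two Get overloads on the C# cache, and `createCache` is an in-memory fake used only to exercise the fall-through.

```typescript
type ItemType = "Content" | "Element" | "Media" | "Member";

interface ContentType {
  key: string;
  alias: string;
  itemType: ItemType;
}

// Hypothetical cache that, like the one described above, throws on a miss.
interface ThrowingCache {
  getByKey(itemType: ItemType, key: string): ContentType;
  getByAlias(itemType: ItemType, alias: string): ContentType;
}

const GUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const BUCKETS: ItemType[] = ["Content", "Element", "Media", "Member"];

function findContentType(cache: ThrowingCache, aliasOrKey: string): ContentType | null {
  // Decide once whether the input is a GUID, then use the matching overload.
  const isGuid = GUID.test(aliasOrKey);
  for (const itemType of BUCKETS) {
    try {
      return isGuid
        ? cache.getByKey(itemType, aliasOrKey)
        : cache.getByAlias(itemType, aliasOrKey);
    } catch {
      // A miss in this bucket is expected; fall through to the next one.
    }
  }
  return null; // Clean "not found" instead of an unhandled exception.
}

// In-memory fake used only for this example.
function createCache(types: ContentType[]): ThrowingCache {
  function lookup(itemType: ItemType, matches: (t: ContentType) => boolean): ContentType {
    for (const type of types) {
      if (type.itemType === itemType && matches(type)) return type;
    }
    throw new Error(`Not registered for ${itemType}`);
  }
  return {
    getByKey: (itemType, key) =>
      lookup(itemType, (t) => t.key.toLowerCase() === key.toLowerCase()),
    getByAlias: (itemType, alias) => lookup(itemType, (t) => t.alias === alias),
  };
}

const demoCache = createCache([
  { key: "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa", alias: "hero", itemType: "Element" },
  { key: "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb", alias: "image", itemType: "Media" },
]);
const byGuid = findContentType(demoCache, "AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA");
const byAlias = findContentType(demoCache, "image");
const notFound = findContentType(demoCache, "missing");
```

A `??` chain cannot express this fall-through because the first throwing lookup aborts the whole expression, which is the bug the commit fixes.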

The CMS 17.4.0-rc2 block-editor schema only emits element-type GUIDs in contentTypeKey.enum and types values[].value as 'any'. Without aliases or per-element-type property schemas in the payload, the LLM has no source of truth for which blocks are allowed - so when asked to list block types it hallucinates names from training-data priors.

Adds BlockSchemaEnricher: walks any schema returned by IPropertyEditorSchemaService and, next to every contentTypeKey.enum, attaches an x-allowedElementTypes array. Each entry resolves the GUID to its element type via IPublishedContentTypeCache and inlines the alias plus a 'properties' array carrying alias, editorAlias and the recursive valueSchema for each property the element type defines.

Recursion is depth-bounded (default 1) so nested blocks surface their GUID enums but don't fan out further - the LLM can call get_property_value_schema to drill deeper if it needs to. Element-type lookups fall through Element -> Content buckets and survive the cache's not-registered throws cleanly.

Wired into:
- GetContentTypeSchemaTool (per-property valueSchema in the result)
- GetPropertyValueSchemaTool (the focused single-data-type result)
- CmsEntityFormatHelper (the inline schema in the Entity Context block)

Five enricher tests cover: real-shaped block list payload, depth boundary preserves nested GUID enums verbatim, non-block schemas pass through untouched, null input returns null, unknown GUIDs are skipped without breaking siblings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…enrichment tree

get_content_type_schema and get_property_value_schema were returning [serialization error] [unknown:ErrorContent] from the chat pipeline when the target content type contained block list / block grid (and likely rich text in 17.4.0) properties.

Root cause: BlockSchemaEnricher was attaching the JsonObject returned by IPropertyEditorSchemaService.GetValueSchema directly as a child of a freshly constructed JsonObject literal. JsonNode tracks parents and refuses re-attachment, so any shared / cached / re-used sub-tree from the schema service surfaced later as a serialisation failure once MEAI's FunctionInvokingChatClient tried to write the tool result.

Defensive fix: clone the schema-service output through ToJsonString() / JsonNode.Parse before mutating or attaching it. Cloning is cheap (these schemas are tiny) and keeps the enricher independent of the schema service's lifetime semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous enrichment recursively inlined every allowed element type's full property schemas into each contentTypeKey.enum site. That solved the LLM hallucination problem but bloated every payload (particularly the Entity Context block, which is loaded into every chat turn) and duplicated information when a doc type used the same element type across multiple block-list properties.

Switching to the same lean / lazy-resolution pattern the CMS itself adopted (data-type endpoints pass-through, document-type endpoints use external $ref URIs):
- x-allowedElementTypes now carries only { key, alias } per allowed element type. Enough for the LLM to list block types correctly without guessing names.
- A new x-allowedElementTypesNote sibling tells the LLM to call get_content_type_schema with an element type's key when it needs the element type's property schemas to author a block of that type.
- get_content_type_schema already accepts both alias and GUID, so no new tool is needed for the follow-up lookup.

Drops the IPropertyEditorSchemaService argument from BlockSchemaEnricher.Enrich (no longer recursing into property schemas). The three call sites (GetContentTypeSchemaTool, GetPropertyValueSchemaTool, CmsEntityFormatHelper) updated to the two-arg call. Tool descriptions updated to spell out the lookup pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… set_value

Reports of 'add a hero block' wiping existing block list contents: set_value REPLACES the property value (the underlying workspace setValue overwrites), but the tool description didn't say so. The LLM was constructing a single-item array and calling set_value with it, deleting whatever was already there.

Schema fixes can't address this: JSON Schema describes shape, not mutation semantics. Tool description is the right place.

Adds an explicit 'REQUIRED WORKFLOW (collections)' section that:
- States plainly that set_value REPLACES, not appends or merges.
- Lists the user verbs that imply additive intent (add, append, insert, include, also).
- Mandates the read-merge-write pattern: read current value (Entity Context or get_umbraco_content), append/merge in code, send the FULL array, preserving existing keys/contentKeys.
- Calls out the symmetrical 'remove' and 'reorder' cases so the LLM applies the same pattern when the user asks to delete or reorder a single item.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ontent]'

When a tool throws, MEAI's FunctionInvokingChatClient catches the exception, sanitises it into a generic FunctionResultContent (because IncludeDetailedErrors defaults to false), and the AG-UI chat trace renders that as '[unknown:ErrorContent]' with no text. The exception itself never lands anywhere visible by default: not in the trace, not in the chat history, and not in our logs unless the tool's own code happened to log before throwing.

Wraps every typed tool invocation in a try/catch inside AIToolFunction<TArgs>.InvokeCoreAsync that:
- Logs the exception (type, message, args JSON) via an injected ILoggerFactory so the underlying cause always lands in the Umbraco.AI.Tools.<toolName> log category, regardless of which path consumed the result.
- Returns a structured ToolInvocationError record in place of the thrown exception. The chat trace renders it as a regular tool result (success: false, errorType, message), and the LLM gets a diagnosable payload it can either retry from or report to the user.
- Lets OperationCanceledException propagate so cancellation still works cleanly through MEAI.

AIFunctionFactory now accepts an optional ILoggerFactory and threads it into typed tool functions. DI already exposes ILoggerFactory, so the existing AddSingleton<IAIFunctionFactory, AIFunctionFactory> registration picks it up without further changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…silently

Reports of '[unknown:ErrorContent]' appearing in the chat trace AFTER successful tool calls, with nothing in the logs and no visible cause.

Root cause: AGUIStreamingService.StreamCoreAsync only matched FunctionCallContent and FunctionResultContent in its switch over update.Contents. Any other AIContent subtype - including MEAI's ErrorContent, which providers stream for non-fatal mid-response errors (content filters, transient model errors, function-invocation sanitisation when IncludeDetailedErrors is false) - fell through unhandled. The chat trace UI surfaced it from a different path and labelled it '[unknown:ErrorContent]' with no body, while our backend logged nothing.

- Adds an explicit ErrorContent branch that logs the Code, Message and Details at Error level under the AGUIStreamingService category, then emits an inline TextChunk like '[Provider error CODE: message]' so the user sees what happened. ErrorContent is documented as non-fatal, so the run continues; we deliberately do NOT emit a RunErrorEvent (which would abort the run on the frontend).
- Adds an explicit TextContent branch that no-ops to avoid double-emit with the existing update.Text aggregation.
- Adds a default branch that logs unhandled AIContent subtypes at debug level - future-proofs against MEAI adding new content types we don't yet recognise.

Test added: ErrorContent appearing between text updates produces three text chunks (the surrounding text plus the marker carrying the error code and message), no RunErrorEvent, RunFinishedEvent emitted normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
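
The branch structure this commit adds can be modelled in TypeScript as below. This is a simplified sketch, not the C# AGUIStreamingService: the `Content` union, `Update`, `StreamEvent` and `toEvents` are hypothetical names, and the separate `text` field on `Update` is an assumed analogue of the update.Text aggregation.

```typescript
// Hypothetical model of the content subtypes a streaming update can carry.
type Content =
  | { kind: "text"; text: string }
  | { kind: "functionCall"; name: string }
  | { kind: "functionResult"; name: string }
  | { kind: "error"; code: string; message: string }
  | { kind: "other"; typeName: string };

interface Update {
  text: string; // aggregated text of the update
  contents: Content[];
}

type StreamEvent =
  | { type: "TextChunk"; text: string }
  | { type: "ToolCall"; name: string }
  | { type: "ToolResult"; name: string };

type Log = (level: "error" | "debug", message: string) => void;

function toEvents(update: Update, log: Log): StreamEvent[] {
  const events: StreamEvent[] = [];
  if (update.text.length > 0) {
    events.push({ type: "TextChunk", text: update.text });
  }
  for (const content of update.contents) {
    switch (content.kind) {
      case "functionCall":
        events.push({ type: "ToolCall", name: content.name });
        break;
      case "functionResult":
        events.push({ type: "ToolResult", name: content.name });
        break;
      case "error":
        // Non-fatal: log it and show it inline, but do not end the run.
        log("error", `Provider error ${content.code}: ${content.message}`);
        events.push({
          type: "TextChunk",
          text: `[Provider error ${content.code}: ${content.message}]`,
        });
        break;
      case "text":
        // Already emitted through update.text above; avoid a double emit.
        break;
      default:
        log("debug", `Unhandled content type: ${content.typeName}`);
    }
  }
  return events;
}

const logged: string[] = [];
const record: Log = (level, message) => {
  logged.push(`${level}: ${message}`);
};
const updates: Update[] = [
  { text: "Hello", contents: [{ kind: "text", text: "Hello" }] },
  { text: "", contents: [{ kind: "error", code: "rate_limit", message: "slow down" }] },
  { text: "world", contents: [{ kind: "text", text: "world" }] },
];
const streamed: StreamEvent[] = [];
for (const update of updates) {
  for (const event of toEvents(update, record)) streamed.push(event);
}
```

The sample input mirrors the test described in the commit: an error between two text updates yields three text chunks and no run-ending event.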

…tead of narrating

After switching from GPT to Sonnet 4.6, the model began acknowledging set_value in its replies ('I will set the value to ...', 'I have set the value to ...') without ever emitting a tool_use block. The same description that nudged GPT-4o into invoking the tool pushed Sonnet into describing the procedure.

Anthropic's tool-use guidance recommends concise, function-signature-style descriptions. The previous text had grown to ~1500 chars of 'REQUIRED WORKFLOW' / 'MUST' prose with multi-step procedures spelled out in narrative form, which Sonnet treats as a process spec to comply with rather than a callable function.

Trimmed to a short summary plus two compact constraint notes (replace-not-merge for collections, schema-first for non-string editors). Behaviour-preserving: every constraint that was in the long version is still represented, just expressed once and tersely. GPT will still pick it up, Sonnet now sees a function rather than a workflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…reaming pipeline

Reports of '17-second gap then RUN_FINISHED with no TOOL_CALL_CHUNK events' on Sonnet 4.6: tool_use is being generated by the model but something between the provider and the AG-UI emitter is eating it. Three plausible suspects (AIToolReorderingChatClient buffering, FunctionInvokingChatClient consumption, AIFrontendToolFunction termination) and no way to tell which today.

Adds an Information-level log line at each hop so the next reproduction narrows the breakage in one run:
- AIToolReorderingChatClient logs the buffered tool names and the registered frontend tool names when it has any tool-call updates to forward at end-of-stream.
- AGUIStreamingService logs every FunctionCallContent it receives, including which run, which tool, and whether that tool is registered as a frontend tool.
- AIFrontendToolFunction logs at Information when InvokeCoreAsync fires and at Warning if FunctionInvokingChatClient.CurrentContext is null (which would mean the Terminate=true signal can't propagate).

Decision tree for the next reproduction:
- All three log → tool call did surface, frontend rendering is the problem.
- Reordering+streaming log, AIFrontendToolFunction does NOT → MEAI bypasses the frontend function (e.g. tool_choice routing, tool name mismatch, or a wiring issue).
- Reordering logs, AGUIStreamingService does NOT → FunctionInvokingChatClient is consuming the call without forwarding.
- Nothing logs → the provider really isn't surfacing FunctionCallContent at all (Sonnet still narrating, or an Anthropic-provider streaming quirk).

ILoggerFactory threaded as an optional last constructor arg through AIAgentService, AGUIToolConverter, AIToolReorderingChatMiddleware and AIFrontendToolFunction so existing tests keep working unchanged. DI already exposes ILoggerFactory, so the live wiring picks it up automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
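
The decision tree in this commit message can be written down as a small lookup, which may be handy when triaging a log capture. The TypeScript below is a sketch: `HopLogs` and `diagnose` are hypothetical names, and each flag simply records whether the corresponding hop produced its log line for the run in question.

```typescript
// Whether each hop logged for the run under investigation.
interface HopLogs {
  reordering: boolean; // AIToolReorderingChatClient
  streaming: boolean; // AGUIStreamingService
  frontendFunction: boolean; // AIFrontendToolFunction
}

function diagnose(logs: HopLogs): string {
  if (logs.reordering && logs.streaming && logs.frontendFunction) {
    return "Tool call surfaced; frontend rendering is the problem.";
  }
  if (logs.reordering && logs.streaming) {
    return "MEAI bypasses the frontend function (tool_choice routing, tool name mismatch, or wiring).";
  }
  if (logs.reordering && !logs.frontendFunction) {
    return "FunctionInvokingChatClient is consuming the call without forwarding.";
  }
  if (!logs.reordering && !logs.streaming && !logs.frontendFunction) {
    return "The provider is not surfacing FunctionCallContent at all.";
  }
  // Combinations the commit message does not cover.
  return "Unexpected combination; not covered by the decision tree.";
}

const allThree = diagnose({ reordering: true, streaming: true, frontendFunction: true });
const nothing = diagnose({ reordering: false, streaming: false, frontendFunction: false });
```

Combinations outside the four listed cases return a distinct message rather than being forced into one of them.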

Resolves Directory.Packages.props (taking dev's 10.5.2 M.E.AI and 10.0.7 floors) and bumps Umbraco.Cms.* floor from 17.4.0-rc2 to 17.4.0 now that the stable release is on NuGet. Updates @umbraco-cms/backoffice peer to ^17.4.0 across all 5 client package.json files and regenerates lockfiles.

Brings in the CMS 17.4.0 stable bump from #153 and rolls 17.4.0-rc2 references in this branch forward to 17.4.0. Updates the EF Core comment to reflect that 17.4.0 stable still bundles EF Core 10.0.6 (same as rc2).

mattbrailsford and others added 10 commits May 5, 2026 15:45

This was referenced May 7, 2026

feat: Property value operation handlers + generic AI tools #160

Merged

feat: AI save and save_and_publish tools #166

Merged

mattbrailsford merged commit 50aedf7 into dev May 14, 2026
13 of 23 checks passed

mattbrailsford deleted the feature/cms-1740-schema-aware-tools branch May 14, 2026 07:58

mattbrailsford merged 11 commits into dev from feature/cms-1740-schema-aware-tools
