Schema-aware property tooling for AI agents (CMS 17.4.0) #153
Merged
Bumps the CMS dependency floor to 17.4.0-rc2 and uses the new IValueSchemaProvider / IPropertyEditorSchemaService APIs so AI agents can produce correctly-shaped values for any property editor (including block list, block grid, and media picker) without guessing.
Backend (Umbraco.AI core):
- Bump every Umbraco.Cms.* PackageVersion floor from [17.3.0, ...) to [17.4.0-rc2, ...). Bump Microsoft.Extensions.Caching.Memory and Microsoft.Extensions.Options floors to 10.0.6 to match the new CMS transitive requirement.
- Enhance get_content_type_schema: each property now exposes its DataTypeKey (Guid) and a JSON Schema (draft 2020-12) describing the exact value shape the editor accepts on write. Schemas are resolved in parallel via IPropertyEditorSchemaService.GetSchemaAsync.
- Add get_property_value_schema tool for focused, single-data-type schema lookups (useful for nested element-type data types reached through a block editor).
- Embed JSON schemas inline in the Entity Context system prompt via CmsEntityFormatHelper. The Document/Media/Member adapters now inject IPublishedContentTypeCache + IPropertyEditorSchemaService and emit 'editor', 'current value', and 'input shape (JSON Schema)' for every property, so the LLM sees the write-shape next to the read-value and can no longer fall back to its prior knowledge of 'standard' shapes.
Frontend (copilot):
- Bump @umbraco-cms/backoffice peer to ^17.4.0-rc2 across all five client workspaces. Bump diff to ^9.0.0 to satisfy the new backoffice peer; drop the now-redundant @types/diff (diff@9 ships its own types).
- Tighten the set_value tool description: it now leads with a REQUIRED workflow that mandates a schema lookup for any non-string property and warns explicitly that the Entity Context's rendered values are not the input shape.
Provider packages.lock.json files updated to reflect transitive bumps from the CMS bump.
Tests: 741 unit + 25 integration green. Adds coverage for the new schema-embedded responses, the per-property-no-schema fallback, the parallel schema resolution path, the standalone schema-lookup tool, and the Entity Context schema-embedding rendering (including the unknown-content-type fallback).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
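The parallel schema resolution and the per-property no-schema fallback described above can be sketched as follows. This is a TypeScript analogue only: the actual tool is C# and calls IPropertyEditorSchemaService, and every name here (PropertyInfo, SchemaLookup, resolveSchemas) is hypothetical. Treating a failed lookup as "no schema" is also an assumption, not something the commit message states.

```typescript
// Hypothetical shapes for illustration; not the real Umbraco.AI types.
type JsonSchema = Record<string, unknown>;

interface PropertyInfo {
  alias: string;
  dataTypeKey: string; // a Guid rendered as a string
}

interface PropertyWithSchema extends PropertyInfo {
  valueSchema: JsonSchema | null; // null when no schema is available
}

// Stand-in for the schema service: resolves a schema, resolves null, or rejects.
type SchemaLookup = (dataTypeKey: string) => Promise<JsonSchema | null>;

async function resolveSchemas(
  properties: PropertyInfo[],
  lookup: SchemaLookup,
): Promise<PropertyWithSchema[]> {
  // Start every lookup before awaiting any of them, so they run concurrently.
  const pending = properties.map(async (prop) => {
    try {
      return { ...prop, valueSchema: await lookup(prop.dataTypeKey) };
    } catch {
      // Assumption: one failing editor should not sink the whole response.
      return { ...prop, valueSchema: null };
    }
  });
  return Promise.all(pending);
}
```

Promise.all preserves input order, so the result lines up with the property list even though the lookups finish in any order.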
…ache throws
Two bugs surfaced in live use:
1. The Entity Context block surfaces the content type as a GUID, so the LLM naturally passes the GUID into get_content_type_schema. The tool only handled aliases, so it called the cache's Get(itemType, alias) overload with a GUID string and the lookup blew up.
2. IPublishedContentTypeCache.Get throws (rather than returning null) when the alias/key isn't registered for the requested item type. The previous '?? Get(Element) ?? Get(Media)' fall-through chain short-circuited on the first throw and bubbled an unhandled exception out of the tool.
Renames the arg to ContentTypeAliasOrKey, parses GUIDs first and uses the Guid overload when matched, and wraps every cache lookup in try/catch so misses fall through cleanly to a 'not found' result. Description updated to make the dual-input expectation explicit.
Adds regression tests for the GUID path and the throwing-cache fall-through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
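A minimal sketch of the GUID-first, throw-tolerant lookup described in this fix, written in TypeScript for illustration. The TypeCache interface is a hypothetical stand-in for the published content type cache (it throws on a miss, as the commit describes), and the bucket order follows the Content, Element, Media, Member order given in the PR description.

```typescript
type ItemType = "Content" | "Element" | "Media" | "Member";

interface ContentType {
  key: string;
  alias: string;
  itemType: ItemType;
}

// Hypothetical stand-in for the cache: both lookups THROW on a miss.
interface TypeCache {
  getByAlias(itemType: ItemType, alias: string): ContentType;
  getByKey(itemType: ItemType, key: string): ContentType;
}

const GUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function findContentType(cache: TypeCache, aliasOrKey: string): ContentType | null {
  const input = aliasOrKey.trim();
  // Decide once whether the input is a GUID, then use the matching lookup.
  const isGuid = GUID_PATTERN.test(input);
  const buckets: ItemType[] = ["Content", "Element", "Media", "Member"];
  for (const bucket of buckets) {
    try {
      return isGuid ? cache.getByKey(bucket, input) : cache.getByAlias(bucket, input);
    } catch {
      // A miss in this bucket is expected; fall through to the next one.
    }
  }
  return null; // the caller turns this into a 'not found' result
}
```

Wrapping each lookup separately is the point: a single try/catch around a chained expression would stop at the first throwing bucket, which is the bug the commit fixes.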
The CMS 17.4.0-rc2 block-editor schema only emits element-type GUIDs in contentTypeKey.enum and types values[].value as 'any'. Without aliases or per-element-type property schemas in the payload, the LLM has no source of truth for which blocks are allowed - so when asked to list block types it hallucinates names from training-data priors.
Adds BlockSchemaEnricher: walks any schema returned by IPropertyEditorSchemaService and, next to every contentTypeKey.enum, attaches an x-allowedElementTypes array. Each entry resolves the GUID to its element type via IPublishedContentTypeCache and inlines the alias plus a 'properties' array carrying alias, editorAlias and the recursive valueSchema for each property the element type defines. Recursion is depth-bounded (default 1) so nested blocks surface their GUID enums but don't fan out further - the LLM can call get_property_value_schema to drill deeper if it needs to. Element-type lookups fall through Element -> Content buckets and survive the cache's not-registered throws cleanly.
Wired into:
- GetContentTypeSchemaTool (per-property valueSchema in the result)
- GetPropertyValueSchemaTool (the focused single-data-type result)
- CmsEntityFormatHelper (the inline schema in the Entity Context block)
Five enricher tests cover: real-shaped block list payload, depth boundary preserves nested GUID enums verbatim, non-block schemas pass through untouched, null input returns null, unknown GUIDs are skipped without breaking siblings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…enrichment tree
get_content_type_schema and get_property_value_schema were returning [serialization error] [unknown:ErrorContent] from the chat pipeline when the target content type contained block list / block grid (and likely rich text in 17.4.0) properties.
Root cause: BlockSchemaEnricher was attaching the JsonObject returned by IPropertyEditorSchemaService.GetValueSchema directly as a child of a freshly constructed JsonObject literal. JsonNode tracks parents and refuses re-attachment, so any shared / cached / re-used sub-tree from the schema service surfaced later as a serialisation failure once MEAI's FunctionInvokingChatClient tried to write the tool result.
Defensive fix: clone the schema-service output through ToJsonString() / JsonNode.Parse before mutating or attaching it. Cloning is cheap (these schemas are tiny) and keeps the enricher independent of the schema service's lifetime semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
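The clone-before-mutate step can be illustrated with a serialise/parse round trip in TypeScript. This is an analogue under stated assumptions: JavaScript objects do not track parents the way JsonNode does, so the hazard shown here is the related one of mutating a shared, cached tree. The cachedSchema, getSchema and enrichedCopy names are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

// Round-trip through text to get a tree that shares nothing with the input.
function cloneJson<T extends Json>(node: T): T {
  return JSON.parse(JSON.stringify(node)) as T;
}

// Hypothetical cache standing in for a schema service that reuses instances.
const cachedSchema: { [k: string]: Json } = {
  type: "array",
  items: { type: "string" },
};

function getSchema(): { [k: string]: Json } {
  return cachedSchema; // same instance on every call
}

function enrichedCopy(note: string): { [k: string]: Json } {
  const copy = cloneJson(getSchema());
  copy["x-note"] = note; // mutates only the private copy
  return copy;
}
```

The round trip costs one serialisation per schema, which matches the commit's reasoning that cloning is cheap because the schemas are small.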
The previous enrichment recursively inlined every allowed element type's
full property schemas into each contentTypeKey.enum site. That solved
the LLM hallucination problem but bloated every payload — particularly
the Entity Context block, which is loaded into every chat turn — and
duplicated information when a doc type used the same element type
across multiple block-list properties.
Switching to the same lean / lazy-resolution pattern the CMS itself
adopted (data-type endpoints pass-through, document-type endpoints use
external $ref URIs):
- x-allowedElementTypes now carries only { key, alias } per allowed
element type. Enough for the LLM to list block types correctly
without guessing names.
- A new x-allowedElementTypesNote sibling tells the LLM to call
get_content_type_schema with an element type's key when it needs the
element type's property schemas to author a block of that type.
- get_content_type_schema already accepts both alias and GUID, so no
new tool is needed for the follow-up lookup.
Drops the IPropertyEditorSchemaService argument from
BlockSchemaEnricher.Enrich (no longer recursing into property schemas).
The three call sites (GetContentTypeSchemaTool,
GetPropertyValueSchemaTool, CmsEntityFormatHelper) updated to the
two-arg call. Tool descriptions updated to spell out the lookup
pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
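A minimal TypeScript sketch of the lean enrichment described above, for illustration only. The real BlockSchemaEnricher is C#; the AliasResolver callback, the note wording, and the choice to attach the extras inside the contentTypeKey object (as siblings of enum) are assumptions.

```typescript
type JsonValue = null | boolean | number | string | JsonValue[] | JsonObject;
interface JsonObject {
  [key: string]: JsonValue;
}

// Hypothetical note text; the real wording is not given in the source.
const LOOKUP_NOTE =
  "Call get_content_type_schema with an element type's key to get its property schemas.";

// Hypothetical resolver: element-type GUID to alias, or undefined when unknown.
type AliasResolver = (key: string) => string | undefined;

function isObject(v: JsonValue | undefined): v is JsonObject {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function enrich(schema: JsonObject | null, resolveAlias: AliasResolver): JsonObject | null {
  if (schema === null) return null;
  // Work on a private copy so the caller's tree is never mutated.
  const copy = JSON.parse(JSON.stringify(schema)) as JsonObject;
  visit(copy, resolveAlias);
  return copy;
}

function visit(node: JsonValue | undefined, resolveAlias: AliasResolver): void {
  if (Array.isArray(node)) {
    node.forEach((child) => visit(child, resolveAlias));
    return;
  }
  if (!isObject(node)) return;
  const keySchema = node["contentTypeKey"];
  if (isObject(keySchema)) {
    const enumValues = keySchema["enum"];
    if (Array.isArray(enumValues)) {
      const allowed: JsonObject[] = [];
      for (const key of enumValues) {
        if (typeof key !== "string") continue;
        const alias = resolveAlias(key);
        // Unknown GUIDs are skipped without affecting their siblings.
        if (alias !== undefined) allowed.push({ key, alias });
      }
      keySchema["x-allowedElementTypes"] = allowed;
      keySchema["x-allowedElementTypesNote"] = LOOKUP_NOTE;
    }
  }
  for (const k of Object.keys(node)) visit(node[k], resolveAlias);
}
```

Only { key, alias } pairs are attached, so the payload grows by a few dozen bytes per allowed element type rather than by a full property schema.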
… set_value
Reports of 'add a hero block' wiping existing block list contents: set_value REPLACES the property value (the underlying workspace setValue overwrites), but the tool description didn't say so. The LLM was constructing a single-item array and calling set_value with it, deleting whatever was already there.
Schema fixes can't address this: JSON Schema describes shape, not mutation semantics. Tool description is the right place.
Adds an explicit 'REQUIRED WORKFLOW (collections)' section that:
- States plainly that set_value REPLACES, not appends or merges.
- Lists the user verbs that imply additive intent (add, append, insert, include, also).
- Mandates the read-merge-write pattern: read current value (Entity Context or get_umbraco_content), append/merge in code, send the FULL array, preserving existing keys/contentKeys.
- Calls out the symmetrical 'remove' and 'reorder' cases so the LLM applies the same pattern when the user asks to delete or reorder a single item.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
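The read-merge-write pattern mandated above can be sketched like this. It is a simplified illustration: a real block list value is an envelope (layout, contentData, expose) rather than a flat array, and BlockItem, ReadValue and SetValue are hypothetical stand-ins for the current-value read and the set_value call.

```typescript
interface BlockItem {
  key: string;
  contentTypeKey: string;
  values: Record<string, unknown>;
}

// Hypothetical stand-ins for reading the current value and for set_value.
type ReadValue = (alias: string) => BlockItem[];
type SetValue = (alias: string, value: BlockItem[]) => void;

function addBlock(
  alias: string,
  item: BlockItem,
  read: ReadValue,
  setValue: SetValue,
): BlockItem[] {
  const current = read(alias); // 1. read the existing value
  const merged = [...current, item]; // 2. merge, leaving existing keys untouched
  setValue(alias, merged); // 3. write the FULL array, because set_value replaces
  return merged;
}

function removeBlock(
  alias: string,
  key: string,
  read: ReadValue,
  setValue: SetValue,
): BlockItem[] {
  // Removal is the same pattern: read, filter, then send everything that remains.
  const remaining = read(alias).filter((b) => b.key !== key);
  setValue(alias, remaining);
  return remaining;
}
```

Calling setValue with only the new item would reproduce the reported bug, since the write overwrites whatever was stored.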
…ontent]'
When a tool throws, MEAI's FunctionInvokingChatClient catches the exception, sanitises it into a generic FunctionResultContent (because IncludeDetailedErrors defaults to false), and the AG-UI chat trace renders that as '[unknown:ErrorContent]' with no text. The exception itself never lands anywhere visible by default: not in the trace, not in the chat history, and not in our logs unless the tool's own code happened to log before throwing.
Wraps every typed tool invocation in a try/catch inside AIToolFunction<TArgs>.InvokeCoreAsync that:
- Logs the exception (type, message, args JSON) via an injected ILoggerFactory so the underlying cause always lands in the Umbraco.AI.Tools.<toolName> log category, regardless of which path consumed the result.
- Returns a structured ToolInvocationError record in place of the thrown exception. The chat trace renders it as a regular tool result (success: false, errorType, message), and the LLM gets a diagnosable payload it can either retry from or report to the user.
- Lets OperationCanceledException propagate so cancellation still works cleanly through MEAI.
AIFunctionFactory now accepts an optional ILoggerFactory and threads it into typed tool functions. DI already exposes ILoggerFactory, so the existing AddSingleton<IAIFunctionFactory, AIFunctionFactory> registration picks it up without further changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
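The wrapper's three behaviours (log, return a structured error, let cancellation through) can be sketched in TypeScript as below. This is an analogue, not the C# implementation: invokeTool and Logger are hypothetical, and an error named "AbortError" stands in for OperationCanceledException.

```typescript
interface ToolInvocationError {
  success: false;
  toolName: string;
  errorType: string;
  message: string;
}

type Logger = (line: string) => void;

// Assumption: cancellation is signalled by an Error whose name is "AbortError".
function isCancellation(err: unknown): boolean {
  return err instanceof Error && err.name === "AbortError";
}

async function invokeTool<TArgs, TResult>(
  toolName: string,
  args: TArgs,
  run: (args: TArgs) => Promise<TResult>,
  log: Logger,
): Promise<TResult | ToolInvocationError> {
  try {
    return await run(args);
  } catch (err) {
    // Cancellation must keep propagating so the caller can stop the run.
    if (isCancellation(err)) throw err;
    const errorType = err instanceof Error ? err.name : typeof err;
    const message = err instanceof Error ? err.message : String(err);
    // Log under a per-tool category so the cause is always findable.
    log(`[Umbraco.AI.Tools.${toolName}] ${errorType}: ${message} args=${JSON.stringify(args)}`);
    return { success: false, toolName, errorType, message };
  }
}
```

Returning the error as data rather than rethrowing is what lets the chat trace render it as an ordinary tool result that the model can react to.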
…silently
Reports of '[unknown:ErrorContent]' appearing in the chat trace AFTER successful tool calls, with nothing in the logs and no visible cause.
Root cause: AGUIStreamingService.StreamCoreAsync only matched FunctionCallContent and FunctionResultContent in its switch over update.Contents. Any other AIContent subtype - including MEAI's ErrorContent, which providers stream for non-fatal mid-response errors (content filters, transient model errors, function-invocation sanitisation when IncludeDetailedErrors is false) - fell through unhandled. The chat trace UI surfaced it from a different path and labelled it '[unknown:ErrorContent]' with no body, while our backend logged nothing.
- Adds an explicit ErrorContent branch that logs the Code, Message and Details at Error level under the AGUIStreamingService category, then emits an inline TextChunk like '[Provider error CODE: message]' so the user sees what happened. ErrorContent is documented as non-fatal, so the run continues; we deliberately do NOT emit a RunErrorEvent (which would abort the run on the frontend).
- Adds an explicit TextContent branch that no-ops to avoid double-emit with the existing update.Text aggregation.
- Adds a default branch that logs unhandled AIContent subtypes at debug level - future-proofs against MEAI adding new content types we don't yet recognise.
Test added: ErrorContent appearing between text updates produces three text chunks (the surrounding text plus the marker carrying the error code and message), no RunErrorEvent, RunFinishedEvent emitted normally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
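The branch structure described above can be sketched as a switch over a discriminated union. This is a simplified TypeScript analogue: the Content, StreamUpdate and OutEvent shapes and the event type names are hypothetical, not the AG-UI or MEAI types.

```typescript
type Content =
  | { kind: "text"; text: string }
  | { kind: "functionCall"; callId: string; toolName: string }
  | { kind: "functionResult"; callId: string }
  | { kind: "error"; code: string; message: string }
  | { kind: "other"; typeName: string };

interface StreamUpdate {
  text: string; // aggregated text for this update
  contents: Content[];
}

type OutEvent =
  | { type: "TEXT_CHUNK"; text: string }
  | { type: "TOOL_CALL"; callId: string; toolName: string }
  | { type: "TOOL_RESULT"; callId: string };

type LevelLogger = (level: "error" | "debug", line: string) => void;

function handleUpdate(update: StreamUpdate, log: LevelLogger): OutEvent[] {
  const out: OutEvent[] = [];
  if (update.text.length > 0) out.push({ type: "TEXT_CHUNK", text: update.text });
  for (const content of update.contents) {
    switch (content.kind) {
      case "functionCall":
        out.push({ type: "TOOL_CALL", callId: content.callId, toolName: content.toolName });
        break;
      case "functionResult":
        out.push({ type: "TOOL_RESULT", callId: content.callId });
        break;
      case "error":
        // Non-fatal: log it and show it inline, but do not abort the run.
        log("error", `${content.code}: ${content.message}`);
        out.push({ type: "TEXT_CHUNK", text: `[Provider error ${content.code}: ${content.message}]` });
        break;
      case "text":
        break; // already emitted through update.text; avoid a double emit
      default:
        log("debug", `unhandled content: ${content.typeName}`);
    }
  }
  return out;
}
```

Feeding it a text update, an error update and another text update yields three text chunks and no abort event, mirroring the test the commit adds.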
…tead of narrating
After switching from GPT to Sonnet 4.6, the model began acknowledging
set_value in its replies ('I will set the value to ...', 'I have set
the value to ...') without ever emitting a tool_use block. The same
description that nudged GPT-4o into invoking the tool pushed Sonnet
into describing the procedure.
Anthropic's tool-use guidance recommends concise, function-signature-
style descriptions. The previous text had grown to ~1500 chars of
'REQUIRED WORKFLOW' / 'MUST' prose with multi-step procedures spelled
out in narrative form, which Sonnet treats as a process spec to comply
with rather than a callable function.
Trimmed to a short summary plus two compact constraint notes (replace-
not-merge for collections, schema-first for non-string editors).
Behaviour-preserving: every constraint that was in the long version is
still represented, just expressed once and tersely. GPT will still pick
it up, Sonnet now sees a function rather than a workflow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…reaming pipeline
Reports of '17-second gap then RUN_FINISHED with no TOOL_CALL_CHUNK events' on Sonnet 4.6: tool_use is being generated by the model but something between the provider and the AG-UI emitter is eating it. Three plausible suspects (AIToolReorderingChatClient buffering, FunctionInvokingChatClient consumption, AIFrontendToolFunction termination) and no way to tell which today.
Adds an Information-level log line at each hop so the next reproduction narrows the breakage in one run:
- AIToolReorderingChatClient logs the buffered tool names and the registered frontend tool names when it has any tool-call updates to forward at end-of-stream.
- AGUIStreamingService logs every FunctionCallContent it receives, including which run, which tool, and whether that tool is registered as a frontend tool.
- AIFrontendToolFunction logs at Information when InvokeCoreAsync fires and at Warning if FunctionInvokingChatClient.CurrentContext is null (which would mean the Terminate=true signal can't propagate).
Decision tree for the next reproduction:
- All three log → tool call did surface, frontend rendering is the problem.
- Reordering+streaming log, AIFrontendToolFunction does NOT → MEAI bypasses the frontend function (e.g. tool_choice routing, tool name mismatch, or a wiring issue).
- Reordering logs, AGUIStreamingService does NOT → FunctionInvokingChatClient is consuming the call without forwarding.
- Nothing logs → the provider really isn't surfacing FunctionCallContent at all (Sonnet still narrating, or an Anthropic-provider streaming quirk).
ILoggerFactory threaded as an optional last constructor arg through AIAgentService, AGUIToolConverter, AIToolReorderingChatMiddleware and AIFrontendToolFunction so existing tests keep working unchanged. DI already exposes ILoggerFactory, so the live wiring picks it up automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
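The decision tree above can be written down as a small lookup over which hops logged. The HopLogs shape and diagnose function are hypothetical helpers for reading the logs, not part of the PR; combinations the commit message does not cover are reported as inconclusive rather than guessed.

```typescript
interface HopLogs {
  reordering: boolean; // AIToolReorderingChatClient logged
  streaming: boolean; // AGUIStreamingService logged
  frontendFunction: boolean; // AIFrontendToolFunction logged
}

function diagnose(logs: HopLogs): string {
  const { reordering, streaming, frontendFunction } = logs;
  if (reordering && streaming && frontendFunction) {
    return "tool call surfaced; frontend rendering is the problem";
  }
  if (reordering && streaming && !frontendFunction) {
    return "MEAI bypasses the frontend function";
  }
  if (reordering && !streaming && !frontendFunction) {
    return "FunctionInvokingChatClient is consuming the call without forwarding";
  }
  if (!reordering && !streaming && !frontendFunction) {
    return "provider is not surfacing FunctionCallContent";
  }
  // Any other combination is outside the documented tree.
  return "inconclusive";
}
```

For example, diagnose({ reordering: true, streaming: false, frontendFunction: false }) points at FunctionInvokingChatClient.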
This was referenced May 7, 2026
Resolves Directory.Packages.props (taking dev's 10.5.2 M.E.AI and 10.0.7 floors) and bumps Umbraco.Cms.* floor from 17.4.0-rc2 to 17.4.0 now that the stable release is on NuGet. Updates @umbraco-cms/backoffice peer to ^17.4.0 across all 5 client package.json files and regenerates lockfiles.
mattbrailsford added a commit that referenced this pull request May 14, 2026
Brings in the CMS 17.4.0 stable bump from #153 and rolls 17.4.0-rc2 references in this branch forward to 17.4.0. Updates the EF Core comment to reflect that 17.4.0 stable still bundles EF Core 10.0.6 (same as rc2).
Summary
- Uses the new IValueSchemaProvider / IPropertyEditorSchemaService APIs so AI agents have a programmatic source of truth for the value shape every property editor accepts on write.
- Schemas are exposed per property by get_content_type_schema, available standalone via get_property_value_schema, and inlined into the Entity Context block of the system prompt so the LLM sees the write-shape next to the rendered current value without an extra tool call.
- Adds x-allowedElementTypes enrichment to block-list / block-grid / rich-text-with-blocks schemas (the CMS only emits element-type GUIDs, no aliases) plus a guidance note pointing the LLM at get_content_type_schema for follow-up element-type schemas. Closes the LLM hallucination of block names.
- Tightens the set_value frontend tool description to be explicit about replace semantics, the read-merge-write requirement for collections, and the schema-first workflow for non-string editors.
- Tool exceptions surface as structured ToolInvocationError results instead of opaque [unknown:ErrorContent], MEAI ErrorContent (provider rate-limits, content filters) is logged and rendered inline, and unhandled AIContent subtypes log at debug level so nothing else can disappear silently.
- Adds diagnostic logging at each hop (AIToolReorderingChatClient, AGUIStreamingService, AIFrontendToolFunction) so the next time a tool call goes missing the cause is one log filter away.
Out of scope / known limitations
The schema-discovery and write-friendly tooling makes the LLM able to author block-list / block-grid values, but not always willing: Anthropic Sonnet 4.6 in particular sometimes substitutes narration ("I'll add the block...") for a tool_use emission when the structured argument is large. The pipeline routes the call correctly when it does emit one (verified by the new diagnostics on a Sonnet vs. GPT cross-check). This is a tool-granularity issue, not a pipeline bug: set_value asks the model to author the full layout/contentData/expose envelope with cross-referenced GUIDs, which models are genuinely bad at. Follow-up issue: a family of frontend domain-operation tools (add_block, remove_block, set_block_property, etc.) that take simpler inputs and do the envelope assembly client-side. Not in this PR.
Detail by area
Dependency bump
- Directory.Packages.props: every Umbraco.Cms.* floor bumped from [17.3.0, 17.999.999) to [17.4.0-rc2, 17.999.999). RC version pinning is required because npm/NuGet semver excludes prereleases from stable lower bounds. Microsoft.Extensions.Caching.Memory and Microsoft.Extensions.Options floors raised to 10.0.6 to satisfy the new CMS transitive requirement.
- package.json files (root + 5 client workspaces): @umbraco-cms/backoffice peer bumped to ^17.4.0-rc2. diff bumped to ^9.0.0 to satisfy the new backoffice peer; @types/diff removed (diff@9 ships its own types).
- Provider packages.lock.json files updated to reflect transitive bumps.
- After the stable release, floors moved to [17.4.0, 17.999.999) / ^17.4.0.
Schema-aware tools (Umbraco.AI/src/Umbraco.AI.Core/Tools/Umbraco/)
- GetContentTypeSchemaTool accepts both alias and GUID, falls through Content → Element → Media → Member buckets, and survives the cache's not-registered throws. Each property in the response now carries its DataTypeKey (Guid) and ValueSchema (JsonObject); schemas are resolved in parallel via IPropertyEditorSchemaService.GetSchemaAsync.
- GetPropertyValueSchemaTool: new tool for focused single-data-type schema lookups. Friendly messages for DataTypeNotFound and SchemaNotSupported operation statuses.
- BlockSchemaEnricher: walks any schema the schema service produced and attaches x-allowedElementTypes ({ key, alias } per allowed element type) and x-allowedElementTypesNote next to every contentTypeKey.enum. Shallow / lazy by design: the LLM follows up via get_content_type_schema(elementTypeKey) when it actually needs property schemas, mirroring the CMS's own $ref lazy-resolution philosophy. Defensive deep-clone before mutating prevents JsonNode parent-tracking serialisation failures.
Entity Context schema embedding
- CmsEntityFormatHelper.FormatCmsEntity now accepts the published-content-type cache + schema service and inlines per-property editor, current value, and input shape (JSON Schema) into the Entity Context system prompt.
- Document/Media/MemberEntityAdapter inject the services from DI. Existing fall-back rendering preserved when services are absent.
set_value description
Failure visibility
- AIToolFunction<TArgs>.InvokeCoreAsync wraps tool execution in try/catch that logs (Umbraco.AI.Tools.<toolName>) and returns a ToolInvocationError { success, toolName, errorType, message } record instead of bubbling. Replaces the opaque MEAI default of [unknown:ErrorContent] with no body.
- AGUIStreamingService.StreamCoreAsync handles MEAI ErrorContent (logs + emits inline as a [Provider error <code>: <message>] text chunk so the run continues per MEAI docs), explicitly no-ops TextContent to avoid double-emit with update.Text, and logs unhandled content types at debug.
- Information-level logging in AIToolReorderingChatClient (buffered tool names + registered frontend tools), AGUIStreamingService (every FunctionCallContent received), and AIFrontendToolFunction (when the frontend-tool function fires + a warning when FunctionInvokingChatClient.CurrentContext is null).
Test plan
- get_content_type_schema returns each property's ValueSchema and embeds x-allowedElementTypes on block properties
- … editor / current value / input shape per property
- … set_value without manually reading the schema
- … IValueSchemaProvider still renders correctly with ValueSchema: null
- … Directory.Packages.props
Commits
- feat(core,copilot,deps): schema-aware property tooling
- fix(core): GUID input + throwing-cache resilience on get_content_type_schema
- feat(core): block list/grid x-allowedElementTypes enrichment (initially full)
- fix(core): deep-clone inner element-type schemas before attaching
- refactor(core): shallow + lazy enrichment (drops recursive property schemas)
- fix(copilot): replace + read-merge requirements on set_value
- fix(core): surface tool exceptions instead of [unknown:ErrorContent]
- fix(agent): surface MEAI ErrorContent in AG-UI stream
- fix(copilot): tighten set_value description for Sonnet
- chore(agent): diagnostic logging at every hop in the tool-call streaming pipeline