feat(agentic-ai): custom llm layer (HACKDAYS) #7151
Draft
maff wants to merge 81 commits into
Conversation
maff
commented
May 7, 2026
Comment on lines
+20
to
+21
@Nullable Integer maxOutputTokens,
@Nullable Double temperature,
Member
Author
We need to carefully assess which configuration options we expose here as native ones, since they vary between providers and models (e.g. newer reasoning models no longer support temperature).
High level assessment: https://claude.ai/share/ec1531e3-3e0d-4ba9-8cad-713bb9e970b8
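One way to handle the provider variance raised in this comment is to keep every option nullable and drop whatever the resolved model does not support. The sketch below is only an illustration under assumptions: the `ChatOptions` and `ModelCapabilities` names appear in the PR description, but the fields used here (`supportsTemperature`) and the request parameter names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ChatOptionsSketch {

  /** Hypothetical options record; null means "not set, use the provider default". */
  record ChatOptions(Integer maxOutputTokens, Double temperature) {}

  /** Hypothetical capability flag; the real ModelCapabilities shape is not shown here. */
  record ModelCapabilities(boolean supportsTemperature) {}

  /** Builds request parameters, skipping options that are unset or unsupported. */
  static Map<String, Object> toRequestParams(ChatOptions options, ModelCapabilities caps) {
    Map<String, Object> params = new LinkedHashMap<>();
    if (options.maxOutputTokens() != null) {
      params.put("max_output_tokens", options.maxOutputTokens());
    }
    // temperature is only forwarded when the model is known to accept it
    if (options.temperature() != null && caps.supportsTemperature()) {
      params.put("temperature", options.temperature());
    }
    return params;
  }
}
```

Silently dropping an unsupported option is a design choice; rejecting the configuration at validation time would be the stricter alternative.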
maff
commented
May 7, 2026
 * /v1/responses}) with the corresponding request/response schema.
 */
@SlowTest
public class OpenAiResponsesApiAiAgentJobWorkerTests extends BaseWireFormatAiAgentJobWorkerTest {
Member
Author
Isn't this OpenAiCompletions...?
ChatStreamListener NOOP = event -> {};

void onEvent(ChatModelEvent event);

documentToContentConverter.convert(documentContent.document());
case ObjectContent objectContent ->
    new dev.langchain4j.data.message.TextContent(convertToString(objectContent.content()));
case ReasoningContent ignored ->
maff
commented
May 7, 2026
@JsonInclude(JsonInclude.Include.NON_EMPTY) List<Content> content,
@JsonInclude(JsonInclude.Include.NON_EMPTY) List<ToolCall> toolCalls,
@Nullable String modelId,
@Nullable String apiId,
Member
Author
Probably we should lift these metadata fields above the existing content/toolCalls.
messageId or apiMessageId maybe?
d24acee to
937884d
Compare
…messages
Documents in tool call results were serialized as base64 blobs in the tool result text, which models cannot interpret. This change extracts documents from tool call results into a follow-up UserMessage with proper DocumentContent blocks, making them visible to the LLM via the existing document-to-content conversion path (ImageContent, PdfFileContent, TextContent).
- Add ToolCallResultDocumentExtractor to recursively find Document instances in tool call result content trees
- Create a document UserMessage (tagged with toolCallDocuments metadata) after the ToolCallResultMessage, before event messages
- Extract documents from event messages into DocumentContent blocks
- Serialize documents in tool results as document references (default DocumentSerializer) instead of base64 content blocks
- Remove stale DocumentToContentSerializer/Module/ResponseModel infrastructure
- Simplify ToolCallConverterImpl (remove ContentConverter dependency)
- Simplify ContentConverterImpl (remove contentObjectMapper copy)
- Update e2e tests to assert new message structure
ADR: docs/adr/003-document-handling-in-tool-call-results.md
Use Jackson's BeanDescription to walk arbitrary object properties when extracting documents from tool call results. This ensures documents embedded inside records or POJOs (e.g. MCP content types, gateway integrations) are discovered without coupling to specific return types.
…l results
Use the raw content map from process variables in gateway tool call results instead of typed domain objects (McpContent, A2aSendMessageResult). This preserves document references as deserialized by the engine, ensuring documents are properly extracted into user messages. The ToolCallResultDocumentExtractor walks Map/Collection/Document trees; typed records are invisible to it. Gateway handlers now pass raw content through and only use typed conversion for metadata extraction. Simplifies the document extractor back to Map/Collection/Document walking since raw maps are sufficient. Adds e2e test verifying MCP image tool results produce document user messages. Updates ADR 003.
…lers
Add job worker MCP e2e test verifying image documents in MCP tool call results are extracted into user messages (mirrors outbound connector test). Add A2A unit test verifying raw content with Document instances is preserved through transformation and discoverable by the extractor.
Handle Object[] in the recursive document extraction switch so documents inside arrays are not silently skipped.
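The recursive walk these commits describe (Map, Collection, Object[] and Document nodes) might look roughly like the sketch below. It is not the PR's ToolCallResultDocumentExtractor: `Document` is a stand-in record, the class and method names are illustrative, and an instanceof chain is used instead of the switch the commit mentions so that the sketch compiles on JDK 16 and later.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

final class DocumentWalkerSketch {

  /** Stand-in for the connector's Document type; only identity matters here. */
  record Document(String id) {}

  /** Collects every Document reachable from the given content tree, in encounter order. */
  static List<Document> extractDocuments(Object content) {
    List<Document> found = new ArrayList<>();
    walk(content, found);
    return found;
  }

  private static void walk(Object node, List<Document> found) {
    if (node instanceof Document document) {
      found.add(document);
    } else if (node instanceof Map<?, ?> map) {
      map.values().forEach(value -> walk(value, found));
    } else if (node instanceof Collection<?> collection) {
      collection.forEach(item -> walk(item, found));
    } else if (node instanceof Object[] array) {
      // without this branch, documents inside arrays would be skipped silently
      for (Object item : array) {
        walk(item, found);
      }
    }
    // any other value (strings, numbers, null, typed records) is a leaf and is ignored
  }
}
```

Typed records fall through to the leaf case, which matches the note above that they are invisible to a Map/Collection/Document walker.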
Replace plain-text separators with structured XML tags for document correlation in the synthetic UserMessage. Each document gets a <document> tag with tool name, call ID, short document ID, and filename attributes. Event documents also receive XML labels for consistency. A preamble "Documents extracted from tool call results:" introduces the block.
Log a warning when getRawMcpContent finds a non-empty map without the expected 'content' key, which would indicate a response shape change and silent document loss.
Synthetic UserMessages carrying tool call documents (metadata flag toolCallDocuments=true) no longer count toward the context window limit. They are also evicted together with their associated ToolCallResultMessage to prevent orphaned document messages.
The test was passing a typed McpClientCallToolResult as content, but the engine delivers a raw Map. Convert to Map to match real behavior and assert on the extracted content list.
ADR-003 is reserved for the conversation storage SPI introduced in #6784.
Update document message assertions to match the new format: preamble, XML <document> tags with tool/call-id/document-short-id/filename attributes, and interleaved DocumentContent blocks. Extract documentShortId helper into shared AiAgentTestFixtures.
Promote the metadata key constant to a public field on UserMessage so it can be referenced without duplication from AgentMessagesHandlerImpl, MessageWindowRuntimeMemory, and tests.
- Remove unused DownloadFileToolResult records from E2E test base classes (dead code since introduction)
- Rename extractDocuments(Object) to extractDocumentsFromContent(Object) to avoid confusing overload with extractDocuments(List<ToolCallResult>)
Use XML escaping for all attribute values in the document XML tag to prevent malformed tags from filenames or tool names containing special characters (&, <, >, ", '). Add Javadoc for documentShortId and documentXmlTag. Add edge case tests for escaping.
Track effective message count as an int instead of recounting via stream on every eviction iteration.
- add JSON-aware ToolExecutionResultMessageEqualsPredicate for order-independent comparison of tool result JSON in E2E tests
- lowercase all inline comments introduced in this PR
- replace string concatenation with String.formatted() in XML tag assertions
- inline E2E message order comments as suffixes on assertions
- assert exact base64 data in MCP image E2E tests
- restore chat request count assertions in tool calling E2E tests
- strengthen MCP handler test to assert on content values, not just list size
- add Javadoc example for extractDocumentShortId showing the document reference JSON format
- update document user message format to XML tags with correlation attributes (tool, call-id, document-short-id, filename)
- document the message window memory behavior for document messages
- document event document labeling consistency
- add future optimization note for UserMessage rebuild strategy
- update walker to include Object[] support
- fix provider list (remove specific provider references)
- delete implementation plan file
Only decrement the effective message count when the evicted message is not a tool-call document message, preventing under-counting if an orphaned document message ends up at the eviction position.
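The window behaviour described across these commits (document messages excluded from the limit, evicted together with their tool result, and an int counter that is decremented only for counted messages) can be sketched as follows. This is a simplified illustration, not the PR's MessageWindowRuntimeMemory: `Message`, `Kind` and `evictToLimit` are hypothetical names, and the sketch ignores any other pairing rules the real memory may apply.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class MessageWindowSketch {

  enum Kind { USER, ASSISTANT, TOOL_RESULT, TOOL_DOCUMENTS }

  /** Hypothetical message type; TOOL_DOCUMENTS marks the synthetic document message. */
  record Message(Kind kind, String text) {
    boolean countsTowardLimit() {
      return kind != Kind.TOOL_DOCUMENTS;
    }
  }

  /** Evicts the oldest messages until at most maxMessages counted messages remain. */
  static List<Message> evictToLimit(List<Message> messages, int maxMessages) {
    Deque<Message> window = new ArrayDeque<>(messages);

    // count once up front instead of re-streaming the list on every iteration
    int effectiveCount = 0;
    for (Message message : messages) {
      if (message.countsTowardLimit()) {
        effectiveCount++;
      }
    }

    while (effectiveCount > maxMessages && !window.isEmpty()) {
      Message evicted = window.removeFirst();
      // an orphaned document message never counted, so it must not be decremented
      if (evicted.countsTowardLimit()) {
        effectiveCount--;
      }
      // a tool result takes its trailing document message with it
      if (evicted.kind() == Kind.TOOL_RESULT
          && !window.isEmpty()
          && window.peekFirst().kind() == Kind.TOOL_DOCUMENTS) {
        window.removeFirst();
      }
    }
    return List.copyOf(window);
  }
}
```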
Move XML tag building, attribute escaping, and document short ID extraction into a dedicated DocumentXmlTag record with factory methods and toXml() serialization. Tests moved to DocumentXmlTagTest.
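A record along the lines of the DocumentXmlTag described here could look like the sketch below. The attribute names (tool, call-id, document-short-id, filename) and the five escaped characters come from the commit messages; the self-closing tag form, the component names and the `escape` helper are assumptions, since the actual record is not shown on this page.

```java
final class DocumentXmlTagSketch {

  /** Hypothetical shape of the tag record; component names are illustrative. */
  record DocumentXmlTag(String tool, String callId, String documentShortId, String filename) {

    /** Serializes the tag with all attribute values XML-escaped. */
    String toXml() {
      return "<document tool=\"%s\" call-id=\"%s\" document-short-id=\"%s\" filename=\"%s\" />"
          .formatted(escape(tool), escape(callId), escape(documentShortId), escape(filename));
    }

    /** Escapes the five characters that can break an XML attribute value. */
    static String escape(String value) {
      // '&' is replaced first so the entities produced below are not escaped again
      return value
          .replace("&", "&amp;")
          .replace("<", "&lt;")
          .replace(">", "&gt;")
          .replace("\"", "&quot;")
          .replace("'", "&apos;");
    }
  }
}
```

Escaping `&` before the other characters matters: doing it last would turn an already produced `&lt;` into `&amp;lt;`.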
…ll results
Add a manual CPT test that validates real LLM providers can receive and reason about PDF documents extracted from tool call results via the synthetic UserMessage with XML correlation tags.
The test covers three scenarios with increasing complexity:
- Single document returned from a tool call
- Multiple documents returned as a list
- Documents embedded in a nested Map structure
A BPMN process downloads PDFs from WireMock, then uses FEEL script tasks inside an ad-hoc subprocess to assemble tool results of varying shapes. The AI Agent connector processes these with a real LLM, and CPT judge assertions verify the model correctly extracted facts from the PDFs.
Provider configs (toggled via env vars): OpenAI, Anthropic, AWS Bedrock, and OpenAI-compatible (Docker Model Runner). The test is @Disabled by default and not part of CI.
Add Ollama provider configs (qwen3.5, llama3.1:8b) with OLLAMA_URL env var. Add .disabled() toggle on ProviderConfig and a modelFilters allowlist for quickly focusing test runs on specific models without commenting code.
Add backend selection dropdowns to the AI Agent element template for all three configurable providers:
- Anthropic: Direct / AWS Bedrock / Google Vertex AI / Azure AI Foundry
- Google GenAI: Vertex AI / Developer API (Google AI Studio)
- OpenAI: OpenAI / Azure AI Foundry / Custom
Version bumped from 11 to 12 in @ElementTemplate; v11 templates backed up to element-templates/versioned/. README updated to reflect v12 as current.
Phase E3+E4 (ToolCallResultStrategy + native multimodal emission) and Phase F (provider configuration restructure + Jackson migration deserializer + element template v11→v12) have both landed on this branch since the last plan revision. Update the Actual state section, mark the affected per-phase headers as done, refresh the Critical files table, and note the one wiring deviation in Phase F (type-level @JsonDeserialize instead of a Jackson Module, because the connector runtime's @ConnectorsObjectMapper does not do Module bean discovery). ADR status gains a prototype note pointing at this PR while staying Proposed; the design is being re-landed on main as smaller follow-up issues, and Status will move to Implemented once Phase G + Phase H land there.
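The Phase F migration deserializer mentioned above rewrites saved provider configurations at read time; the PR description gives Claude on Bedrock as the example and warns that this rewrite must not run before native Anthropic cloud backends exist. A minimal sketch of such a gated rewrite on plain maps follows. The map keys ("type", "model", "backend"), the model id prefix check and the boolean gate are all assumptions made for illustration; the real ProviderConfigurationDeserializer works on Jackson types and is not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProviderMigrationSketch {

  /**
   * Rewrites a legacy Claude-on-Bedrock config into the restructured shape,
   * but only when the gate says native Anthropic cloud backends are available.
   * Keys and values are illustrative, not the connector's real property names.
   */
  static Map<String, Object> migrate(
      Map<String, Object> legacy, boolean anthropicCloudBackendsAvailable) {
    Object type = legacy.get("type");
    Object model = legacy.get("model");
    boolean claudeOnBedrock =
        "bedrock".equals(type) && model instanceof String id && id.startsWith("anthropic.");

    if (!claudeOnBedrock || !anthropicCloudBackendsAvailable) {
      // left untouched so the existing path keeps serving this configuration
      return legacy;
    }

    Map<String, Object> migrated = new LinkedHashMap<>(legacy);
    migrated.put("type", "anthropic");
    migrated.put("backend", "bedrock");
    return migrated;
  }
}
```

Gating a single migration row this way is one of the two options the description lists; splitting the restructure into separate issues is the other.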
This was referenced May 13, 2026
76f3677 to
16ab8d3
Compare
16ab8d3 to
51e4d96
Compare
Description
This branch explores replacing the LangChain4j-backed AiFrameworkAdapter in agentic-ai with a native provider layer built on the official vendor Java SDKs (Anthropic, OpenAI, AWS Bedrock, Google GenAI), and adds a capability matrix + tool-call-result routing strategy so the agent can use provider-native multimodal features instead of the lowest-common-denominator surface today's framework offers.
What lives in this PR
- connectors/agentic-ai/docs/adr/005-replace-langchain4j-framework.md. Captures context, decision drivers, options considered, final architecture (ChatClient facade → ChatModelApi per wire protocol → vendor SDKs), domain-model extensions, capability matrix design, tool-call-result routing strategy, reasoning + caching design, provider configuration restructure with Jackson migration, and phased migration plan.
- connectors/agentic-ai/docs/adr-005-implementation-plan.md. Breaks Phase 1 of the ADR into eight green-build checkpoints (Phases A–H), with file lists, test plans, and verification commands per phase. Marks each phase as done or open against this branch.
- main.
What the branch actually implements
Following the phase numbering in the implementation plan:
- ReasoningContent, AssistantMessage.{modelId, messageId, stopReason, usage}, ToolCallResult.contentBlocks, TokenUsage.{cacheRead, cacheCreation, reasoning}Tokens, drop rawChatResponse())
- framework/api/ (ChatClient, ChatModelApi, ChatModelApiFactory, ChatModelApiRegistry, ChatRequest, ChatResponse, ChatOptions, ModelCapabilities, ChatStreamListener, ChatModelEvent); BaseAgentRequestHandler routes through ChatClient; LangChain4j wired as bridge ChatModelApi for all provider discriminators
- AnthropicMessagesChatModelApi (DIRECT backend)
- OpenAiChatCompletionsChatModelApi
- OpenAiResponsesChatModelApi + apiFamily switch on openai discriminator + element template bump 10 → 11
- PropertySource; bundled model-capabilities.yaml; 4-step resolution chain (override → exact id/alias → glob pattern → conservative defaults)
- ModelCapabilities resolved at factory time via ModelCapabilitiesResolver
- ToolCallResultStrategy (single-pass per-document routing) + native multimodal emission (images + PDFs) in all three native impls; AgentMessagesHandlerImpl no longer extracts documents; DocumentModality MIME mapping (Modality.PDF → Modality.DOCUMENT)
- ProviderConfiguration restructure by wire format: Anthropic gains sealed AnthropicAuthentication + AnthropicBackend { DIRECT, BEDROCK, VERTEX, FOUNDRY }; OpenAI consolidates the three legacy configs (openai / azureOpenAi / openaiCompatible) into one with OpenAiBackend { OPENAI, FOUNDRY, CUSTOM }; Google renamed googleVertexAi → googleGenAi with GoogleBackend { DEVELOPER_API, VERTEX }; Bedrock validates non-Anthropic model IDs. ProviderConfigurationDeserializer wired via type-level @JsonDeserialize (not via Jackson Module, because the connector runtime's @ConnectorsObjectMapper doesn't do module discovery). Element template bumped 11 → 12 with backend dropdowns + conditional auth groups
- AnthropicMessagesChatModelApiFactory rejects any backend != DIRECT; native OpenAiChatModelApiFactory throws a "Phase G Azure SDK integration" placeholder for FOUNDRY + clientCredentials; bedrock and googleGenAi discriminators still served via L4J bridge
- camunda.connector.agenticai.framework=langchain4j)
Deferred out of Phase E and tracked separately: reasoning (signed thinking blocks, encrypted reasoning items), prompt caching (cache_control, cachePoint, prompt_cache_key), audio/video modalities, JDK java.net.http.HttpClient adapter (replacing OkHttp transport).
Built on top of #6999 (agentic-ai-document-tool-call-results). Phase E reuses the document walker, per-handler extraction hooks, XML tag generator, and synthetic-message window-count handling from that PR unchanged.
Known sequencing gotcha for re-landing on main
Phase F's migration deserializer rewrites saved {bedrock, anthropic.claude-…} configurations to {anthropic, backend: bedrock}. The native AnthropicMessagesChatModelApiFactory rejects any backend != DIRECT until Phase G's Anthropic cloud backends ship. Landing F before G on main would therefore break existing customers running Claude on Bedrock via the framework. When breaking F into a follow-up issue, either split it (OpenAI/Google restructure first, Anthropic backend discriminator paired with the Anthropic-cloud-backends issue) or gate the bedrock→anthropic-backend migration row until G is in place.
How we plan to land this
This PR will not be merged. Each phase above (and each deferred topic) is being split into its own issue under a parent tracking issue. Issues will reference the ADR + implementation plan as the design source of truth, and each one is sized to fit a normal review cycle.
The branch stays available as a reference implementation while the follow-up issues are picked up.
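For readers picking up the follow-up issues, the 4-step capability resolution chain listed under "What the branch actually implements" (override → exact id/alias → glob pattern → conservative defaults) can be sketched as below. This is not the PR's ModelCapabilitiesResolver: the capability fields, the three lookup maps and the `*`-only glob support are assumptions chosen to keep the example small.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class CapabilityResolutionSketch {

  /** Hypothetical capability set; the real ModelCapabilities has more dimensions. */
  record ModelCapabilities(boolean images, boolean pdfs) {
    static final ModelCapabilities CONSERVATIVE = new ModelCapabilities(false, false);
  }

  final Map<String, ModelCapabilities> overrides = new LinkedHashMap<>();
  final Map<String, ModelCapabilities> exact = new LinkedHashMap<>(); // ids and aliases
  final Map<String, ModelCapabilities> patterns = new LinkedHashMap<>(); // first match wins

  ModelCapabilities resolve(String modelId) {
    // step 1: explicit override
    if (overrides.containsKey(modelId)) {
      return overrides.get(modelId);
    }
    // step 2: exact id or alias
    if (exact.containsKey(modelId)) {
      return exact.get(modelId);
    }
    // step 3: glob patterns in declaration order
    for (Map.Entry<String, ModelCapabilities> entry : patterns.entrySet()) {
      if (globMatches(entry.getKey(), modelId)) {
        return entry.getValue();
      }
    }
    // step 4: nothing known about the model, assume the least
    return ModelCapabilities.CONSERVATIVE;
  }

  /** Matches a glob that supports only '*' by translating it to a quoted regex. */
  static boolean globMatches(String glob, String value) {
    String regex =
        Arrays.stream(glob.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
    return value.matches(regex);
  }
}
```

Falling back to conservative defaults means an unknown model loses native multimodal features rather than receiving content it may reject.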
Related issues
Parent tracking epic: #7211 — AI Agent: Own the LLM layer. Re-lands the design in three phases (#7212, #7213, #7214) with 19 implementation sub-issues.
Checklist