feat(agentic-ai): custom llm layer (HACKDAYS) by maff · Pull Request #7151 · camunda/connectors

maff · 2026-05-07T07:53:51Z

Description

HACKDAYS project — not for merge in its current shape. The deliverables we want to keep from this branch are the ADR + implementation plan; the code is a working reference implementation we will rebuild in small, reviewable chunks via the follow-up issues linked below.

This branch explores replacing the LangChain4j-backed AiFrameworkAdapter in agentic-ai with a native provider layer built on the official vendor Java SDKs (Anthropic, OpenAI, AWS Bedrock, Google GenAI), and adds a capability matrix + tool-call-result routing strategy so the agent can use provider-native multimodal features instead of the lowest-common-denominator surface today's framework offers.

What lives in this PR

ADR-005 — connectors/agentic-ai/docs/adr/005-replace-langchain4j-framework.md. Captures context, decision drivers, options considered, final architecture (ChatClient facade → ChatModelApi per wire protocol → vendor SDKs), domain-model extensions, capability matrix design, tool-call-result routing strategy, reasoning + caching design, provider configuration restructure with Jackson migration, and phased migration plan.
Implementation plan — connectors/agentic-ai/docs/adr-005-implementation-plan.md. Breaks Phase 1 of the ADR into eight green-build checkpoints (Phases A–H), with file lists, test plans, and verification commands per phase. Marks each phase as done or open against this branch.
Reference implementation of the phases listed below. Useful as a worked example when building out the follow-up tickets, but not the form we want to land on main.

What the branch actually implements

Following the phase numbering in the implementation plan:

| Phase | Scope | State on this branch |
| --- | --- | --- |
| 0 | Domain model extensions (`ReasoningContent`, `AssistantMessage.{modelId, messageId, stopReason, usage}`, `ToolCallResult.contentBlocks`, `TokenUsage.{cacheRead, cacheCreation, reasoning}Tokens`, drop `rawChatResponse()`) | done |
| A | New SPI under `framework/api/` (`ChatClient`, `ChatModelApi`, `ChatModelApiFactory`, `ChatModelApiRegistry`, `ChatRequest`, `ChatResponse`, `ChatOptions`, `ModelCapabilities`, `ChatStreamListener`, `ChatModelEvent`); `BaseAgentRequestHandler` routes through `ChatClient`; LangChain4j wired as bridge `ChatModelApi` for all provider discriminators | done |
| B | Native `AnthropicMessagesChatModelApi` (DIRECT backend) | done |
| C | Native `OpenAiChatCompletionsChatModelApi` | done |
| D | Native `OpenAiResponsesChatModelApi` + `apiFamily` switch on `openai` discriminator + element template bump 10 → 11 | done |
| E1 | Capability matrix loaded as Spring Boot config via low-precedence `PropertySource`; bundled `model-capabilities.yaml`; 4-step resolution chain (override → exact id/alias → glob pattern → conservative defaults) | done |
| E2 | Each native impl consumes a `ModelCapabilities` resolved at factory time via `ModelCapabilitiesResolver` | done |
| E3+E4 | `ToolCallResultStrategy` (single-pass per-document routing) + native multimodal emission (images + PDFs) in all three native impls; `AgentMessagesHandlerImpl` no longer extracts documents; `DocumentModality` MIME mapping (`Modality.PDF` → `Modality.DOCUMENT`) | done |
| F | `ProviderConfiguration` restructure by wire format: Anthropic gains sealed `AnthropicAuthentication` + `AnthropicBackend { DIRECT, BEDROCK, VERTEX, FOUNDRY }`; OpenAI consolidates the three legacy configs (`openai`/`azureOpenAi`/`openaiCompatible`) into one with `OpenAiBackend { OPENAI, FOUNDRY, CUSTOM }`; Google renamed `googleVertexAi` → `googleGenAi` with `GoogleBackend { DEVELOPER_API, VERTEX }`; Bedrock validates non-Anthropic model IDs. `ProviderConfigurationDeserializer` wired via type-level `@JsonDeserialize` (not via Jackson `Module`, because the connector runtime's `@ConnectorsObjectMapper` doesn't do module discovery). Element template bumped 11 → 12 with backend dropdowns + conditional auth groups | done |
| G | Native impls for Anthropic cloud backends (BEDROCK / VERTEX / FOUNDRY), OpenAI FOUNDRY with clientCredentials auth, native Google GenAI, native Bedrock-Converse (non-Anthropic models) | not done: native `AnthropicMessagesChatModelApiFactory` rejects any backend != DIRECT; native `OpenAiChatModelApiFactory` throws a "Phase G Azure SDK integration" placeholder for FOUNDRY + clientCredentials; `bedrock` and `googleGenAi` discriminators still served via L4J bridge |
| H | Demote L4J bridge to opt-in-only; ADR status → Implemented | not done: L4J is currently default-on (`camunda.connector.agenticai.framework=langchain4j`) |
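
The 4-step resolution chain from Phase E1 could be sketched roughly as follows. This is a minimal illustration, not the branch's code: `CapabilitiesSketch`, `MatrixEntrySketch`, `CapabilityResolverSketch`, the capability flags, and the glob handling are all hypothetical names and simplifications chosen for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical, simplified stand-in for ModelCapabilities (flag names are assumptions). */
record CapabilitiesSketch(boolean images, boolean pdfs, boolean temperature) {
  /** Conservative defaults: text only, no optional features. */
  static final CapabilitiesSketch CONSERVATIVE = new CapabilitiesSketch(false, false, false);
}

/** One matrix entry: a model id, its aliases, and the capabilities they share. */
record MatrixEntrySketch(String id, List<String> aliases, CapabilitiesSketch capabilities) {}

final class CapabilityResolverSketch {
  private final Map<String, CapabilitiesSketch> overrides;
  private final List<MatrixEntrySketch> entries;
  // pass an ordered map (e.g. LinkedHashMap) if pattern precedence matters
  private final Map<String, CapabilitiesSketch> globs;

  CapabilityResolverSketch(
      Map<String, CapabilitiesSketch> overrides,
      List<MatrixEntrySketch> entries,
      Map<String, CapabilitiesSketch> globs) {
    this.overrides = overrides;
    this.entries = entries;
    this.globs = globs;
  }

  CapabilitiesSketch resolve(String modelId) {
    // step 1: explicit override
    CapabilitiesSketch override = overrides.get(modelId);
    if (override != null) {
      return override;
    }
    // step 2: exact id or alias
    for (MatrixEntrySketch entry : entries) {
      if (entry.id().equals(modelId) || entry.aliases().contains(modelId)) {
        return entry.capabilities();
      }
    }
    // step 3: glob pattern, where '*' matches any run of characters
    for (Map.Entry<String, CapabilitiesSketch> glob : globs.entrySet()) {
      if (globToRegex(glob.getKey()).matcher(modelId).matches()) {
        return glob.getValue();
      }
    }
    // step 4: conservative defaults
    return CapabilitiesSketch.CONSERVATIVE;
  }

  /** Converts a glob such as "example-*" into a regex, quoting everything except '*'. */
  static Pattern globToRegex(String glob) {
    String[] parts = glob.split("\\*", -1);
    StringBuilder regex = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) {
        regex.append(".*");
      }
      regex.append(Pattern.quote(parts[i]));
    }
    return Pattern.compile(regex.toString());
  }
}
```

The ordering is the point of the sketch: an unknown model id falls through every step and ends up with the conservative defaults instead of failing.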

Deferred out of Phase E and tracked separately: reasoning (signed thinking blocks, encrypted reasoning items), prompt caching (cache_control, cachePoint, prompt_cache_key), audio/video modalities, JDK java.net.http.HttpClient adapter (replacing OkHttp transport).

Built on top of #6999 (agentic-ai-document-tool-call-results) — Phase E reuses the document walker, per-handler extraction hooks, XML tag generator, and synthetic-message window-count handling from that PR unchanged.

Known sequencing gotcha for re-landing on main

Phase F's migration deserializer rewrites saved {bedrock, anthropic.claude-…} configurations to {anthropic, backend: bedrock}. The native AnthropicMessagesChatModelApiFactory rejects any backend != DIRECT until Phase G's Anthropic cloud backends ship. Landing F before G on main would therefore break existing customers running Claude on Bedrock via the framework. When breaking F into a follow-up issue, either split it (OpenAI/Google restructure first, Anthropic backend discriminator paired with the Anthropic-cloud-backends issue) or gate the bedrock→anthropic-backend migration row until G is in place.
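
One way to picture the "gate the migration row" option is the sketch below. It works on plain maps instead of the real Jackson deserializer, and the property names (`type`, `model`, `backend`), the `anthropic.` prefix check, and the `anthropicCloudBackendsAvailable` flag are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProviderMigrationSketch {

  /**
   * Rewrites a saved Claude-on-Bedrock configuration to the Anthropic provider with a
   * bedrock backend, but only once native Anthropic cloud backends are available.
   * Until then the saved configuration is returned untouched.
   */
  static Map<String, Object> migrate(
      Map<String, Object> saved, boolean anthropicCloudBackendsAvailable) {
    Object type = saved.get("type");
    Object model = saved.get("model");
    // simplification: treat every model id starting with "anthropic." as Claude on Bedrock
    boolean claudeOnBedrock =
        "bedrock".equals(type) && model instanceof String id && id.startsWith("anthropic.");
    if (!claudeOnBedrock || !anthropicCloudBackendsAvailable) {
      return saved; // gated: keep the legacy shape
    }
    Map<String, Object> migrated = new LinkedHashMap<>(saved);
    migrated.put("type", "anthropic");
    migrated.put("backend", "bedrock");
    return migrated;
  }
}
```

With the flag off, existing Bedrock configurations keep their legacy shape and would continue to be served by whatever handles the `bedrock` discriminator; flipping the flag is then tied to Phase G.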

How we plan to land this

This PR will not be merged. Each phase above (and each deferred topic) is being split into its own issue under a parent tracking issue. Issues will reference the ADR + implementation plan as the design source of truth, and each one is sized to fit a normal review cycle.

The branch stays available as a reference implementation while the follow-up issues are picked up.

Related issues

Parent tracking epic: #7211 — AI Agent: Own the LLM layer. Re-lands the design in three phases (#7212, #7213, #7214) with 19 implementation sub-issues.

Checklist

Backport labels are added if these code changes should be backported.
Not applicable — this PR is the design artefact + reference branch; tests live with the per-phase follow-up PRs.
Documentation: ADR-005 + implementation plan added and kept up to date with the branch state.

maff · 2026-05-07T07:55:35Z

+    @Nullable Integer maxOutputTokens,
+    @Nullable Double temperature,


We need to carefully assess which configuration options we expose natively here, since they vary between providers and models (e.g. newer reasoning models no longer support temperature).

High level assessment: https://claude.ai/share/ec1531e3-3e0d-4ba9-8cad-713bb9e970b8
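
As a rough illustration of the concern, options could stay nullable and be filtered against the resolved model capabilities before a request is built. `ChatOptionsSketch` and `filteredFor` are hypothetical names; only `maxOutputTokens` and `temperature` come from the diff above.

```java
/** Hypothetical sketch: both fields are nullable, null meaning "use the provider default". */
record ChatOptionsSketch(Integer maxOutputTokens, Double temperature) {

  /** Drops the temperature when the target model does not accept it. */
  ChatOptionsSketch filteredFor(boolean modelSupportsTemperature) {
    return modelSupportsTemperature ? this : new ChatOptionsSketch(maxOutputTokens, null);
  }
}
```

Whether an unsupported option should be dropped silently, as here, or rejected with an error is exactly the kind of decision that needs assessing.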

maff · 2026-05-07T08:15:39Z

+ * /v1/responses}) with the corresponding request/response schema.
+ */
+@SlowTest
+public class OpenAiResponsesApiAiAgentJobWorkerTests extends BaseWireFormatAiAgentJobWorkerTest {


Isn't this OpenAiCompletions...?

+
+  ChatStreamListener NOOP = event -> {};
+
+  void onEvent(ChatModelEvent event);


          documentToContentConverter.convert(documentContent.document());
      case ObjectContent objectContent ->
          new dev.langchain4j.data.message.TextContent(convertToString(objectContent.content()));
+      case ReasoningContent ignored ->


maff · 2026-05-07T08:40:24Z

    @JsonInclude(JsonInclude.Include.NON_EMPTY) List<Content> content,
    @JsonInclude(JsonInclude.Include.NON_EMPTY) List<ToolCall> toolCalls,
+    @Nullable String modelId,
+    @Nullable String apiId,


Probably we should lift these metadata fields above the existing content/toolCalls.

messageId or apiMessageId maybe?

…messages

Documents in tool call results were serialized as base64 blobs in the tool result text, which models cannot interpret. This change extracts documents from tool call results into a follow-up UserMessage with proper DocumentContent blocks, making them visible to the LLM via the existing document-to-content conversion path (ImageContent, PdfFileContent, TextContent).

- Add ToolCallResultDocumentExtractor to recursively find Document instances in tool call result content trees
- Create a document UserMessage (tagged with toolCallDocuments metadata) after the ToolCallResultMessage, before event messages
- Extract documents from event messages into DocumentContent blocks
- Serialize documents in tool results as document references (default DocumentSerializer) instead of base64 content blocks
- Remove stale DocumentToContentSerializer/Module/ResponseModel infrastructure
- Simplify ToolCallConverterImpl (remove ContentConverter dependency)
- Simplify ContentConverterImpl (remove contentObjectMapper copy)
- Update e2e tests to assert new message structure

ADR: docs/adr/003-document-handling-in-tool-call-results.md

Use Jackson's BeanDescription to walk arbitrary object properties when extracting documents from tool call results. This ensures documents embedded inside records or POJOs (e.g. MCP content types, gateway integrations) are discovered without coupling to specific return types.

…l results

Use the raw content map from process variables in gateway tool call results instead of typed domain objects (McpContent, A2aSendMessageResult). This preserves document references as deserialized by the engine, ensuring documents are properly extracted into user messages.

The ToolCallResultDocumentExtractor walks Map/Collection/Document trees; typed records are invisible to it. Gateway handlers now pass raw content through and only use typed conversion for metadata extraction.

Simplifies the document extractor back to Map/Collection/Document walking since raw maps are sufficient. Adds e2e test verifying MCP image tool results produce document user messages. Updates ADR 003.

…lers

Add job worker MCP e2e test verifying image documents in MCP tool call results are extracted into user messages (mirrors outbound connector test). Add A2A unit test verifying raw content with Document instances is preserved through transformation and discoverable by the extractor.

Replace plain-text separators with structured XML tags for document correlation in the synthetic UserMessage. Each document gets a <document> tag with tool name, call ID, short document ID, and filename attributes. Event documents also receive XML labels for consistency. A preamble "Documents extracted from tool call results:" introduces the block.

Synthetic UserMessages carrying tool call documents (metadata flag toolCallDocuments=true) no longer count toward the context window limit. They are also evicted together with their associated ToolCallResultMessage to prevent orphaned document messages.

Update document message assertions to match the new format: preamble, XML <document> tags with tool/call-id/document-short-id/filename attributes, and interleaved DocumentContent blocks. Extract documentShortId helper into shared AiAgentTestFixtures.

- Remove unused DownloadFileToolResult records from E2E test base classes (dead code since introduction)
- Rename extractDocuments(Object) to extractDocumentsFromContent(Object) to avoid confusing overload with extractDocuments(List<ToolCallResult>)

Use XML escaping for all attribute values in the document XML tag to prevent malformed tags from filenames or tool names containing special characters (&, <, >, ", '). Add Javadoc for documentShortId and documentXmlTag. Add edge case tests for escaping.
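
A minimal sketch of the escaping described above, assuming a self-closing tag shape. The attribute names (tool, call-id, document-short-id, filename) come from the commit messages; the record name `DocumentXmlTagSketch`, the `escape` helper, and the exact tag layout are illustrative and may differ from the real `DocumentXmlTag`.

```java
/** Hypothetical sketch of a document correlation tag with escaped attribute values. */
record DocumentXmlTagSketch(String tool, String callId, String documentShortId, String filename) {

  /** Renders the tag, escaping every attribute value. */
  String toXml() {
    return "<document tool=\"%s\" call-id=\"%s\" document-short-id=\"%s\" filename=\"%s\" />"
        .formatted(escape(tool), escape(callId), escape(documentShortId), escape(filename));
  }

  /** Escapes the five XML special characters; '&' must be replaced first. */
  static String escape(String value) {
    return value
        .replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;")
        .replace("'", "&apos;");
  }
}
```

Replacing `&` before the other characters matters: doing it last would double-escape the entities produced by the earlier replacements.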

- add JSON-aware ToolExecutionResultMessageEqualsPredicate for order-independent comparison of tool result JSON in E2E tests
- lowercase all inline comments introduced in this PR
- replace string concatenation with String.formatted() in XML tag assertions
- inline E2E message order comments as suffixes on assertions
- assert exact base64 data in MCP image E2E tests
- restore chat request count assertions in tool calling E2E tests
- strengthen MCP handler test to assert on content values, not just list size
- add Javadoc example for extractDocumentShortId showing the document reference JSON format

- update document user message format to XML tags with correlation attributes (tool, call-id, document-short-id, filename)
- document the message window memory behavior for document messages
- document event document labeling consistency
- add future optimization note for UserMessage rebuild strategy
- update walker to include Object[] support
- fix provider list (remove specific provider references)
- delete implementation plan file

@disabled

…ll results

Add a manual CPT test that validates real LLM providers can receive and reason about PDF documents extracted from tool call results via the synthetic UserMessage with XML correlation tags.

The test covers three scenarios with increasing complexity:

- Single document returned from a tool call
- Multiple documents returned as a list
- Documents embedded in a nested Map structure

A BPMN process downloads PDFs from WireMock, then uses FEEL script tasks inside an ad-hoc subprocess to assemble tool results of varying shapes. The AI Agent connector processes these with a real LLM, and CPT judge assertions verify the model correctly extracted facts from the PDFs.

Provider configs (toggled via env vars): OpenAI, Anthropic, AWS Bedrock, and OpenAI-compatible (Docker Model Runner). The test is @Disabled by default and not part of CI.

Add Ollama provider configs (qwen3.5, llama3.1:8b) with OLLAMA_URL env var. Add .disabled() toggle on ProviderConfig and a modelFilters allowlist for quickly focusing test runs on specific models without commenting code.

Add backend selection dropdowns to the AI Agent element template for all three configurable providers:

- Anthropic: Direct / AWS Bedrock / Google Vertex AI / Azure AI Foundry
- Google GenAI: Vertex AI / Developer API (Google AI Studio)
- OpenAI: OpenAI / Azure AI Foundry / Custom

Version bumped from 11 to 12 in @ElementTemplate; v11 templates backed up to element-templates/versioned/. README updated to reflect v12 as current.

Phase E3+E4 (ToolCallResultStrategy + native multimodal emission) and Phase F (provider configuration restructure + Jackson migration deserializer + element template v11→v12) have both landed on this branch since the last plan revision. Update the Actual state section, mark the affected per-phase headers as done, refresh the Critical files table, and note the one wiring deviation in Phase F (type-level @JsonDeserialize instead of a Jackson Module, because the connector runtime's @ConnectorsObjectMapper does not do Module bean discovery). ADR status gains a prototype note pointing at this PR while staying Proposed; the design is being re-landed on main as smaller follow-up issues, and Status will move to Implemented once Phase G + Phase H land there.

maff assigned maff and nikonovd May 7, 2026

github-advanced-security AI found potential problems May 7, 2026

maff force-pushed the agentic-ai/custom-llm-layer branch from d24acee to 937884d Compare May 7, 2026 09:12

maff added 23 commits May 7, 2026 13:38

feat(agentic-ai): add array support to document extractor walker

b18c43c

Handle Object[] in the recursive document extraction switch so documents inside arrays are not silently skipped.
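
A rough sketch of such a walker over Map/Collection/Object[]/Document trees is shown below. `DocumentSketch` and `DocumentWalkerSketch` are hypothetical stand-ins, and the example uses `instanceof` checks rather than a pattern-matching switch so that it also compiles on JDKs before 21.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the connector's Document type. */
record DocumentSketch(String id) {}

final class DocumentWalkerSketch {

  /** Collects every document reachable through maps, collections, and arrays. */
  static List<DocumentSketch> extract(Object content) {
    List<DocumentSketch> found = new ArrayList<>();
    walk(content, found);
    return found;
  }

  private static void walk(Object node, List<DocumentSketch> found) {
    if (node instanceof DocumentSketch document) {
      found.add(document);
    } else if (node instanceof Map<?, ?> map) {
      map.values().forEach(value -> walk(value, found));
    } else if (node instanceof Collection<?> collection) {
      collection.forEach(value -> walk(value, found));
    } else if (node instanceof Object[] array) {
      // without this branch, documents inside arrays would be skipped silently
      for (Object value : array) {
        walk(value, found);
      }
    }
    // anything else (strings, numbers, null, typed records) is ignored
  }
}
```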

fix(agentic-ai): warn when MCP content key is missing in raw response

ab0c6ea

Log a warning when getRawMcpContent finds a non-empty map without the expected 'content' key, which would indicate a response shape change and silent document loss.

test(agentic-ai): fix MCP handler test to use raw Map content

9d59267

The test was passing a typed McpClientCallToolResult as content, but the engine delivers a raw Map. Convert to Map to match real behavior and assert on the extracted content list.

docs(agentic-ai): rename ADR-003 to ADR-004

ff95b5a

ADR-003 is reserved for the conversation storage SPI introduced in #6784.

refactor(agentic-ai): move METADATA_TOOL_CALL_DOCUMENTS to UserMessage

d79fbf6

Promote the metadata key constant to a public field on UserMessage so it can be referenced without duplication from AgentMessagesHandlerImpl, MessageWindowRuntimeMemory, and tests.

perf(agentic-ai): avoid O(n^2) message counting in window eviction

4545c8c

Track effective message count as an int instead of recounting via stream on every eviction iteration.

fix(agentic-ai): guard effectiveCount decrement for document messages

0d28ff0

Only decrement the effective message count when the evicted message is not a tool-call document message, preventing under-counting if an orphaned document message ends up at the eviction position.
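
The counting rule from this and the preceding commit can be sketched as below. This is a simplified model, not the `MessageWindowRuntimeMemory` code: `WindowMessage`, `WindowEvictionSketch`, and the "drop document messages that directly follow an evicted message" rule are assumptions standing in for the real pairing of document messages with their tool call results.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical message model: the flag marks synthetic tool-call document messages. */
record WindowMessage(String text, boolean toolCallDocuments) {}

final class WindowEvictionSketch {

  /** Evicts the oldest messages until at most maxMessages non-document messages remain. */
  static List<WindowMessage> evict(List<WindowMessage> messages, int maxMessages) {
    Deque<WindowMessage> window = new ArrayDeque<>(messages);

    // count once up front instead of re-streaming the window on every iteration
    int effectiveCount = 0;
    for (WindowMessage message : messages) {
      if (!message.toolCallDocuments()) {
        effectiveCount++;
      }
    }

    while (effectiveCount > maxMessages && !window.isEmpty()) {
      WindowMessage evicted = window.removeFirst();
      // document messages never counted toward the limit, so they must not reduce it
      if (!evicted.toolCallDocuments()) {
        effectiveCount--;
      }
      // drop document messages that directly follow, so none is left orphaned
      while (!window.isEmpty() && window.peekFirst().toolCallDocuments()) {
        window.removeFirst();
      }
    }
    return List.copyOf(window);
  }
}
```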

refactor(agentic-ai): extract DocumentXmlTag record from handler

65b2ae1

Move XML tag building, attribute escaping, and document short ID extraction into a dedicated DocumentXmlTag record with factory methods and toXml() serialization. Tests moved to DocumentXmlTagTest.

fix(agentic-ai): remove duplicate jackson-datatype-document dependency

f4bd520

refactor(e2e): rename DocumentToolCallResultsCPTTest to DocumentToolCallResultsIT

2a04d09

nikonovd added 2 commits May 8, 2026 11:51

feat(agentic-ai): Implement provider configuration (Phase F)

77b5c91

github-advanced-security AI found potential problems May 8, 2026

maff force-pushed the agentic-ai-document-tool-call-results branch 3 times, most recently from 76f3677 to 16ab8d3 Compare May 20, 2026 15:08

ztefanie added the agentic-ai label May 21, 2026

maff force-pushed the agentic-ai-document-tool-call-results branch from 16ab8d3 to 51e4d96 Compare May 29, 2026 07:36

Base automatically changed from agentic-ai-document-tool-call-results to main May 29, 2026 08:38

maff wants to merge 81 commits into main from agentic-ai/custom-llm-layer

4 participants
