feat(agentic-ai): custom llm layer (HACKDAYS) #7151

Draft
maff wants to merge 81 commits into main from agentic-ai/custom-llm-layer

Conversation

@maff maff commented May 7, 2026

Member

Description

HACKDAYS project — not for merge in its current shape. The deliverables we want to keep from this branch are the ADR + implementation plan; the code is a working reference implementation we will rebuild in small, reviewable chunks via the follow-up issues linked below.

This branch explores replacing the LangChain4j-backed AiFrameworkAdapter in agentic-ai with a native provider layer built on the official vendor Java SDKs (Anthropic, OpenAI, AWS Bedrock, Google GenAI), and adds a capability matrix + tool-call-result routing strategy so the agent can use provider-native multimodal features instead of the lowest-common-denominator surface today's framework offers.

What lives in this PR

  • ADR-005: connectors/agentic-ai/docs/adr/005-replace-langchain4j-framework.md. Captures context, decision drivers, options considered, final architecture (ChatClient facade → ChatModelApi per wire protocol → vendor SDKs), domain-model extensions, capability matrix design, tool-call-result routing strategy, reasoning + caching design, provider configuration restructure with Jackson migration, and phased migration plan.
  • Implementation plan: connectors/agentic-ai/docs/adr-005-implementation-plan.md. Breaks Phase 1 of the ADR into eight green-build checkpoints (Phases A–H), with file lists, test plans, and verification commands per phase. Marks each phase as done or open against this branch.
  • Reference implementation of the phases listed below. Useful as a worked example when building out the follow-up tickets, but not the form we want to land on main.

What the branch actually implements

Following the phase numbering in the implementation plan:

| Phase | Scope | State on this branch |
|---|---|---|
| 0 | Domain model extensions (ReasoningContent, AssistantMessage.{modelId, messageId, stopReason, usage}, ToolCallResult.contentBlocks, TokenUsage.{cacheRead, cacheCreation, reasoning}Tokens, drop rawChatResponse()) | done |
| A | New SPI under framework/api/ (ChatClient, ChatModelApi, ChatModelApiFactory, ChatModelApiRegistry, ChatRequest, ChatResponse, ChatOptions, ModelCapabilities, ChatStreamListener, ChatModelEvent); BaseAgentRequestHandler routes through ChatClient; LangChain4j wired as bridge ChatModelApi for all provider discriminators | done |
| B | Native AnthropicMessagesChatModelApi (DIRECT backend) | done |
| C | Native OpenAiChatCompletionsChatModelApi | done |
| D | Native OpenAiResponsesChatModelApi + apiFamily switch on openai discriminator + element template bump 10 → 11 | done |
| E1 | Capability matrix loaded as Spring Boot config via low-precedence PropertySource; bundled model-capabilities.yaml; 4-step resolution chain (override → exact id/alias → glob pattern → conservative defaults) | done |
| E2 | Each native impl consumes a ModelCapabilities resolved at factory time via ModelCapabilitiesResolver | done |
| E3+E4 | ToolCallResultStrategy (single-pass per-document routing) + native multimodal emission (images + PDFs) in all three native impls; AgentMessagesHandlerImpl no longer extracts documents; DocumentModality MIME mapping (Modality.PDF → Modality.DOCUMENT) | done |
| F | ProviderConfiguration restructure by wire format: Anthropic gains sealed AnthropicAuthentication + AnthropicBackend { DIRECT, BEDROCK, VERTEX, FOUNDRY }; OpenAI consolidates the three legacy configs (openai/azureOpenAi/openaiCompatible) into one with OpenAiBackend { OPENAI, FOUNDRY, CUSTOM }; Google renamed googleVertexAi → googleGenAi with GoogleBackend { DEVELOPER_API, VERTEX }; Bedrock validates non-Anthropic model IDs. ProviderConfigurationDeserializer wired via type-level @JsonDeserialize (not via a Jackson Module, because the connector runtime's @ConnectorsObjectMapper doesn't do module discovery). Element template bumped 11 → 12 with backend dropdowns + conditional auth groups | done |
| G | Native impls for Anthropic cloud backends (BEDROCK / VERTEX / FOUNDRY), OpenAI FOUNDRY with clientCredentials auth, native Google GenAI, native Bedrock-Converse (non-Anthropic models) | not done: native AnthropicMessagesChatModelApiFactory rejects any backend != DIRECT; native OpenAiChatModelApiFactory throws a "Phase G Azure SDK integration" placeholder for FOUNDRY + clientCredentials; bedrock and googleGenAi discriminators still served via L4J bridge |
| H | Demote L4J bridge to opt-in-only; ADR status → Implemented | not done: L4J is currently default-on (camunda.connector.agenticai.framework=langchain4j) |
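
For readers skimming the table, the 4-step resolution chain from phase E1 can be pictured roughly as follows. This is a minimal sketch under assumptions, not the branch's ModelCapabilitiesResolver: `CapabilitiesSketch`, `CapabilityResolverSketch`, the map-based inputs, and the `*`-only glob handling are all hypothetical.

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for ModelCapabilities; the real type on the branch may differ.
record CapabilitiesSketch(boolean images, boolean pdfs) {
  // conservative defaults: assume no multimodal support
  static final CapabilitiesSketch CONSERVATIVE = new CapabilitiesSketch(false, false);
}

final class CapabilityResolverSketch {
  private final Map<String, CapabilitiesSketch> overrides;
  private final Map<String, CapabilitiesSketch> byIdOrAlias;
  // pass an ordered map (e.g. LinkedHashMap) so the first matching glob is deterministic
  private final Map<String, CapabilitiesSketch> byGlob;

  CapabilityResolverSketch(
      Map<String, CapabilitiesSketch> overrides,
      Map<String, CapabilitiesSketch> byIdOrAlias,
      Map<String, CapabilitiesSketch> byGlob) {
    this.overrides = overrides;
    this.byIdOrAlias = byIdOrAlias;
    this.byGlob = byGlob;
  }

  CapabilitiesSketch resolve(String modelId) {
    // step 1: explicit override
    CapabilitiesSketch found = overrides.get(modelId);
    if (found != null) return found;
    // step 2: exact model id or alias
    found = byIdOrAlias.get(modelId);
    if (found != null) return found;
    // step 3: first glob pattern that matches
    for (Map.Entry<String, CapabilitiesSketch> entry : byGlob.entrySet()) {
      if (globMatches(entry.getKey(), modelId)) return entry.getValue();
    }
    // step 4: conservative defaults
    return CapabilitiesSketch.CONSERVATIVE;
  }

  // translates a glob where '*' matches any run of characters into a regex
  private static boolean globMatches(String glob, String modelId) {
    String[] parts = glob.split("\\*", -1);
    StringBuilder regex = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      if (i > 0) regex.append(".*");
      regex.append(Pattern.quote(parts[i]));
    }
    return Pattern.compile(regex.toString()).matcher(modelId).matches();
  }
}
```

An unknown model falling through to the conservative defaults is what keeps the agent from sending images or PDFs to a model that may not accept them.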

Deferred out of Phase E and tracked separately: reasoning (signed thinking blocks, encrypted reasoning items), prompt caching (cache_control, cachePoint, prompt_cache_key), audio/video modalities, JDK java.net.http.HttpClient adapter (replacing OkHttp transport).

Built on top of #6999 (agentic-ai-document-tool-call-results) — Phase E reuses the document walker, per-handler extraction hooks, XML tag generator, and synthetic-message window-count handling from that PR unchanged.

Known sequencing gotcha for re-landing on main

Phase F's migration deserializer rewrites saved {bedrock, anthropic.claude-…} configurations to {anthropic, backend: bedrock}. The native AnthropicMessagesChatModelApiFactory rejects any backend != DIRECT until Phase G's Anthropic cloud backends ship. Landing F before G on main would therefore break existing customers running Claude on Bedrock via the framework. When breaking F into a follow-up issue, either split it (OpenAI/Google restructure first, Anthropic backend discriminator paired with the Anthropic-cloud-backends issue) or gate the bedrock→anthropic-backend migration row until G is in place.
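
The second option (gating the migration row) might look like the sketch below. It is an illustration only: `ProviderRefSketch`, `ProviderMigrationSketch`, the boolean gate, and the `anthropic.` model ID prefix check are assumptions, not the ProviderConfigurationDeserializer on this branch.

```java
// Hypothetical, flattened view of a saved provider configuration.
record ProviderRefSketch(String type, String backend, String modelId) {}

final class ProviderMigrationSketch {
  // Rewrites {bedrock, anthropic.*} to {anthropic, backend: bedrock} only once
  // the native Anthropic cloud backends (Phase G) are available.
  static ProviderRefSketch migrate(ProviderRefSketch saved, boolean anthropicCloudBackendsShipped) {
    boolean claudeOnBedrock =
        "bedrock".equals(saved.type())
            && saved.modelId() != null
            && saved.modelId().startsWith("anthropic."); // assumed prefix convention
    if (claudeOnBedrock && anthropicCloudBackendsShipped) {
      return new ProviderRefSketch("anthropic", "bedrock", saved.modelId());
    }
    // gate closed or not a Claude-on-Bedrock config: leave the saved shape untouched
    return saved;
  }
}
```

With the gate closed, existing Claude-on-Bedrock configurations keep their legacy shape and continue to be served by the bridge.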

How we plan to land this

This PR will not be merged. Each phase above (and each deferred topic) is being split into its own issue under a parent tracking issue. Issues will reference the ADR + implementation plan as the design source of truth, and each one is sized to fit a normal review cycle.

The branch stays available as a reference implementation while the follow-up issues are picked up.

Related issues

Parent tracking epic: #7211 (AI Agent: Own the LLM layer). Re-lands the design in three phases (#7212, #7213, #7214) with 19 implementation sub-issues.

Checklist

  • Backport labels are added if these code changes should be backported.
  • Not applicable — this PR is the design artefact + reference branch; tests live with the per-phase follow-up PRs.
  • Documentation: ADR-005 + implementation plan added and kept up to date with the branch state.

Comment on lines +20 to +21
@Nullable Integer maxOutputTokens,
@Nullable Double temperature,

Member Author

We need to carefully assess which configurations we expose here as native ones as they vary between providers and models (e.g. newer reasoning models do not support temperature anymore).

High level assessment: https://claude.ai/share/ec1531e3-3e0d-4ba9-8cad-713bb9e970b8
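
One possible direction, sketched under assumptions: keep the options nullable and only forward a parameter when it is set and the resolved capabilities say the model accepts it. `ChatOptionsSketch`, `RequestParamsSketch`, and the `supportsTemperature` flag are hypothetical names, not part of the SPI in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical subset of the options record discussed in this comment.
record ChatOptionsSketch(Integer maxOutputTokens, Double temperature) {}

final class RequestParamsSketch {
  // Builds provider request parameters, dropping anything unset or unsupported.
  static Map<String, Object> toParams(ChatOptionsSketch options, boolean supportsTemperature) {
    Map<String, Object> params = new LinkedHashMap<>();
    if (options.maxOutputTokens() != null) {
      params.put("maxOutputTokens", options.maxOutputTokens());
    }
    // assumed capability flag: skip temperature for models that reject it
    if (options.temperature() != null && supportsTemperature) {
      params.put("temperature", options.temperature());
    }
    return params;
  }
}
```

Whether an unsupported option should be dropped silently, as here, or rejected with an error is part of the assessment.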

@maff maff assigned maff and nikonovd May 7, 2026
* /v1/responses}) with the corresponding request/response schema.
*/
@SlowTest
public class OpenAiResponsesApiAiAgentJobWorkerTests extends BaseWireFormatAiAgentJobWorkerTest {

Member Author

Isn't this OpenAiCompletions...?


ChatStreamListener NOOP = event -> {};

void onEvent(ChatModelEvent event);
documentToContentConverter.convert(documentContent.document());
case ObjectContent objectContent ->
new dev.langchain4j.data.message.TextContent(convertToString(objectContent.content()));
case ReasoningContent ignored ->
@JsonInclude(JsonInclude.Include.NON_EMPTY) List<Content> content,
@JsonInclude(JsonInclude.Include.NON_EMPTY) List<ToolCall> toolCalls,
@Nullable String modelId,
@Nullable String apiId,

@maff maff May 7, 2026

Member Author

Probably we should lift these metadata fields above the existing content/toolCalls.

messageId or apiMessageId maybe?

@maff maff force-pushed the agentic-ai/custom-llm-layer branch from d24acee to 937884d Compare May 7, 2026 09:12
maff added 23 commits May 7, 2026 13:38
…messages

Documents in tool call results were serialized as base64 blobs in the tool
result text, which models cannot interpret. This change extracts documents
from tool call results into a follow-up UserMessage with proper DocumentContent
blocks, making them visible to the LLM via the existing document-to-content
conversion path (ImageContent, PdfFileContent, TextContent).

- Add ToolCallResultDocumentExtractor to recursively find Document instances
  in tool call result content trees
- Create a document UserMessage (tagged with toolCallDocuments metadata) after
  the ToolCallResultMessage, before event messages
- Extract documents from event messages into DocumentContent blocks
- Serialize documents in tool results as document references (default
  DocumentSerializer) instead of base64 content blocks
- Remove stale DocumentToContentSerializer/Module/ResponseModel infrastructure
- Simplify ToolCallConverterImpl (remove ContentConverter dependency)
- Simplify ContentConverterImpl (remove contentObjectMapper copy)
- Update e2e tests to assert new message structure

ADR: docs/adr/003-document-handling-in-tool-call-results.md

Use Jackson's BeanDescription to walk arbitrary object properties when
extracting documents from tool call results. This ensures documents
embedded inside records or POJOs (e.g. MCP content types, gateway
integrations) are discovered without coupling to specific return types.

…l results

Use the raw content map from process variables in gateway tool call
results instead of typed domain objects (McpContent, A2aSendMessageResult).
This preserves document references as deserialized by the engine, ensuring
documents are properly extracted into user messages.

The ToolCallResultDocumentExtractor walks Map/Collection/Document trees —
typed records are invisible to it. Gateway handlers now pass raw content
through and only use typed conversion for metadata extraction.

Simplifies the document extractor back to Map/Collection/Document
walking since raw maps are sufficient. Adds e2e test verifying MCP
image tool results produce document user messages. Updates ADR 003.

…lers

Add job worker MCP e2e test verifying image documents in MCP tool call
results are extracted into user messages (mirrors outbound connector
test). Add A2A unit test verifying raw content with Document instances
is preserved through transformation and discoverable by the extractor.

Handle Object[] in the recursive document extraction switch so
documents inside arrays are not silently skipped.
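
A rough picture of the Map/Collection/Object[]/Document walk described in these commits, as a sketch only: `DocumentSketch` and `DocumentWalkerSketch` are hypothetical stand-ins, and the actual ToolCallResultDocumentExtractor may be structured differently.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the connector's Document type.
record DocumentSketch(String id) {}

final class DocumentWalkerSketch {
  // Collects every document found anywhere in the content tree, in traversal order.
  static List<DocumentSketch> extract(Object content) {
    List<DocumentSketch> found = new ArrayList<>();
    walk(content, found);
    return found;
  }

  private static void walk(Object node, List<DocumentSketch> found) {
    if (node instanceof DocumentSketch doc) {
      found.add(doc);
    } else if (node instanceof Map<?, ?> map) {
      map.values().forEach(value -> walk(value, found));
    } else if (node instanceof Collection<?> collection) {
      collection.forEach(value -> walk(value, found));
    } else if (node instanceof Object[] array) {
      // arrays need their own branch, otherwise documents inside them are skipped
      for (Object value : array) {
        walk(value, found);
      }
    }
    // anything else (strings, numbers, typed records, null) is ignored
  }
}
```

Typed records fall into the ignored branch, which is why the gateway handlers pass raw maps through.
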
Replace plain-text separators with structured XML tags for document
correlation in the synthetic UserMessage. Each document gets a
<document> tag with tool name, call ID, short document ID, and
filename attributes. Event documents also receive XML labels for
consistency. A preamble "Documents extracted from tool call results:"
introduces the block.

Log a warning when getRawMcpContent finds a non-empty map without the
expected 'content' key, which would indicate a response shape change
and silent document loss.

Synthetic UserMessages carrying tool call documents (metadata flag
toolCallDocuments=true) no longer count toward the context window
limit. They are also evicted together with their associated
ToolCallResultMessage to prevent orphaned document messages.

The test was passing a typed McpClientCallToolResult as content, but
the engine delivers a raw Map. Convert to Map to match real behavior
and assert on the extracted content list.

ADR-003 is reserved for the conversation storage SPI introduced in
#6784.

Update document message assertions to match the new format: preamble,
XML <document> tags with tool/call-id/document-short-id/filename
attributes, and interleaved DocumentContent blocks. Extract
documentShortId helper into shared AiAgentTestFixtures.

Promote the metadata key constant to a public field on UserMessage so
it can be referenced without duplication from AgentMessagesHandlerImpl,
MessageWindowRuntimeMemory, and tests.

- Remove unused DownloadFileToolResult records from E2E test base
  classes (dead code since introduction)
- Rename extractDocuments(Object) to extractDocumentsFromContent(Object)
  to avoid confusing overload with extractDocuments(List<ToolCallResult>)

Use XML escaping for all attribute values in the document XML tag to
prevent malformed tags from filenames or tool names containing special
characters (&, <, >, ", '). Add Javadoc for documentShortId and
documentXmlTag. Add edge case tests for escaping.

Track effective message count as an int instead of recounting via
stream on every eviction iteration.

- add JSON-aware ToolExecutionResultMessageEqualsPredicate for
  order-independent comparison of tool result JSON in E2E tests
- lowercase all inline comments introduced in this PR
- replace string concatenation with String.formatted() in XML tag
  assertions
- inline E2E message order comments as suffixes on assertions
- assert exact base64 data in MCP image E2E tests
- restore chat request count assertions in tool calling E2E tests
- strengthen MCP handler test to assert on content values, not just
  list size
- add Javadoc example for extractDocumentShortId showing the document
  reference JSON format

- update document user message format to XML tags with correlation
  attributes (tool, call-id, document-short-id, filename)
- document the message window memory behavior for document messages
- document event document labeling consistency
- add future optimization note for UserMessage rebuild strategy
- update walker to include Object[] support
- fix provider list (remove specific provider references)
- delete implementation plan file

Only decrement the effective message count when the evicted message is
not a tool-call document message, preventing under-counting if an
orphaned document message ends up at the eviction position.
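
The window behavior described across these commits can be sketched as below, assuming a simple oldest-first eviction. `MsgSketch` and `MessageWindowSketch` are hypothetical; the real MessageWindowRuntimeMemory handles more message types and rules than shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical message shape with only the flags the window logic needs.
record MsgSketch(String id, boolean toolCallResult, boolean toolCallDocuments) {}

final class MessageWindowSketch {
  static List<MsgSketch> trim(List<MsgSketch> messages, int maxMessages) {
    Deque<MsgSketch> window = new ArrayDeque<>(messages);
    // count once up front; synthetic document messages do not count toward the limit
    int effective = 0;
    for (MsgSketch message : window) {
      if (!message.toolCallDocuments()) effective++;
    }
    while (effective > maxMessages && !window.isEmpty()) {
      MsgSketch evicted = window.removeFirst();
      // only decrement for messages that were counted in the first place
      if (!evicted.toolCallDocuments()) effective--;
      // evict the document message together with its tool call result
      if (evicted.toolCallResult()) {
        while (!window.isEmpty() && window.peekFirst().toolCallDocuments()) {
          window.removeFirst();
        }
      }
    }
    return List.copyOf(window);
  }
}
```

Skipping the decrement for document messages is what prevents under-counting when an orphaned document message sits at the eviction position.
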
Move XML tag building, attribute escaping, and document short ID
extraction into a dedicated DocumentXmlTag record with factory methods
and toXml() serialization. Tests moved to DocumentXmlTagTest.
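
As an illustration of the tag building and escaping described above, a record along these lines would do the job. The attribute names come from the commit messages; the record name `DocumentXmlTagSketch`, the self-closing tag form, and the attribute order are assumptions rather than the actual DocumentXmlTag.

```java
// Hypothetical sketch of a document correlation tag with escaped attributes.
record DocumentXmlTagSketch(String tool, String callId, String documentShortId, String filename) {

  String toXml() {
    return "<document tool=\"%s\" call-id=\"%s\" document-short-id=\"%s\" filename=\"%s\" />"
        .formatted(escape(tool), escape(callId), escape(documentShortId), escape(filename));
  }

  // Escapes the five XML special characters; '&' goes first so that the
  // entities produced by the later replacements are not escaped again.
  static String escape(String value) {
    return value
        .replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;")
        .replace("'", "&apos;");
  }
}
```

A filename such as `R&D "plan".pdf` then ends up as `R&amp;D &quot;plan&quot;.pdf` inside the attribute instead of breaking the tag.
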
…ll results

Add a manual CPT test that validates real LLM providers can receive and
reason about PDF documents extracted from tool call results via the
synthetic UserMessage with XML correlation tags.

The test covers three scenarios with increasing complexity:
- Single document returned from a tool call
- Multiple documents returned as a list
- Documents embedded in a nested Map structure

A BPMN process downloads PDFs from WireMock, then uses FEEL script tasks
inside an ad-hoc subprocess to assemble tool results of varying shapes.
The AI Agent connector processes these with a real LLM, and CPT judge
assertions verify the model correctly extracted facts from the PDFs.

Provider configs (toggled via env vars): OpenAI, Anthropic, AWS Bedrock,
and OpenAI-compatible (Docker Model Runner). The test is @Disabled by
default and not part of CI.

Add Ollama provider configs (qwen3.5, llama3.1:8b) with OLLAMA_URL env
var. Add .disabled() toggle on ProviderConfig and a modelFilters
allowlist for quickly focusing test runs on specific models without
commenting code.
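
The toggle plus allowlist could work roughly like this. It is a sketch with hypothetical names (`ProviderConfigSketch`, `ProviderSelectionSketch`) and assumes an empty allowlist means "run everything"; the test fixture on the branch may differ.

```java
import java.util.List;
import java.util.Set;

// Hypothetical test fixture entry: one provider/model combination.
record ProviderConfigSketch(String name, String model, boolean enabled) {
  // returns a copy that is skipped during selection
  ProviderConfigSketch disabled() {
    return new ProviderConfigSketch(name, model, false);
  }
}

final class ProviderSelectionSketch {
  // Keeps enabled configs; a non-empty allowlist narrows the run to the listed models.
  static List<ProviderConfigSketch> select(List<ProviderConfigSketch> all, Set<String> modelFilters) {
    return all.stream()
        .filter(ProviderConfigSketch::enabled)
        .filter(config -> modelFilters.isEmpty() || modelFilters.contains(config.model()))
        .toList();
  }
}
```
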
nikonovd added 2 commits May 8, 2026 11:51
Add backend selection dropdowns to the AI Agent element template for all
three configurable providers:
- Anthropic: Direct / AWS Bedrock / Google Vertex AI / Azure AI Foundry
- Google GenAI: Vertex AI / Developer API (Google AI Studio)
- OpenAI: OpenAI / Azure AI Foundry / Custom

Version bumped from 11 to 12 in @ElementTemplate; v11 templates backed up
to element-templates/versioned/. README updated to reflect v12 as current.

Phase E3+E4 (ToolCallResultStrategy + native multimodal emission) and
Phase F (provider configuration restructure + Jackson migration
deserializer + element template v11→v12) have both landed on this
branch since the last plan revision. Update the Actual state section,
mark the affected per-phase headers as done, refresh the Critical
files table, and note the one wiring deviation in Phase F (type-level
@JsonDeserialize instead of a Jackson Module, because the connector
runtime's @ConnectorsObjectMapper does not do Module bean discovery).

ADR status gains a prototype note pointing at this PR while staying
Proposed; the design is being re-landed on main as smaller follow-up
issues, and Status will move to Implemented once Phase G + Phase H
land there.

@maff maff force-pushed the agentic-ai-document-tool-call-results branch 3 times, most recently from 76f3677 to 16ab8d3 Compare May 20, 2026 15:08
@maff maff force-pushed the agentic-ai-document-tool-call-results branch from 16ab8d3 to 51e4d96 Compare May 29, 2026 07:36
Base automatically changed from agentic-ai-document-tool-call-results to main May 29, 2026 08:38