### Tasks
- [ ] **Cross-provider Tests:**
  - Run (or enable) `DocumentToolCallResultsIT`, currently `@Disabled` in the repo, against these models as available:
    - OpenAI (gpt-4.1, gpt-5.4, gpt-4o)
    - Anthropic (claude-sonnet-4-6, haiku)
    - AWS Bedrock (Anthropic models)
    - Local/Dev: Docker Model Runner, Ollama
  - For each run, validate that:
    - Single PDF: the agent mentions the project/facts from the test document
    - Multiple PDFs: the agent covers the facts from each document
    - Nested PDFs: the agent references all nested facts
  - **Files:** use the PDFs in `/src/test/resources/document-tool-call-results/*.pdf`. The WireMock or public URL serving them must be reachable from the model container; adapt the setup as necessary if it is not.
- [ ] **Conversation Audit:**
  - Examine the synthetic UserMessages that follow tool call results; ensure each document appears only once, under the correct tool call, and with the appropriate `<doc />` tags (XML structure).
  - Confirm that once the message window exceeds `maxMessages`, document UserMessages are evicted together with their corresponding ToolCallResultMessages, leaving no orphaned document content.
  - Validate that the synthetic UserMessages have the `toolCallDocuments=true` metadata flag set and are correctly identified by the eviction logic.
- [ ] **Negative Paths/Edge Cases:**
  - No documents present: check that no document message is generated (regression check: behavior must be identical to pre-PR, with no synthetic UserMessage or preamble).
  - Document references with missing binaries: the agent gets an explicit "broken/missing-doc" response; confirm the output is handled gracefully and matches the design.
  - Same PDF returned from multiple tools in one turn: the document must not be duplicated in the output across tools.
- [ ] **Event flow scenarios:**
  - Use a non-interrupting event subprocess that emits a payload containing a document. Confirm the model receives the document and the synthetic UserMessage uses the `"event data"` preamble.
- [ ] **Gateway handler paths:**
  - Via the MCP and A2A gateway tool integrations, ensure document extraction works as designed through their specific handler logic, not just via BPMN-native tools.
- [ ] **Report/Log All Findings**
- [ ] Open follow-ups for any discrepancies/bugs found
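
The Conversation Audit checks above can be partly mechanized once a conversation has been exported from a test run. The following is a minimal sketch under stated assumptions, not the connector's implementation: the `Message` record, the `Kind` enum, and the `findViolations` helper are hypothetical names invented for illustration, and the sketch assumes a document UserMessage directly follows its ToolCallResultMessage. It covers tool call documents only, so event-data documents would need a separate rule. It flags three things: a document message without `toolCallDocuments=true`, a document message whose tool call result is no longer in the window (orphaned after eviction), and a document repeated within one message.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified view of an exported conversation window.
// These types are illustrative only and do not mirror the connector's classes.
class ConversationAudit {

  enum Kind { USER, ASSISTANT, TOOL_CALL_RESULT }

  // documentIds: references to documents embedded in the message (empty if none).
  // toolCallDocumentsFlag: whether the metadata carries toolCallDocuments=true.
  record Message(Kind kind, List<String> documentIds, boolean toolCallDocumentsFlag) {}

  // Returns one human-readable entry per violated audit rule.
  static List<String> findViolations(List<Message> window) {
    List<String> violations = new ArrayList<>();
    for (int i = 0; i < window.size(); i++) {
      Message m = window.get(i);
      boolean isDocumentMessage = m.kind() == Kind.USER && !m.documentIds().isEmpty();
      if (!isDocumentMessage) {
        continue;
      }
      if (!m.toolCallDocumentsFlag()) {
        violations.add("message " + i + ": missing toolCallDocuments=true");
      }
      // Assumption: the document message sits right after its tool call result.
      boolean followsToolResult = i > 0 && window.get(i - 1).kind() == Kind.TOOL_CALL_RESULT;
      if (!followsToolResult) {
        violations.add("message " + i + ": orphaned document message");
      }
      Set<String> seen = new HashSet<>();
      for (String id : m.documentIds()) {
        if (!seen.add(id)) {
          violations.add("message " + i + ": duplicate document " + id);
        }
      }
    }
    return violations;
  }

  public static void main(String[] args) {
    // Window in which the tool call result was evicted but its documents were not.
    List<Message> afterBadEviction = List.of(
        new Message(Kind.USER, List.of("report.pdf"), true),
        new Message(Kind.ASSISTANT, List.of(), false));
    findViolations(afterBadEviction).forEach(System.out::println);
  }
}
```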
---
**Related:**
- [Parent QA Ticket](https://github.com/camunda/connectors/issues/new?title=QA%3A+Document+handling+in+tool+call+results+%28agentic-ai%29+%E2%80%94+test+plan+and+tracking)
- [PR #6999](https://github.com/camunda/connectors/pull/6999)
- [DocumentToolCallResultsIT Example](https://github.com/camunda/connectors/blob/main/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/e2e/DocumentToolCallResultsIT.java)

### Summary
Manually validate the document extraction feature added in PR #6999 across all supported LLM providers, and review conversation history and eviction behavior. The goal is to catch edge cases that CPT automation does not cover because of job mocking, container networking, or provider-specific behavior.