QA/Automation: Expand CPT tests — Document extraction in tool call results (agentic-ai) #7350

## Implementation Spec (Agent-Ready)

### GOAL

Create a new CPT E2E test class that validates the document extraction feature from [PR #6999](https://github.com/camunda/connectors/pull/6999). Tests must mimic real user flows with a real LLM (gpt-5-nano), asserting that documents returned by tools are correctly delivered to the model and reasoned about.

**Base branch:** `agentic-ai-document-tool-call-results` (PR #6999's branch — tests depend on that code)

---

### 1. Test Class Setup

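Create: `connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/e2e/AiAgentDocumentE2ETestIT.java`

> ⚠️ The class name MUST end with `E2ETestIT`: the Maven failsafe plugin in the `it-real-llm` profile only runs `**/*E2ETestIT.java`.

```java
package io.camunda.connector.e2e.agenticai.e2e;

// Import paths assume the Camunda 8.8+ client and process-test artifacts.
// AiAgentE2ETestApplication is assumed to live in this package.
import static io.camunda.process.test.api.CamundaAssert.setAssertionTimeout;

import io.camunda.client.CamundaClient;
import io.camunda.process.test.api.CamundaProcessTestContext;
import io.camunda.process.test.api.CamundaSpringProcessTest;
import java.time.Duration;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;

@SpringBootTest(classes = AiAgentE2ETestApplication.class)
@CamundaSpringProcessTest
@ActiveProfiles("it-real-llm")
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")
public class AiAgentDocumentE2ETestIT {

    @Autowired private CamundaClient camundaClient;
    @Autowired private CamundaProcessTestContext processTestContext;

    @BeforeAll
    static void setUp() {
        // Real LLM round trips are slow; give assertions enough time to settle
        setAssertionTimeout(Duration.ofMinutes(3));
    }

    @BeforeEach
    void mockDocumentTools() {
        // HTTP connector is DISABLED in the Docker bundle (CONNECTOR_OUTBOUND_DISABLED=io.camunda:http-json:1)
        // So HTTP tool jobs stay open → mock workers pick them up here
        processTestContext
            .mockJobWorker("io.camunda:http-json:1")
            .withHandler((jobClient, job) -> {
                // Route by element ID, return Documents for doc-related tools
            });
    }
}
```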
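
The "route by element ID" step in the handler could be backed by a small lookup table. The following is a minimal, JDK-only sketch of that idea, not the project's implementation: `ToolResultRouter` and the element ID `Download_Project_Report` are hypothetical names, and in the real handler the lookup key would presumably come from `job.getElementId()`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: maps BPMN element IDs to the value a mocked tool returns. */
final class ToolResultRouter {
    private final Map<String, Supplier<Object>> routes = new LinkedHashMap<>();

    /** Registers a result supplier for one tool element; returns this for chaining. */
    ToolResultRouter route(String elementId, Supplier<Object> result) {
        routes.put(elementId, result);
        return this;
    }

    /** Builds the completion variables; fails loudly when an element ID is unmapped. */
    Map<String, Object> resolve(String elementId) {
        Supplier<Object> supplier = routes.get(elementId);
        if (supplier == null) {
            throw new IllegalStateException("No mock result registered for element: " + elementId);
        }
        // Map.of rejects null values, so suppliers must return a non-null result
        return Map.of("toolCallResult", supplier.get());
    }
}
```

Inside the mock handler, the map returned by `resolve(...)` could be passed to the job completion command. Throwing on unmapped IDs keeps a misnamed tool from leaving a job silently open until the assertion timeout.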

---

### 2. BPMN & Process

Use the existing `ai-agent-e2e-openai.bpmn` as base. Dynamically add document-returning tools using `BpmnUtil.updateInputMappings(...)` or by programmatically adding script tasks to the ad-hoc subprocess.

Alternatively, create a **single** new BPMN file (`ai-agent-e2e-document.bpmn`) following [`document-tool-call-results.bpmn`](https://github.com/camunda/connectors/blob/agentic-ai-document-tool-call-results/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/resources/document-tool-call-results.bpmn) as reference. Key elements:

- An HTTP service task with `storeResponse=true` that downloads a file (job will be mocked)
- Script tasks inside the AI Agent ad-hoc subprocess (AHSP) that return `toolCallResult` as a Document or nested structure
- The AI Agent AHSP must have the same element template config as in `ai-agent-e2e-openai.bpmn` (OpenAI provider, `gpt-5-nano`)
- Deploy `ai-agent-chat-user-feedback.form` alongside the BPMN for user task completion

---

### 3. Document Delivery via Mock Workers

The HTTP connector is disabled in Docker. Mock workers intercept the jobs. To return a **real Camunda Document**:

```java
import java.io.ByteArrayInputStream;
import java.util.Objects;

// Read the PDF from test resources; fail fast if the fixture is missing
byte[] pdfBytes;
try (var in = getClass().getResourceAsStream("/document-tool-call-results/project-launch.pdf")) {
    pdfBytes = Objects.requireNonNull(in, "PDF fixture not found on classpath").readAllBytes();
}

// Create a Camunda Document via the client API
var documentRef = camundaClient.newCreateDocumentCommand()
    .content(new ByteArrayInputStream(pdfBytes))
    .fileName("project-launch.pdf")
    .contentType("application/pdf")
    .send()
    .join();

// Return the document reference as toolCallResult
jobClient.newCompleteCommand(job)
    .variable("toolCallResult", documentRef)
    .send().join();
```

**If `camundaClient.newCreateDocumentCommand()` is not available or the agent doesn't recognize the result as a Document:**
- Fallback: enable the HTTP connector for these tests, serve PDFs via WireMock accessible at `host.testcontainers.internal:<port>`, and let the real connector download with `storeResponse=true`.
- Document this decision in a code comment.

---

### 4. PDF Fixtures

Place in `src/test/resources/document-tool-call-results/`:
- `project-launch.pdf` — unique fact: "Project Zypherion launched on March 15, 2026"
- `headcount-report.pdf` — unique fact: "847 employees across 12 offices"
- `author-info.pdf` — unique fact: "Dr. Kael Thrennix, Chief Analytics Officer"

These can be reused from PR #6999's branch or recreated as simple single-page PDFs.
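
If the fixtures need to be recreated, a single-page PDF can be written without any PDF library. The sketch below is one possible approach using only the JDK: `SinglePagePdf` is a hypothetical helper, it handles ASCII text only, and whether the model provider extracts text from such a minimal file is an assumption that should be checked manually once.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: builds a minimal one-page PDF containing a single line of text. */
final class SinglePagePdf {

    static byte[] withText(String text) {
        // Backslashes and parentheses must be escaped inside a PDF literal string
        String escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
        String content = "BT /F1 14 Tf 72 720 Td (" + escaped + ") Tj ET";
        List<String> objects = List.of(
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
                + "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
            "<< /Length " + ascii(content).length + " >>\nstream\n" + content + "\nendstream",
            "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<Integer> offsets = new ArrayList<>();
        write(out, "%PDF-1.4\n");
        for (int i = 0; i < objects.size(); i++) {
            offsets.add(out.size()); // byte offset of this object, needed by the xref table
            write(out, (i + 1) + " 0 obj\n" + objects.get(i) + "\nendobj\n");
        }

        int xrefStart = out.size();
        StringBuilder tail = new StringBuilder();
        tail.append("xref\n0 ").append(objects.size() + 1).append("\n0000000000 65535 f \n");
        for (int offset : offsets) {
            // Each xref entry is exactly 20 bytes long
            tail.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }
        tail.append("trailer\n<< /Size ").append(objects.size() + 1).append(" /Root 1 0 R >>\n")
            .append("startxref\n").append(xrefStart).append("\n%%EOF\n");
        write(out, tail.toString());
        return out.toByteArray();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    private static void write(ByteArrayOutputStream out, String s) {
        byte[] bytes = ascii(s);
        out.write(bytes, 0, bytes.length);
    }
}
```

For example, `SinglePagePdf.withText("Project Zypherion launched on March 15, 2026")` yields bytes that could be written to `project-launch.pdf` or passed directly to the document creation command.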

---

### 5. Test Scenarios

| # | Method Name | Prompt | Mock Returns | Judge Assertion |
|---|-------------|--------|--------------|-----------------|
| 1 | `shouldReasonAboutSingleDocument` | "Download the project report and tell me the project name and launch date" | 1 PDF (project-launch) | responseText contains "Project Zypherion" AND "March 15, 2026" |
| 2 | `shouldReasonAboutMultipleDocuments` | "Search for all available reports and summarize each one" | 2 PDFs (project-launch + headcount-report) | responseText mentions BOTH "Project Zypherion" AND "847 employees" |
| 3 | `shouldReasonAboutNestedDocumentStructure` | "Fetch the full report with all attachments and cover page, describe everything" | Nested: `{summary: "Q1", attachments: [doc1, doc2], metadata: {cover: doc3}}` | responseText references all 3 facts (Zypherion, 847 employees, Dr. Kael Thrennix) |
| 4 | `shouldHandleExternalDocumentReference` | "Get the externally referenced specification document" | `{"camunda.document.type": "external", "url": "...", "name": "Spec Sheet"}` | responseText mentions content from external doc |
| 5 | `shouldHandleTextAndDocumentMix` | "Get the current time AND download the project report" | GetDateAndTime=`now()`, Download=PDF | responseText includes time value AND "Project Zypherion" |
| 6 | **Negative:** `shouldCompleteWithoutDocumentMessageWhenNoDocumentsReturned` | "What is the current date and time in Berlin?" | `now()` (no document) | Process completes; responseText has time; NO document-related errors |
| 7 | **Negative:** `shouldHandleBrokenDocumentReference` | "Download the corrupted report" | Return a map with `camunda.document.type` but invalid/missing binary | Process raises incident OR agent reports inability gracefully |
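
For scenario 3 it can help to confirm that the mocked payload really carries three document references before the LLM is involved. The following JDK-only sketch walks a nested map/list structure and collects every map that has the `camunda.document.type` key. `DocumentReferenceFinder`, `placeholderRef`, and the `documentId` values are hypothetical and stand in for real references created through the client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: finds document-reference maps inside a nested tool result. */
final class DocumentReferenceFinder {
    static final String TYPE_KEY = "camunda.document.type";

    /** Depth-first walk over maps and lists; returns every map carrying the type key. */
    static List<Map<?, ?>> find(Object node) {
        List<Map<?, ?>> found = new ArrayList<>();
        collect(node, found);
        return found;
    }

    private static void collect(Object node, List<Map<?, ?>> found) {
        if (node instanceof Map<?, ?> map) {
            if (map.containsKey(TYPE_KEY)) {
                found.add(map); // a reference is a leaf; do not descend into it
                return;
            }
            map.values().forEach(value -> collect(value, found));
        } else if (node instanceof Iterable<?> items) {
            items.forEach(value -> collect(value, found));
        }
    }

    /** Placeholder reference for illustration only; real ones come from the document API. */
    static Map<String, Object> placeholderRef(String documentId) {
        return Map.of(TYPE_KEY, "camunda", "documentId", documentId);
    }
}
```

Building the scenario 3 payload with `Map.of("summary", "Q1", "attachments", List.of(...), "metadata", Map.of("cover", ...))` and asserting that `find(payload)` has size 3 gives a cheap sanity check on the mock itself, separate from the judge assertion.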

---

### 6. User Task Completion Pattern

```java
// Wait for user task
assertThatProcessInstance(processInstance).hasActiveElements("User_Feedback");

// Find and complete
var tasks = camundaClient.newUserTaskSearchRequest()
    .filter(f -> f.processInstanceKey(processInstance.getProcessInstanceKey()))
    .send().join();
long taskKey = tasks.items().getFirst().getUserTaskKey();

camundaClient.newCompleteUserTaskCommand(taskKey)
    .variables(Map.of("userSatisfied", true))
    .send().join();

// Assert completion
assertThatProcessInstance(processInstance).isCompleted();
```

---

### 7. Judge Prompt Requirements

Judge prompts must require **content correctness** (facts only obtainable from the PDF):

```java
assertThatProcessInstance(processInstance)
    .hasVariableSatisfiesJudge("agent",
        "The agent variable contains a responseText that mentions 'Project Zypherion' "
        + "and the specific date 'March 15, 2026', proving it read and reasoned about "
        + "the PDF document content provided via the tool call");
```

> Note: `<doc />` structural correctness cannot be asserted via judge (judge only sees `agent.responseText`). For structural assertions, use a programmatic check on the `agent.context` variable if it contains the conversation, or trust the unit/integration tests from PR #6999 which already cover message structure.
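
A deterministic substring check can complement the judge for facts that the model is likely to quote verbatim. This is only a sketch under that assumption: `FactCheck` is a hypothetical helper, and because a model may rephrase values (for example "15 March 2026"), it is better suited as a diagnostic alongside the judge than as a replacement for it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: reports which expected facts are absent from a response. */
final class FactCheck {

    /** Case-insensitive substring match; returns the facts that were NOT found. */
    static List<String> missingFacts(String responseText, String... facts) {
        String haystack = responseText.toLowerCase(Locale.ROOT);
        return Arrays.stream(facts)
            .filter(fact -> !haystack.contains(fact.toLowerCase(Locale.ROOT)))
            .toList();
    }
}
```

Returning the missing facts rather than a boolean makes a failing run easier to diagnose, since the assertion message can name exactly which PDF-only fact the agent omitted.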

---

### 8. Cleanup

```java
@BeforeEach
void clearDocumentStore() {
    InMemoryDocumentStore.INSTANCE.clear();
}
```

---

### 9. Verification: How to Validate This Implementation Works

After writing the tests, the agent MUST verify:

- [ ] **Compile check**: `mvn compile -pl connectors-e2e-test/connectors-e2e-test-agentic-ai -am` passes
- [ ] **Unit test run** (without real LLM): `mvn test -pl connectors-e2e-test/connectors-e2e-test-agentic-ai` — tests should be SKIPPED (not failed) due to missing `OPENAI_API_KEY`
- [ ] **Integration test run** (with key): `OPENAI_API_KEY=<key> mvn verify -pl connectors-e2e-test/connectors-e2e-test-agentic-ai -Pit-real-llm` — all document tests pass
- [ ] **Naming check**: class ends with `E2ETestIT` and is in the `io.camunda.connector.e2e.agenticai.e2e` package
- [ ] **No BPMN duplication**: only one new BPMN file max, or dynamic patching used
- [ ] **No external network dependency**: tests use mocked tools or localhost WireMock only
- [ ] **Judge prompts are specific**: each requires facts unique to the test PDF, not generic statements

---

### Configuration

- **Model**: `gpt-5-nano` (set in `application-it-real-llm.yml`)
- **CI job**: `run-ai-agent-cpt` (path filter already covers `connectors-e2e-test/connectors-e2e-test-agentic-ai/**`)
- **Gating**: `@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")`

---

### References

| Resource | Link |
|----------|------|
| Parent ticket | [#7349](https://github.com/camunda/connectors/issues/7349) |
| Feature PR | [#6999](https://github.com/camunda/connectors/pull/6999) |
| CPT infra PR | [#6940](https://github.com/camunda/connectors/pull/6940) |
| Existing E2E test (pattern to follow) | [`AiAgentE2ETestIT.java`](https://github.com/camunda/connectors/blob/main/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/e2e/AiAgentE2ETestIT.java) |
| Document viability test (reference) | [`DocumentToolCallResultsIT.java`](https://github.com/camunda/connectors/blob/agentic-ai-document-tool-call-results/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/e2e/DocumentToolCallResultsIT.java) |
| Document BPMN (reference) | [`document-tool-call-results.bpmn`](https://github.com/camunda/connectors/blob/agentic-ai-document-tool-call-results/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/resources/document-tool-call-results.bpmn) |
| BpmnUtil helper | [`BpmnUtil.java`](https://github.com/camunda/connectors/blob/main/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/BpmnUtil.java) |
| CPT config | [`application-it-real-llm.yml`](https://github.com/camunda/connectors/blob/main/connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/resources/application-it-real-llm.yml) |
| ADR | `connectors/agentic-ai/docs/adr/004-document-handling-in-tool-call-results.md` |
