QA/Automation: Expand CPT tests — Document extraction in tool call results (agentic-ai) #7350

@gbetances089

Implementation Spec (Agent-Ready)

GOAL

Create a new CPT E2E test class that validates the document extraction feature from PR #6999. Tests must mimic real user flows with a real LLM (gpt-5-nano), asserting that documents returned by tools are correctly delivered to the model and reasoned about.

Base branch: agentic-ai-document-tool-call-results (PR #6999's branch — tests depend on that code)


1. Test Class Setup

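Create: connectors-e2e-test/connectors-e2e-test-agentic-ai/src/test/java/io/camunda/connector/e2e/agenticai/e2e/AiAgentDocumentE2ETestIT.java

⚠️ Class name MUST end with E2ETestIT: the Maven failsafe plugin in the it-real-llm profile only runs **/*E2ETestIT.java. A name such as AiAgentE2EDocumentTestIT does not match that pattern and would never be executed.

import static io.camunda.process.test.api.CamundaAssert.setAssertionTimeout;

import io.camunda.client.CamundaClient;
import io.camunda.process.test.api.CamundaProcessTestContext;
import io.camunda.process.test.api.CamundaSpringProcessTest;
import java.time.Duration;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;

// AiAgentE2ETestApplication is the existing test application used by AiAgentE2ETestIT
@SpringBootTest(classes = AiAgentE2ETestApplication.class)
@CamundaSpringProcessTest
@ActiveProfiles("it-real-llm")
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")
public class AiAgentDocumentE2ETestIT {

    @Autowired private CamundaClient camundaClient;
    @Autowired private CamundaProcessTestContext processTestContext;

    @BeforeAll
    static void setUp() {
        // Real LLM calls are slow; give assertions enough time to observe completion
        setAssertionTimeout(Duration.ofMinutes(3));
    }

    @BeforeEach
    void mockDocumentTools() {
        // HTTP connector is DISABLED in the Docker bundle (CONNECTOR_OUTBOUND_DISABLED=io.camunda:http-json:1)
        // So HTTP tool jobs stay open → mock workers pick them up here
        processTestContext
            .mockJobWorker("io.camunda:http-json:1")
            .withHandler((jobClient, job) -> {
                // Route by element ID, return Documents for doc-related tools.
                // Every branch must complete the job, otherwise the tool call never returns.
            });
    }
}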
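
The handler in the class above only states "route by element ID". A minimal sketch of that routing, written with plain Java types so it runs without a Camunda runtime, could look like the following. The class name MockToolRouter and the element IDs Download_Project_Report and GetDateAndTime are hypothetical; in the real handler the lookup key would presumably come from job.getElementId() and the returned value would be passed to the complete command as toolCallResult.

```java
import java.time.Instant;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch: maps a BPMN element ID to the tool result a mock worker should return. */
final class MockToolRouter {

    private final Map<String, Supplier<Object>> routes;

    MockToolRouter(Map<String, Supplier<Object>> routes) {
        this.routes = Map.copyOf(routes);
    }

    /** Returns the configured result, or fails loudly so an unmapped tool is noticed early. */
    Object resultFor(String elementId) {
        Supplier<Object> route = routes.get(elementId);
        if (route == null) {
            throw new IllegalStateException("No mock result configured for element: " + elementId);
        }
        return route.get();
    }

    public static void main(String[] args) {
        // Hypothetical element IDs; replace with the IDs used in the test BPMN.
        Map<String, Supplier<Object>> routes = Map.of(
            "Download_Project_Report", () -> Map.of("fileName", "project-launch.pdf"),
            "GetDateAndTime", () -> Instant.now().toString());

        MockToolRouter router = new MockToolRouter(routes);
        System.out.println(router.resultFor("Download_Project_Report"));
    }
}
```

Failing on an unknown element ID, rather than returning null, keeps a misnamed tool from leaving a job open until the three-minute assertion timeout expires.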

2. BPMN & Process

Use the existing ai-agent-e2e-openai.bpmn as base. Dynamically add document-returning tools using BpmnUtil.updateInputMappings(...) or by programmatically adding script tasks to the ad-hoc subprocess.

Alternatively, create a single new BPMN file (ai-agent-e2e-document.bpmn) following document-tool-call-results.bpmn as reference. Key elements:

  • An HTTP service task with storeResponse=true that downloads a file (job will be mocked)
  • Script tasks inside the AI Agent AHSP that return toolCallResult as a Document or nested structure
  • The AI Agent AHSP must have the same element template config as in ai-agent-e2e-openai.bpmn (OpenAI provider, gpt-5-nano)
  • Deploy ai-agent-chat-user-feedback.form alongside the BPMN for user task completion

3. Document Delivery via Mock Workers

The HTTP connector is disabled in Docker. Mock workers intercept the jobs. To return a real Camunda Document:

// Requires: java.io.ByteArrayInputStream, java.util.Objects
// Read the PDF from test resources; fail fast if the fixture is missing
byte[] pdfBytes;
try (var in = getClass().getResourceAsStream("/document-tool-call-results/project-launch.pdf")) {
    pdfBytes = Objects.requireNonNull(in, "PDF fixture not found on classpath").readAllBytes();
}

// Create a Camunda Document via the client API
var documentRef = camundaClient.newCreateDocumentCommand()
    .content(new ByteArrayInputStream(pdfBytes))
    .fileName("project-launch.pdf")
    .contentType("application/pdf")
    .send()
    .join();

// Return the document reference as toolCallResult
jobClient.newCompleteCommand(job)
    .variable("toolCallResult", documentRef)
    .send().join();

If camundaClient.newCreateDocumentCommand() is not available or the agent doesn't recognize the result as a Document:

  • Fallback: enable the HTTP connector for these tests, serve PDFs via WireMock accessible at host.testcontainers.internal:<port>, and let the real connector download with storeResponse=true.
  • Document this decision in a code comment.

4. PDF Fixtures

Place in src/test/resources/document-tool-call-results/:

  • project-launch.pdf — unique fact: "Project Zypherion launched on March 15, 2026"
  • headcount-report.pdf — unique fact: "847 employees across 12 offices"
  • author-info.pdf — unique fact: "Dr. Kael Thrennix, Chief Analytics Officer"

These can be reused from PR #6999's branch or recreated as simple single-page PDFs.
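
If the fixtures are recreated rather than reused, one possible approach is to generate them without a PDF library. The sketch below hand-writes a minimal single-page PDF 1.4 file containing one line of text. The class name MinimalPdfFixture is hypothetical, the text must be ASCII, and whether a given model provider extracts text reliably from such a bare-bones file is an assumption that should be checked against the real LLM before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch: builds an uncompressed single-page PDF with one line of ASCII text. */
final class MinimalPdfFixture {

    static byte[] singlePagePdf(String text) {
        // Escape the characters that are special inside a PDF literal string
        String escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
        String content = "BT /F1 14 Tf 72 720 Td (" + escaped + ") Tj ET";

        List<String> objects = List.of(
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R "
                + "/Resources << /Font << /F1 5 0 R >> >> >>",
            "<< /Length " + content.length() + " >>\nstream\n" + content + "\nendstream",
            "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>");

        // Character offsets equal byte offsets because the output is pure ASCII
        StringBuilder pdf = new StringBuilder("%PDF-1.4\n");
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < objects.size(); i++) {
            offsets.add(pdf.length());
            pdf.append(i + 1).append(" 0 obj\n").append(objects.get(i)).append("\nendobj\n");
        }

        // Cross-reference table: each entry must be exactly 20 bytes long
        int xrefOffset = pdf.length();
        pdf.append("xref\n0 ").append(objects.size() + 1).append('\n');
        pdf.append("0000000000 65535 f \n");
        for (int offset : offsets) {
            pdf.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }
        pdf.append("trailer\n<< /Size ").append(objects.size() + 1).append(" /Root 1 0 R >>\n");
        pdf.append("startxref\n").append(xrefOffset).append("\n%%EOF\n");
        return pdf.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path target = Path.of(args.length > 0 ? args[0] : "project-launch.pdf");
        Files.write(target, singlePagePdf("Project Zypherion launched on March 15, 2026"));
    }
}
```

Generating the files once and committing them keeps the fixtures stable; regenerating them on every test run is not needed.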


5. Test Scenarios

| # | Method Name | Prompt | Mock Returns | Judge Assertion |
|---|---|---|---|---|
| 1 | shouldReasonAboutSingleDocument | "Download the project report and tell me the project name and launch date" | 1 PDF (project-launch) | responseText contains "Project Zypherion" AND "March 15, 2026" |
| 2 | shouldReasonAboutMultipleDocuments | "Search for all available reports and summarize each one" | 2 PDFs (project-launch + headcount-report) | responseText mentions BOTH "Project Zypherion" AND "847 employees" |
| 3 | shouldReasonAboutNestedDocumentStructure | "Fetch the full report with all attachments and cover page, describe everything" | Nested: {summary: "Q1", attachments: [doc1, doc2], metadata: {cover: doc3}} | responseText references all 3 facts (Zypherion, 847 employees, Dr. Kael Thrennix) |
| 4 | shouldHandleExternalDocumentReference | "Get the externally referenced specification document" | {"camunda.document.type": "external", "url": "...", "name": "Spec Sheet"} | responseText mentions content from external doc |
| 5 | shouldHandleTextAndDocumentMix | "Get the current time AND download the project report" | GetDateAndTime=now(), Download=PDF | responseText includes time value AND "Project Zypherion" |
| 6 | Negative: shouldCompleteWithoutDocumentMessageWhenNoDocumentsReturned | "What is the current date and time in Berlin?" | now() (no document) | Process completes; responseText has time; NO document-related errors |
| 7 | Negative: shouldHandleBrokenDocumentReference | "Download the corrupted report" | Return a map with camunda.document.type but invalid/missing binary | Process raises incident OR agent reports inability gracefully |
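
For scenario 3, it may help to sanity-check the mocked payload before it is handed to the agent. The sketch below builds the nested structure from the table and walks it to count entries carrying the camunda.document.type marker. It is a test-side helper only and not the extraction logic from PR #6999; the class name NestedDocumentFixture and the simplified stand-in document maps (type marker plus a fileName field) are assumptions, and real tests would use references returned by the document API instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: builds the scenario 3 payload and finds the document references inside it. */
final class NestedDocumentFixture {

    static final String TYPE_KEY = "camunda.document.type";

    /** Simplified stand-in for a document reference (assumption, not the real schema). */
    static Map<String, Object> fakeDocument(String fileName) {
        return Map.of(TYPE_KEY, "camunda", "fileName", fileName);
    }

    /** Mirrors {summary: "Q1", attachments: [doc1, doc2], metadata: {cover: doc3}}. */
    static Map<String, Object> nestedResult() {
        return Map.of(
            "summary", "Q1",
            "attachments", List.of(
                fakeDocument("project-launch.pdf"),
                fakeDocument("headcount-report.pdf")),
            "metadata", Map.of("cover", fakeDocument("author-info.pdf")));
    }

    /** Depth-first walk collecting every map that carries the document type marker. */
    static List<Map<?, ?>> collectDocuments(Object value) {
        List<Map<?, ?>> found = new ArrayList<>();
        collect(value, found);
        return found;
    }

    private static void collect(Object value, List<Map<?, ?>> found) {
        if (value instanceof Map<?, ?> map) {
            if (map.containsKey(TYPE_KEY)) {
                found.add(map);
                return; // a document reference is a leaf; do not descend into it
            }
            map.values().forEach(v -> collect(v, found));
        } else if (value instanceof Iterable<?> items) {
            items.forEach(item -> collect(item, found));
        }
    }

    public static void main(String[] args) {
        System.out.println(collectDocuments(nestedResult()).size() + " documents in payload");
    }
}
```

Asserting the count before completing the job guards against a fixture that silently drops a document, which would otherwise surface only as a vague judge failure.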

6. User Task Completion Pattern

// Wait for user task
assertThatProcessInstance(processInstance).hasActiveElements("User_Feedback");

// Find and complete
var tasks = camundaClient.newUserTaskSearchRequest()
    .filter(f -> f.processInstanceKey(processInstance.getProcessInstanceKey()))
    .send().join();
long taskKey = tasks.items().getFirst().getUserTaskKey();

camundaClient.newCompleteUserTaskCommand(taskKey)
    .variables(Map.of("userSatisfied", true))
    .send().join();

// Assert completion
assertThatProcessInstance(processInstance).isCompleted();

7. Judge Prompt Requirements

Judge prompts must require content correctness (facts only obtainable from the PDF):

assertThatProcessInstance(processInstance)
    .hasVariableSatisfiesJudge("agent",
        "The agent variable contains a responseText that mentions 'Project Zypherion' "
        + "and the specific date 'March 15, 2026', proving it read and reasoned about "
        + "the PDF document content provided via the tool call");

Note: <doc /> structural correctness cannot be asserted via judge (judge only sees agent.responseText). For structural assertions, use a programmatic check on the agent.context variable if it contains the conversation, or trust the unit/integration tests from PR #6999 which already cover message structure.
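
Because the judge is itself an LLM, a deterministic pre-check on responseText can complement it for the literal facts. The following is a minimal sketch under the assumption that the test has already read agent.responseText into a string; the class name ResponseFactCheck is hypothetical. A literal match is brittle (the model may write "15 March 2026"), so this is meant as an additional signal and not a replacement for the judge assertion.

```java
import java.util.List;
import java.util.Locale;

/** Sketch: reports which expected facts do not appear verbatim in a response text. */
final class ResponseFactCheck {

    /** Case-insensitive substring check; returns the facts that are missing. */
    static List<String> missingFacts(String responseText, List<String> expectedFacts) {
        String haystack = responseText.toLowerCase(Locale.ROOT);
        return expectedFacts.stream()
            .filter(fact -> !haystack.contains(fact.toLowerCase(Locale.ROOT)))
            .toList();
    }

    public static void main(String[] args) {
        String response = "The report covers Project Zypherion, which launched on March 15, 2026.";
        System.out.println(missingFacts(response, List.of("Project Zypherion", "March 15, 2026", "847 employees")));
    }
}
```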


8. Cleanup

@BeforeEach
void clearDocumentStore() {
    InMemoryDocumentStore.INSTANCE.clear();
}

9. Verification: How to Validate This Implementation Works

After writing the tests, the agent MUST verify:

  • Compile check: mvn compile -pl connectors-e2e-test/connectors-e2e-test-agentic-ai -am passes
  • Unit test run (without real LLM): mvn test -pl connectors-e2e-test/connectors-e2e-test-agentic-ai — tests should be SKIPPED (not failed) due to missing OPENAI_API_KEY
  • Integration test run (with key): OPENAI_API_KEY=<key> mvn verify -pl connectors-e2e-test/connectors-e2e-test-agentic-ai -Pit-real-llm — all document tests pass
  • Naming check: class ends with E2ETestIT and is in the io.camunda.connector.e2e.agenticai.e2e package
  • No BPMN duplication: only one new BPMN file max, or dynamic patching used
  • No external network dependency: tests use mocked tools or localhost WireMock only
  • Judge prompts are specific: each requires facts unique to the test PDF, not generic statements
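
The naming check in the list above can be approximated locally before running Maven. This sketch uses a java.nio glob matcher as a stand-in for the failsafe include pattern **/*E2ETestIT.java; Maven's own pattern matching is Ant-style, so treat the result as an approximation. The class name FailsafeNamingCheck is hypothetical. A name such as AiAgentDocumentE2ETestIT matches, while AiAgentE2EDocumentTestIT does not.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

/** Sketch: checks whether a test source path matches the E2E include pattern. */
final class FailsafeNamingCheck {

    private static final PathMatcher E2E_PATTERN =
        FileSystems.getDefault().getPathMatcher("glob:**/*E2ETestIT.java");

    static boolean isPickedUp(String relativePath) {
        return E2E_PATTERN.matches(Path.of(relativePath));
    }

    public static void main(String[] args) {
        String base = "src/test/java/io/camunda/connector/e2e/agenticai/e2e/";
        System.out.println(isPickedUp(base + "AiAgentDocumentE2ETestIT.java"));
        System.out.println(isPickedUp(base + "AiAgentE2EDocumentTestIT.java"));
    }
}
```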

Configuration

  • Model: gpt-5-nano (set in application-it-real-llm.yml)
  • CI job: run-ai-agent-cpt (path filter already covers connectors-e2e-test/connectors-e2e-test-agentic-ai/**)
  • Gating: @EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")

References

| Resource | Link |
|---|---|
| Parent ticket | #7349 |
| Feature PR | #6999 |
| CPT infra PR | #6940 |
| Existing E2E test (pattern to follow) | AiAgentE2ETestIT.java |
| Document viability test (reference) | DocumentToolCallResultsIT.java |
| Document BPMN (reference) | document-tool-call-results.bpmn |
| BpmnUtil helper | BpmnUtil.java |
| CPT config | application-it-real-llm.yml |
| ADR | connectors/agentic-ai/docs/adr/004-document-handling-in-tool-call-results.md |

Metadata

Labels

  • agentic-ai
  • component:qa: Task containing all details related to QA
  • qa:required: Will trigger the QA workflow
