Testing Guide

1. Prerequisites

1.1 Access to DIAL Core

Create a .env file to store the environment variables used by test runs.

  • Replace the placeholders below with the values for your DIAL Core instance:
    • REMOTE_DIAL_URL=<URL of DIAL CORE>
    • REMOTE_DIAL_API_KEY=<Your API-KEY>
    • DIAL_URL=<URL of DIAL CORE>
    • DIAL_API_KEY=<Your API-KEY>

Python interpreter tests

  • PY_INTERPRETER_LOCAL_RUN=true
  • PY_INTERPRETER_API_KEY=<Your API-KEY>
  • PY_INTERPRETER_URL=<URL of DIAL CORE>

Enable debug logging if the default logs are not sufficient

  • QUICKAPP_LOG_LEVEL=DEBUG
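
Before a test run it can help to confirm that the .env file is complete. The following is a minimal sketch, not part of the project: the helper names `parse_env_text` and `missing_vars` are hypothetical, the parser handles only plain KEY=VALUE lines, and the .env location (the directory you run the tests from) is an assumption.

```python
from pathlib import Path

# Variable names taken from the list above; adjust if your setup differs.
REQUIRED_VARS = [
    "REMOTE_DIAL_URL",
    "REMOTE_DIAL_API_KEY",
    "DIAL_URL",
    "DIAL_API_KEY",
]


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values: dict[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return required names that are absent, empty, or still a <placeholder>."""
    missing = []
    for name in required:
        value = values.get(name, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location: the current working directory
    text = env_path.read_text() if env_path.exists() else ""
    problems = missing_vars(parse_env_text(text))
    print("Missing or unset:", problems if problems else "none")
```

A value that still looks like `<Your API-KEY>` is reported as unset, which catches placeholders that were copied but never filled in.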

2. Model differences and required changes

  • Different DIAL Core instances may expose different models. Adjust tests accordingly.

Changes required:

  • e2e tests:
    • Edit test_e2e.py to add or remove tests that reference missing models.
  • integration tests:
    • Set REFRESH=TRUE to build the cache of model responses during the first run of integration tests.
    • Update the agent/orchestrator model list in src/tests/test_runner/cache/cache_middleware.py (the AGENT_MODELS list) to include or remove models available on your instance.
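
Deciding which models to keep or drop comes down to comparing the models your tests reference with the models your instance exposes. A minimal sketch of that comparison follows; the function `split_models` and all model names are hypothetical, and how you obtain the list of available models depends on your DIAL Core instance.

```python
def split_models(referenced, available):
    """Return (usable, missing), preserving the order of `referenced`."""
    available_set = set(available)
    usable = [m for m in referenced if m in available_set]
    missing = [m for m in referenced if m not in available_set]
    return usable, missing


if __name__ == "__main__":
    # Hypothetical model names, for illustration only.
    referenced = ["model-a", "model-b", "model-c"]  # e.g. names used in tests
    available = ["model-a", "model-c", "model-d"]   # e.g. names on your instance
    usable, missing = split_models(referenced, available)
    print("keep:", usable)
    print("remove or replace:", missing)
```

Anything reported as missing is a candidate for removal from test_e2e.py or from the AGENT_MODELS list.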

Notes:

  • Building the cache (with REFRESH=TRUE) saves real model responses for later deterministic integration runs.
  • Keep the cached responses committed if you intend to share reproducible integration tests.
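
To illustrate the record-and-replay idea behind the notes above, here is a minimal sketch. It is not the project's cache_middleware.py: the class `ResponseCache`, its file layout, and the choice to raise an error on a cache miss are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


class ResponseCache:
    """Hypothetical record/replay cache keyed by a hash of the request."""

    def __init__(self, cache_dir, refresh=False):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.refresh = refresh  # True: call the model and (re)write the cache

    def _path_for(self, request: dict) -> Path:
        # Sorting keys makes the hash independent of dict insertion order.
        canonical = json.dumps(request, sort_keys=True)
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get_response(self, request: dict, call_model):
        path = self._path_for(request)
        if not self.refresh:
            if path.exists():
                # Replay mode: return the stored response, no model call.
                return json.loads(path.read_text())
            raise LookupError("No cached response; rerun with refresh enabled")
        # Record mode: call the real model and store its response.
        response = call_model(request)
        path.write_text(json.dumps(response, indent=2))
        return response
```

In this sketch a run with `refresh=True` stores real responses, and later runs with `refresh=False` replay them without contacting a model, which is what makes the integration runs deterministic. A wrapper could derive `refresh` from the REFRESH environment variable, though how the project actually reads that variable is not shown here.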

3. Execute tests

  • Run end-to-end tests: make e2e_test
  • Run integration tests: make integration_test

Test types (brief)

  • e2e:
    • Simple, "happy path" test(s) validating that all components integrate and function end-to-end.
  • integration:
    • Multiple scenarios that replay cached model responses from actual tool calls; they validate orchestrator behavior only.
    • May fail because agent responses are nondeterministic. When an integration test fails, review the failing test and its cached model response individually.

Troubleshooting

  • If tests fail because of missing models:
    • Update test_e2e.py and AGENT_MODELS in cache_middleware.py.
  • If integration tests fail nondeterministically:
    • Rebuild cache with REFRESH=TRUE, inspect stored responses, and decide if they should be refreshed/committed.