|
4 | 4 |
|
5 | 5 | | File | Key Symbols | Responsibility | |
6 | 6 | |---|---|---| |
7 | | -| `pipeline/evaluation/driver.py` | `AgentDriver`, `HttpApiDriver`, `AgentDriverRegistry` | Driver abstraction, HTTP implementation, driver factory | |
8 | | -| `pipeline/evaluation/amender.py` | `APIDataAmender` | Mutates turn data with agent response, tokens, latency, streaming metrics | |
9 | | -| `core/api/client.py` | `APIClient` | HTTP client with caching, retries, streaming support | |
| 7 | +| `pipeline/evaluation/driver.py` | `AgentDriver`, `HttpApiDriver`, `ProposalDriver`, `TerminalOutcome` | Driver abstraction, HTTP and Proposal implementations | |
| 8 | +| `pipeline/evaluation/registry.py` | `AgentDriverRegistry`, `AGENT_DRIVERS` | Driver type registry and factory | |
| 9 | +| `pipeline/evaluation/amender.py` | `APIDataAmender` | Mutates turn data with HTTP agent response, tokens, latency, streaming metrics | |
| 10 | +| `pipeline/evaluation/proposal_amender.py` | `ProposalAmender` | Fetches child Result CRs, builds Markdown summary, amends proposal turn data | |
| 11 | +| `pipeline/evaluation/cli.py` | `CLIClient`, `KubeCLI` | Abstract CLI interface and Kubernetes (oc/kubectl) implementation | |
| 12 | +| `core/api/client.py` | `APIClient` | HTTP client with caching, retries; supports query/streaming/infer/responses endpoints | |
10 | 13 | | `core/api/streaming_parser.py` | `parse_streaming_response()`, `StreamingContext` | SSE parsing with TTFT/throughput tracking | |
11 | | -| `core/models/agents.py` | `HttpApiAgentConfig`, `AgentsConfig`, `AgentDefaultConfig` | Agent configuration models; `AgentsConfig.resolve_agent_config()` handles config merge | |
| 14 | +| `core/proposal/phase.py` | `derive_phase()` | Proposal phase derivation from CRD conditions | |
| 15 | +| `core/metrics/custom/proposal_eval.py` | `evaluate_proposal_status()` | Proposal status assertion metric | |
| 16 | +| `core/models/agents.py` | `HttpApiAgentConfig`, `ProposalAgentConfig`, `AgentsConfig`, `AgentDefaultConfig` | Agent configuration models; `AgentsConfig.resolve_agent_config()` handles config merge | |
12 | 17 |
|
13 | 18 | ## Data Flow |
14 | 19 |
|
15 | | -1. `EvaluationPipeline._initialize_components()` creates an `AgentDriverRegistry` with registered driver types (default: `http_api` → `HttpApiDriver`). |
16 | | -2. For each conversation, `_resolve_driver_for_conversation()` either reuses the default driver or creates a per-conversation driver if that conversation has agent config overrides. |
17 | | -3. `ConversationProcessor._process_turn_api()` calls `driver.execute_turn(turn_data, conversation_id)` before metrics evaluation. |
18 | | -4. `HttpApiDriver` delegates to `APIDataAmender.amend_single_turn()`, which calls `APIClient.query()`. |
19 | | -5. `APIClient` sends the HTTP request (standard POST, streaming SSE, or RLSAPI /infer depending on endpoint type). |
20 | | -6. `APIDataAmender` mutates `TurnData` in-place: response text, contexts, tool_calls, token counts, agent latency, and streaming metrics (TTFT, duration, throughput). |
21 | | -7. The amended turn data is then passed to `MetricsEvaluator` for scoring. |
| 20 | +### HttpApiDriver Flow |
| 21 | + |
| 22 | +- `EvaluationPipeline._initialize_components()` creates an `AgentDriverRegistry` with registered driver types (`http_api` → HttpApiDriver, `proposal` → ProposalDriver). |
| 23 | +- For each conversation, `_resolve_driver_for_conversation()` either reuses the default driver or creates a per-conversation driver if that conversation has agent config overrides. |
| 24 | +- `ConversationProcessor._process_turn_api()` calls `driver.execute_turn(turn_data, conversation_id)` before metrics evaluation. |
| 25 | +- `HttpApiDriver` delegates to `APIDataAmender.amend_single_turn()`, which calls `APIClient.query()`. |
| 26 | +- `APIClient` sends the HTTP request (standard POST, streaming SSE, RLSAPI /infer, or OpenAI Responses API depending on endpoint type). |
| 27 | +- `APIDataAmender` mutates `TurnData` in-place: response text, contexts, tool_calls, token counts, agent latency, and streaming metrics. |
| 28 | + |
| 29 | +### ProposalDriver Flow |
| 30 | + |
| 31 | +- `ProposalDriver.execute_turn()` builds a Proposal CR manifest from `turn_data.proposal_spec`. |
| 32 | +- `KubeCLI.apply()` creates the Proposal CR in the configured namespace. |
| 33 | +- If `auto_approve` is enabled, the driver polls until Analyzed=True, then creates a ProposalApproval CR. |
| 34 | +- The driver polls `KubeCLI.get_resource()` for the Proposal's status conditions until a terminal outcome is reached (Completed, Failed, Denied, Escalated) or timeout. |
| 35 | +- `derive_phase()` evaluates conditions to determine the current phase, handling retry logic (RetryingExecution reason). |
| 36 | +- `ProposalAmender.amend()` fetches child Result CRs (analysisresults, executionresults, verificationresults, escalationresults) and builds a Markdown summary. |
| 37 | +- Turn data is amended in-place: response (Markdown), proposal_status, proposal_results, proposal_phases. |
| 38 | +- If `cleanup_proposals` is enabled, the Proposal CR is deleted after processing. |
22 | 39 |
|
23 | 40 | ## Key Abstractions |
24 | 41 |
|
25 | | -**AgentDriverRegistry** maps driver type strings to driver classes. Adding a new driver type means: (1) subclass `AgentDriver`, (2) register in the registry's `_driver_types` dict. Currently only `http_api` is registered. |
| 42 | +**AgentDriverRegistry** maps driver type strings to driver classes. Two types registered: `http_api` and `proposal`. Adding a new driver type: subclass `AgentDriver`, add to `AGENT_DRIVERS` dict in `registry.py`. |
| 43 | + |
| 44 | +**AgentDriver** is the abstract interface with `execute_turn()`, `validate_config()`, `enabled`, and `close()`. Returns `(error_message, conversation_id)` tuple. |
26 | 45 |
|
27 | | -**AgentDriver** is the abstract interface with `execute_turn()`, `validate_config()`, `enabled`, and `close()`. The `execute_turn()` method returns a tuple of `(error_message, conversation_id)` — the error message is None on success, and the conversation_id may be updated by the agent (for multi-turn conversation tracking). |
| 46 | +**APIClient** handles four query modes: standard POST (`/query`), streaming SSE, RLSAPI `/infer`, and OpenAI Responses API (`/responses`). Manages disk-based caching and automatic retries on 429/5xx. |
28 | 47 |
|
29 | | -**APIClient** handles three query modes based on endpoint configuration: standard POST (`/query`), streaming SSE, and RLSAPI `/infer`. It manages disk-based caching (keyed by SHA256 of query+model+params) and automatic retries on 429/5xx responses. |
| 48 | +**ProposalAmender** maps CRD step names to resource types (`analysis` → `analysisresults`, etc.), fetches each via KubeCLI, and builds a structured Markdown response with sections for Analysis, Execution, Verification, and Escalation. |
30 | 49 |
|
31 | | -**Config resolution** follows three-tier priority: eval_data agent overrides > named agent config > system defaults. `resolve_agent_config()` merges these layers into the final config dict passed to the driver. |
| 50 | +**CLIClient** abstracts CLI operations (apply, get_resource, delete). `KubeCLI` resolves `oc` or `kubectl` on PATH, runs commands with namespace and JSON output flags. |
32 | 51 |
|
33 | 52 | ## Integration Points |
34 | 53 |
|
35 | 54 | | Consumer | Provider | Mechanism | |
36 | 55 | |---|---|---| |
37 | 56 | | `EvaluationPipeline` | `AgentDriverRegistry` | Creates drivers from config | |
38 | 57 | | `ConversationProcessor` | `AgentDriver.execute_turn()` | Invokes driver per turn | |
39 | | -| `HttpApiDriver` | `APIDataAmender` | Delegates turn amendment | |
40 | | -| `APIDataAmender` | `APIClient` | Sends HTTP requests | |
41 | | -| `APIClient` | `StreamingParser` | Parses SSE responses | |
| 58 | +| `HttpApiDriver` | `APIDataAmender` → `APIClient` | HTTP request chain | |
| 59 | +| `ProposalDriver` | `KubeCLI` | CR lifecycle (apply, get, delete) | |
| 60 | +| `ProposalDriver` | `ProposalAmender` | Fetch child CRs and build summary | |
| 61 | +| `derive_phase()` | CRD conditions | Phase determination logic | |
42 | 62 |
|
43 | 63 | ## Implementation Notes |
44 | 64 |
|
45 | 65 | - **Per-conversation drivers** are created when a conversation has agent config overrides and are cleaned up after that conversation completes. The default driver persists across all conversations. |
46 | 66 | - **Disk caching** in `APIClient` uses `diskcache` with SHA256 keys. Cache can be disabled per-agent or globally via `core.cache_enabled`. |
47 | | -- **Streaming metrics** (TTFT, duration, tokens/second) are only populated when the endpoint is configured for streaming. Non-streaming endpoints leave these fields as None. |
48 | | -- **The amender mutates TurnData in-place** — there is no copy. The original response (if pre-populated in eval data) is overwritten by the agent's response. |
| 67 | +- **Streaming metrics** (TTFT, duration, tokens/second) are populated for streaming and responses endpoint types. |
| 68 | +- **The amender mutates TurnData in-place** — the original response is overwritten. |
| 69 | +- **Proposal CR naming** uses `eval-{safe_conv_id}-{uuid8}` to avoid namespace collisions. |
| 70 | +- **KubeCLI timeout** is per-command (`cli_timeout`), while ProposalDriver `timeout` is the overall lifecycle timeout for reaching a terminal state. |
| 71 | +- **Responses endpoint** uses OpenAI Responses API schema — maps query→input, system_prompt→instructions, extracts file_search_call for RAG contexts and mcp_call for tool calls. |
0 commit comments