You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: CHANGELOG.md
+14Lines changed: 14 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,6 +7,20 @@
7
7
-**Model-aware fixture recording** — recorded fixtures now include the model name in match criteria, preventing collisions when an app makes multiple LLM calls with the same user message but different models. Model names are normalized by stripping date/version suffixes (e.g., `claude-opus-4-20250514` → `claude-opus-4`) so fixtures survive version bumps. Disable with `recordFullModelVersion: true`. ([#185](https://github.com/CopilotKit/aimock/issues/185))
8
8
-**Drift detection metadata** — recorded fixtures include `systemHash` and `toolsHash` in a `metadata` block for detecting system prompt or tool definition changes since recording.
9
9
-**Prefix model matching** — fixture router uses `startsWith` for string model matching, so `model: "claude-opus-4"` matches any `claude-opus-4-*` version.
10
+
-**GA Realtime protocol migration with Beta compatibility shim** — handler emits GA event names natively; `sendEvent()` wrapper translates back for Beta clients detected via `OpenAI-Beta` header. Default model changed to `gpt-realtime-2`.
11
+
-**5 new GA Realtime models** — `gpt-realtime-2`, `gpt-realtime-1.5`, `gpt-realtime-mini`, `gpt-realtime-translate`, `gpt-realtime-whisper`.
12
+
-**Translate and whisper session types** — dedicated session configurations for translation and transcription workloads on the Realtime API.
13
+
-**Image input support** — Realtime sessions accept image content parts alongside text and audio.
14
+
-**Commentary phase** — Realtime handler supports the GA commentary phase for model-generated annotations.
15
+
-**`conversation.item.done` and `response.cancel` events** — new GA Realtime event types for item completion tracking and response cancellation.
16
+
-**Endpoint type routing for Realtime** — router distinguishes GA vs Beta Realtime endpoints for fixture matching.
17
+
-**Drift detection for GA Realtime** — drift test suite extended with GA protocol shapes, Beta conformance shapes, and three-way triangulation.
18
+
19
+
### Tests
20
+
21
+
-**73 GA Realtime integration tests** — comprehensive test coverage for all GA event types, Beta compatibility, session management, model routing, image input, translate/whisper, commentary, and cancellation.
22
+
-**GA and Beta Realtime conformance suites** — API conformance tests validating event shapes against both GA and Beta protocol specs.
23
+
-**GA Realtime drift detection** — SDK shape tests and provider triangulation for the GA Realtime protocol.
Copy file name to clipboardExpand all lines: DRIFT.md
+15-8Lines changed: 15 additions & 8 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -107,7 +107,7 @@ When a model is deprecated:
107
107
108
108
## WebSocket Drift Coverage
109
109
110
-
In addition to the 23 existing drift tests (20 HTTP response-shape + 3 model deprecation), WebSocket drift tests cover aimock's WS protocols (4 verified + 2 canary = 6 WS tests):
110
+
In addition to the 23 existing drift tests (20 HTTP response-shape + 3 model deprecation), WebSocket drift tests cover aimock's WS protocols (6 verified + 2 canary = 8 WS tests):
111
111
112
112
### Gemini Interactions API (Beta)
113
113
@@ -120,13 +120,20 @@ The Gemini Interactions API (`/v1beta/interactions`) is covered by 4 drift tests
120
120
121
121
Uses `describe.skipIf(!GOOGLE_API_KEY)` like other Gemini tests. The Interactions API is in Beta — shapes may shift as Google iterates on the endpoint.
122
122
123
-
| Protocol | Text | Tool Call | Real Endpoint | Status |
| Gemini Live | — | — |`wss://generativelanguage.googleapis.com/ws/...BidiGenerateContent`| Unverified |
128
129
129
-
**Models**: `gpt-4o-mini` for Responses WS, `gpt-4o-mini-realtime-preview` for Realtime.
130
+
**Models**: `gpt-4o-mini` for Responses WS, `gpt-realtime-2` for Realtime GA (was `gpt-4o-mini-realtime-preview`).
131
+
132
+
**GA Realtime Drift Tests**:
133
+
134
+
-**Model canary** — Verifies all 5 GA models exist (`gpt-realtime-2`, `gpt-realtime-1.5`, `gpt-realtime-mini`, `gpt-realtime-translate`, `gpt-realtime-whisper`) and flags unknown realtime models
135
+
-**Protocol probe** — Connects with both GA and Beta protocol, normalizes event sequences, and verifies consistency
136
+
-**Event shape validation** — GA event names (`response.output_text.delta`, `conversation.item.added`, `conversation.item.done`) and nested session config (`session.audio.*`, `session.type`, `session.reasoning`)
130
137
131
138
**Auth**: Uses the same `OPENAI_API_KEY` and `GOOGLE_API_KEY` environment variables as HTTP tests. No new secrets needed.
132
139
@@ -175,4 +182,4 @@ The fix workflow also supports `workflow_dispatch` for manual runs.
175
182
176
183
## Cost
177
184
178
-
~29 API calls per run (20 HTTP response-shape + 3 model listing + 6 WS including canaries) using the cheapest available models (`gpt-4o-mini`, `gpt-4o-mini-realtime-preview`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.20/week at daily cadence. When Gemini Live text-capable models become available, the 2 canary tests will become full drift tests, increasing real WS connections from 4 to 6.
185
+
~31 API calls per run (20 HTTP response-shape + 3 model listing + 8 WS including canaries) using the cheapest available models (`gpt-4o-mini`, `gpt-realtime-2`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.25/week at daily cadence. The GA protocol probe adds a second Realtime WS connection (one GA, one Beta) per run. When Gemini Live text-capable models become available, the 2 canary tests will become full drift tests, increasing real WS connections from 6 to 8.
Run them all on one port with `npx @copilotkit/aimock --config aimock.json`, or use the programmatic API to compose exactly what you need.
48
48
49
49
## Features
50
50
51
51
-**[Record & Replay](https://aimock.copilotkit.dev/record-replay)** — Proxy real APIs, save as fixtures, replay deterministically forever
52
52
-**[Multi-turn Conversations](https://aimock.copilotkit.dev/multi-turn)** — Record and replay multi-turn traces with tool rounds; match distinct turns via `turnIndex`, `hasToolResult`, `toolCallId`, `sequenceIndex`, `systemMessage` (gate on host-supplied agent context), or custom predicates
-**[MCP](https://aimock.copilotkit.dev/mcp-mock) / [A2A](https://aimock.copilotkit.dev/a2a-mock) / [AG-UI](https://aimock.copilotkit.dev/agui-mock) / [Vector](https://aimock.copilotkit.dev/vector-mock)** — Mock every protocol your AI agents use
56
56
-**[Chaos Testing](https://aimock.copilotkit.dev/chaos-testing)** — 500 errors, malformed JSON, mid-stream disconnects at any probability
57
57
-**Per-Request Strict Mode** — `X-AIMock-Strict` header overrides the server-level `--strict` flag per request (`true`/`1` = strict, `false`/`0` = lenient)
58
58
-**[Drift Detection](https://aimock.copilotkit.dev/drift-detection)** — Daily CI validation against real APIs
59
59
-**[Streaming Physics](https://aimock.copilotkit.dev/streaming-physics)** — Configurable `ttft`, `tps`, and `jitter`
60
-
-**[WebSocket APIs](https://aimock.copilotkit.dev/websocket)** — OpenAI Realtime, Responses WS, Gemini Live
0 commit comments