Commit e5870ed

docs: document WS drift coverage, bump to 1.3.3
DRIFT.md: WS coverage table with verified/unverified status, Gemini Live explanation, cost estimate (25 API calls), "Adding a New Provider" WS step.
README.md: fix Gemini Live response shape example, update model name, add unverified warning, fix Responses WS example to use flat format.
docs/index.html: add unverified note to Gemini Live in feature list and comparison table.
CHANGELOG.md: 1.3.3 patch notes.
vitest.config.drift.ts: increase testTimeout to 60s for WS protocols.
1 parent 086f45f commit e5870ed

6 files changed

Lines changed: 55 additions & 18 deletions

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
 # @copilotkit/llmock
 
+## 1.3.3
+
+### Patch Changes
+
+- Fix Responses WS handler to accept flat `response.create` format matching the real OpenAI API (previously required a non-standard nested `response: { ... }` envelope)
+- WebSocket drift detection tests: TLS client for real provider WS endpoints, 4 verified drift tests (Responses WS + Realtime), Gemini Live canary for text-capable model availability
+- Realtime model canary: detects when `gpt-4o-mini-realtime-preview` is deprecated and suggests GA replacement
+- Gemini Live documented as unverified (no text-capable `bidiGenerateContent` model exists yet)
+- Fix README Gemini Live response shape example (`modelTurn.parts`, not `modelTurnComplete`)
+
 ## 1.3.2
 
 ### Patch Changes

DRIFT.md

Lines changed: 27 additions & 2 deletions
@@ -101,7 +101,32 @@ When a model is deprecated:
 3. Add raw fetch client functions to `src/__tests__/drift/providers.ts`
 4. Create `src/__tests__/drift/<provider>.drift.ts` with 4 test scenarios
 5. Add model listing function to `providers.ts` and model check to `models.drift.ts`
-6. Update the allowlist in `schema.ts` if needed
+6. If the provider uses WebSocket, add protocol functions to `ws-providers.ts` and create `ws-<provider>.drift.ts`
+7. Update the allowlist in `schema.ts` if needed
+
+## WebSocket Drift Coverage
+
+In addition to the 19 existing drift tests (16 HTTP response-shape + 3 model deprecation), WebSocket drift tests cover llmock's WS protocols:
+
+| Protocol | Text | Tool Call | Real Endpoint | Status |
+| ------------------- | ---- | --------- | ------------------------------------------------------------------- | ---------- |
+| OpenAI Responses WS ||| `wss://api.openai.com/v1/responses` | Verified |
+| OpenAI Realtime ||| `wss://api.openai.com/v1/realtime` | Verified |
+| Gemini Live ||| `wss://generativelanguage.googleapis.com/ws/...BidiGenerateContent` | Unverified |
+
+**Models**: `gpt-4o-mini` for Responses WS, `gpt-4o-mini-realtime-preview` for Realtime.
+
+**Auth**: Uses the same `OPENAI_API_KEY` and `GOOGLE_API_KEY` environment variables as HTTP tests. No new secrets needed.
+
+**How it works**: A TLS WebSocket client (`ws-providers.ts`) connects to real provider endpoints using `node:tls` with RFC 6455 framing. Each protocol function handles the setup sequence (e.g., Realtime session negotiation, Gemini Live setup/setupComplete) and collects messages until a terminal event. The mock side uses the existing `ws-test-client.ts` plaintext client against the local llmock server.
+
+### Gemini Live: unverified
+
+llmock's Gemini Live handler implements the text-based `BidiGenerateContent` protocol as documented in Google's [Live API reference](https://ai.google.dev/api/live): `setup`/`setupComplete` handshake, `clientContent` with turns, `serverContent` with `modelTurn.parts[].text`, and `toolCall` responses. The protocol format is correct per the docs.
+
+However, as of March 2026, the only models that support `bidiGenerateContent` are native-audio models (`gemini-2.5-flash-native-audio-*`), which reject text-only requests. No text-capable model exists for this endpoint yet, so we cannot triangulate llmock's output against a real API response.
+
+A canary test (`ws-gemini-live.drift.ts`) queries the Gemini model listing API on each drift run and checks for a non-audio model that supports `bidiGenerateContent`. When Google ships one, the canary will flag it and the full drift tests can be enabled.
 
 ## CI Schedule
 
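
Reviewer note: the canary described in the DRIFT.md hunk above (find a non-audio model that supports `bidiGenerateContent`) could be sketched roughly as below. This is not the contents of `ws-gemini-live.drift.ts`. The function name, the sample model names, and the rule "audio models have `audio` in their name" are assumptions for illustration; only `name` and `supportedGenerationMethods` are taken from the public Gemini model listing shape.

```typescript
// One entry from a model listing response. Only the two fields used here are
// modelled; the interface name is hypothetical.
interface ListedModel {
  name: string;
  supportedGenerationMethods?: string[];
}

// Assumption: audio-only Live models can be recognised by "audio" in the name,
// as with `gemini-2.5-flash-native-audio-*`. The real canary may use another rule.
function findTextCapableLiveModels(models: ListedModel[]): string[] {
  return models
    .filter((m) => (m.supportedGenerationMethods ?? []).includes("bidiGenerateContent"))
    .filter((m) => !m.name.toLowerCase().includes("audio"))
    .map((m) => m.name);
}

// Sample data with made-up model names: no text-capable Live model is present.
const sampleModels: ListedModel[] = [
  { name: "models/gemini-2.5-flash", supportedGenerationMethods: ["generateContent"] },
  {
    name: "models/gemini-2.5-flash-native-audio-example",
    supportedGenerationMethods: ["bidiGenerateContent"],
  },
];

console.log(findTextCapableLiveModels(sampleModels)); // []
```

A drift test built on this idea would pass while the list is empty and fail, loudly, as soon as it is not, which is the signal to enable the full Gemini Live drift tests.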
@@ -115,4 +140,4 @@ See `.github/workflows/test-drift.yml`.
 
 ## Cost
 
-~20 API calls per run using the cheapest available models (`gpt-4o-mini`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.01/week.
+~25 API calls per run (16 HTTP response-shape + 3 model listing + 4 WS + 2 canaries) using the cheapest available models (`gpt-4o-mini`, `gpt-4o-mini-realtime-preview`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.02/week. When Gemini Live text-capable models become available, this will increase to 6 WS calls.
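
Reviewer note: DRIFT.md says the real-endpoint client speaks RFC 6455 framing directly over `node:tls`. For readers unfamiliar with that layer, here is a minimal sketch of how a client-to-server text frame is built (FIN bit, text opcode, mask bit, length encoding, XOR masking). It is an illustration under assumptions, not the code in `ws-providers.ts`; `encodeClientTextFrame` is a hypothetical name.

```typescript
import { randomBytes } from "node:crypto";

// Encode one unfragmented text frame as a client must send it: RFC 6455
// requires every client-to-server frame to be masked with a 4-byte key.
function encodeClientTextFrame(text: string, maskKey: Buffer = randomBytes(4)): Buffer {
  const payload = Buffer.from(text, "utf8");
  const len = payload.length;

  let header: Buffer;
  if (len < 126) {
    // 0x81 = FIN + text opcode; 0x80 = mask bit, low 7 bits = length.
    header = Buffer.from([0x81, 0x80 | len]);
  } else if (len < 65536) {
    // Length marker 126 is followed by a 16-bit big-endian length.
    header = Buffer.alloc(4);
    header[0] = 0x81;
    header[1] = 0x80 | 126;
    header.writeUInt16BE(len, 2);
  } else {
    // Length marker 127 is followed by a 64-bit big-endian length.
    header = Buffer.alloc(10);
    header[0] = 0x81;
    header[1] = 0x80 | 127;
    header.writeBigUInt64BE(BigInt(len), 2);
  }

  // Mask the payload by XOR-ing each byte with the key, cycling every 4 bytes.
  const masked = Buffer.alloc(len);
  for (let i = 0; i < len; i++) {
    masked[i] = payload[i] ^ maskKey[i % 4];
  }
  return Buffer.concat([header, maskKey, masked]);
}

// Fixed key only so the bytes are reproducible; real clients use a random key.
const frame = encodeClientTextFrame("Hi", Buffer.from([1, 2, 3, 4]));
console.log(frame.toString("hex")); // 818201020304496b
```

Server-to-client frames are unmasked, so a matching decoder reads the same header layout without the XOR step.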

README.md

Lines changed: 12 additions & 12 deletions
@@ -500,7 +500,7 @@ WebSocket endpoints:
 
 - **WS `/v1/responses`** — OpenAI Responses API over WebSocket
 - **WS `/v1/realtime`** — OpenAI Realtime API (text + tool calls)
-- **WS `/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`** — Gemini Live
+- **WS `/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`** — Gemini Live ([unverified](#gemini-live-bidigeneratecontent))
 
 All endpoints share the same fixture pool — the same fixtures work across all providers. Requests are translated to a common format internally for fixture matching.
 
@@ -518,13 +518,11 @@ Connect to `ws://localhost:5555/v1/responses` and send a `response.create` event
 // → Client sends:
 {
   "type": "response.create",
-  "response": {
-    "modalities": ["text"],
-    "instructions": "You are a helpful assistant.",
-    "input": [
-      { "type": "message", "role": "user", "content": [{ "type": "input_text", "text": "Hello" }] },
-    ],
-  },
+  "model": "gpt-4o",
+  "instructions": "You are a helpful assistant.",
+  "input": [
+    { "type": "message", "role": "user", "content": [{ "type": "input_text", "text": "Hello" }] },
+  ],
 }
 
 // ← Server streams:
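
Reviewer note: test code written against 1.3.2 may still send the nested `response: { ... }` envelope that this hunk removes from the README. A small migration helper along the following lines might ease the switch to the flat format. It is a sketch only: the type names and `toFlatResponseCreate` are hypothetical and not part of llmock, and it mirrors the README change by dropping `modalities` and requiring an explicit `model`.

```typescript
// Shapes modelled on the before/after README examples in this commit.
type InputItem = {
  type: string;
  role: string;
  content: Array<{ type: string; text: string }>;
};

interface NestedResponseCreate {
  type: "response.create";
  response: { modalities?: string[]; instructions?: string; input: InputItem[] };
}

interface FlatResponseCreate {
  type: "response.create";
  model: string;
  instructions?: string;
  input: InputItem[];
}

// Unwrap the nested envelope into the flat event shown in the updated README.
function toFlatResponseCreate(nested: NestedResponseCreate, model: string): FlatResponseCreate {
  const { instructions, input } = nested.response;
  return {
    type: "response.create",
    model,
    ...(instructions !== undefined ? { instructions } : {}),
    input,
  };
}

const nestedEvent: NestedResponseCreate = {
  type: "response.create",
  response: {
    modalities: ["text"],
    instructions: "You are a helpful assistant.",
    input: [{ type: "message", role: "user", content: [{ type: "input_text", text: "Hello" }] }],
  },
};

console.log(JSON.stringify(toFlatResponseCreate(nestedEvent, "gpt-4o"), null, 2));
```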
@@ -567,19 +565,21 @@ Connect to `ws://localhost:5555/v1/realtime`. The Realtime API uses a session-ba
 
 ### Gemini Live (BidiGenerateContent)
 
-Connect to `ws://localhost:5555/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`. Gemini Live uses a setup/content/response flow:
+Connect to `ws://localhost:5555/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`. Gemini Live uses a setup/content/response flow.
+
+> **⚠️ Unverified**: As of March 2026, Google's only `bidiGenerateContent`-capable models are audio-only — no text-capable model exists for this endpoint. llmock implements the text-based protocol as documented in Google's [Live API reference](https://ai.google.dev/api/live), but the response shapes have not been verified against real API output. Code you write against this mock may need adjustment when Google ships a text-capable Live model. See [DRIFT.md](DRIFT.md#gemini-live-unverified) for details and the automated canary that tracks model availability.
 
 ```jsonc
 // → Setup message (must be first):
-{ "setup": { "model": "models/gemini-2.0-flash-live", "generationConfig": { "responseModalities": ["TEXT"] } } }
+{ "setup": { "model": "models/gemini-2.5-flash", "generationConfig": { "responseModalities": ["TEXT"] } } }
 
 // → Send user content:
 { "clientContent": { "turns": [{ "role": "user", "parts": [{ "text": "Hello" }] }], "turnComplete": true } }
 
 // ← Server streams:
 // {"setupComplete": {}}
-// {"serverContent": {"modelTurnComplete": false, "parts": [{"text": "Hello"}]}}
-// {"serverContent": {"modelTurnComplete": true}}
+// {"serverContent": {"modelTurn": {"parts": [{"text": "Hello"}]}, "turnComplete": false}}
+// {"serverContent": {"modelTurn": {"parts": [{"text": "!"}]}, "turnComplete": true}}
 ```
 
 ## CLI
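
Reviewer note: with the corrected shape (`modelTurn.parts[].text` plus a sibling `turnComplete`), a test client can reassemble the streamed text as sketched below. The interface and function names are hypothetical, and since the Gemini Live shapes are unverified against a real endpoint, treat this as matching the mock's documented output only.

```typescript
// Minimal model of the server messages shown in the README example.
interface GeminiLiveServerMessage {
  setupComplete?: Record<string, never>;
  serverContent?: {
    modelTurn?: { parts?: Array<{ text?: string }> };
    turnComplete?: boolean;
  };
}

// Concatenate text parts in arrival order and stop at the end of the turn.
function collectTurnText(messages: GeminiLiveServerMessage[]): string {
  let text = "";
  for (const msg of messages) {
    const content = msg.serverContent;
    if (!content) continue; // e.g. the initial setupComplete message
    for (const part of content.modelTurn?.parts ?? []) {
      text += part.text ?? "";
    }
    if (content.turnComplete) break;
  }
  return text;
}

// The three messages from the README example, already parsed from JSON.
const streamed: GeminiLiveServerMessage[] = [
  { setupComplete: {} },
  { serverContent: { modelTurn: { parts: [{ text: "Hello" }] }, turnComplete: false } },
  { serverContent: { modelTurn: { parts: [{ text: "!" }] }, turnComplete: true } },
];

console.log(collectTurnText(streamed)); // Hello!
```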

docs/index.html

Lines changed: 4 additions & 2 deletions
@@ -1199,7 +1199,9 @@ <h3>WebSocket APIs</h3>
 <ul>
   <li>OpenAI Responses API over WebSocket</li>
   <li>OpenAI Realtime API — text + tool calls</li>
-  <li>Gemini Live BidiGenerateContent</li>
+  <li>
+    Gemini Live BidiGenerateContent (unverified — no text-capable model exists yet)
+  </li>
   <li>No audio/video — text and tool call paths only</li>
 </ul>
 </div>
@@ -1308,7 +1310,7 @@ <h2 class="section-title">llmock vs MSW</h2>
   <td class="manual">Manual — build data SSE yourself</td>
 </tr>
 <tr>
-  <td>WebSocket APIs (Realtime, Gemini Live)</td>
+  <td>WebSocket APIs (Realtime, Gemini Live*)</td>
   <td class="yes">Built-in ✓</td>
   <td class="no">No</td>
 </tr>

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@copilotkit/llmock",
-  "version": "1.3.2",
+  "version": "1.3.3",
   "description": "Deterministic mock LLM server for testing (OpenAI, Anthropic, Gemini)",
   "license": "MIT",
   "packageManager": "pnpm@10.28.2",

vitest.config.drift.ts

Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,6 @@ export default defineConfig({
     environment: "node",
     globals: true,
     include: ["src/__tests__/drift/**/*.drift.ts"],
-    testTimeout: 30000,
+    testTimeout: 60000,
   },
 });
