Tool call arguments lost (sent as {}) on OpenAI-compatible providers that stream id only in the first SSE chunk #1495

@alpo

Description

When using an OpenAI-compatible provider whose streaming responses omit the id field on tool-call continuation chunks (e.g. GLM-5 via OpenRouter), aichat dispatches the tool with arguments: "{}" instead of the actual arguments. The tool fails with a missing-argument error and the LLM retries indefinitely.

The same prompt works correctly with providers that deliver all tool-call data in a single SSE chunk per call (e.g. Grok-4 via xAI).

To Reproduce

  1. Configure an OpenAI-compatible model that streams tool-call id only in the first delta chunk and argument fragments in subsequent chunks without repeating the id (GLM-5 via OpenRouter reproduces this reliably).
  2. Ask a question that causes the LLM to invoke a tool that has required arguments (e.g. web_search which requires query).
  3. Observe the tool is called with {} and returns run() missing 1 required positional argument: 'query'.

Expected behavior

All argument fragments are accumulated into a single JSON object and dispatched to the tool correctly.

Logs

GLM-5 stream for a single tool call:

Chunk 1 — id present, arguments empty

stream-data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_95df1cc8dabc4b959bbbc431","type":"function","function":{"name":"web_search","arguments":""}}]}}],"model":"z-ai/glm-5-20260211","provider":"AtlasCloud"}

Chunk 2 — id ABSENT, arguments still empty ← spurious boundary triggered here

stream-data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":""}}]}}]}

Chunk 3 — id ABSENT, first real argument fragment

stream-data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"max_results\": 10, \"query\": \"ESP32 command line development best practices\""}}]}}]}

Chunk 4 — id ABSENT, closing brace

stream-data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"}"}}]}}]}

Follow-up request sent to the model — arguments reduced to {}:

{role:assistant,tool_calls:[{id:call_95df1cc8dabc4b959bbbc431,type:function,function:{name:web_search,arguments:{}}}]}
{role:tool,content:{"output":"run() missing 1 required positional argument: 'query'"},tool_call_id:call_95df1cc8dabc4b959bbbc431}

Root cause

openai_chat_completions_streaming in src/client/openai.rs computes the call-boundary key as:

let maybe_call_id = format!("{}/{}", id.unwrap_or_default(), index.unwrap_or_default());
if maybe_call_id != call_id { ... flush and reset ... }

Per the OpenAI streaming spec, id is present only in the first delta chunk for a given tool call. Continuation chunks carry argument fragments with the same index but no id. When id is absent, id.unwrap_or_default() yields "", producing maybe_call_id = "/0". This does not match the stored call_id = "call_95df.../0", so a spurious boundary is detected: function_arguments is flushed (empty at that point) and cleared. The subsequent argument fragments are appended to the now-empty buffer, but the tool is dispatched with the flushed empty string, normalised to {}.
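The failure can be replayed outside aichat with a reduced model of the accumulator. This is only a sketch under assumptions: `Delta`, `flush`, `dispatch_unguarded` and `glm5_chunks` are hypothetical names, and the flush behaviour (dispatch only when a function name is pending, empty arguments normalised to `{}`) is inferred from the description above, not copied from src/client/openai.rs.

```rust
/// Hypothetical, reduced view of one streamed tool-call delta.
struct Delta<'a> {
    id: Option<&'a str>,
    index: Option<u64>,
    name: Option<&'a str>,
    arguments: &'a str,
}

/// Emits the pending call (if a name is pending), normalising empty
/// arguments to `{}`, then resets both accumulators.
fn flush(name: &mut String, arguments: &mut String, out: &mut Vec<(String, String)>) {
    if !name.is_empty() {
        let args = if arguments.is_empty() {
            "{}".to_string()
        } else {
            std::mem::take(arguments)
        };
        out.push((std::mem::take(name), args));
    }
    name.clear();
    arguments.clear();
}

/// Replays the unguarded `{id}/{index}` boundary check and returns the
/// (name, arguments) pairs that would be dispatched.
fn dispatch_unguarded(deltas: &[Delta<'_>]) -> Vec<(String, String)> {
    let (mut call_id, mut name, mut arguments) = (String::new(), String::new(), String::new());
    let mut dispatched = Vec::new();
    for d in deltas {
        // A missing id collapses to "", so a continuation chunk yields "/0".
        let maybe_call_id = format!("{}/{}", d.id.unwrap_or_default(), d.index.unwrap_or_default());
        if maybe_call_id != call_id {
            flush(&mut name, &mut arguments, &mut dispatched);
            call_id = maybe_call_id;
        }
        if let Some(n) = d.name {
            name.push_str(n);
        }
        arguments.push_str(d.arguments);
    }
    flush(&mut name, &mut arguments, &mut dispatched); // end of stream
    dispatched
}

/// The four GLM-5 chunks from the Logs section.
fn glm5_chunks() -> Vec<Delta<'static>> {
    vec![
        Delta { id: Some("call_95df1cc8dabc4b959bbbc431"), index: Some(0), name: Some("web_search"), arguments: "" },
        Delta { id: None, index: Some(0), name: None, arguments: "" },
        Delta { id: None, index: Some(0), name: None, arguments: r#"{"max_results": 10, "query": "ESP32 command line development best practices""# },
        Delta { id: None, index: Some(0), name: None, arguments: "}" },
    ]
}

fn main() {
    println!("{:?}", dispatch_unguarded(&glm5_chunks()));
}
```

In this model the only dispatched call is web_search with {}: chunk 2 flushes the call before any argument text has arrived, and the fragments from chunks 3 and 4 are left in a buffer that has no function name attached, so they are never dispatched.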

Prior fix attempts

Commit 7d33a2c

Commit 7d33a2c ("fix: stream function calling for some openai-compatible clients", 2024-11-29) introduced the {id}/{index} formula specifically to handle providers that reuse index: 0 across every tool call, where index alone is insufficient to detect boundaries. That commit also added a len() >= len() guard intended to suppress the spurious "/0" boundary on continuation chunks:

if maybe_call_id != call_id && maybe_call_id.len() >= call_id.len() {

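This partially worked because "/0" (2 chars) is shorter than a real call key like "call_abc/0" (10 chars), so the guard rejects it. However, it broke parallel calls where a later call's id string happens to be shorter than the previous one's: the new key fails the length check and the boundary between the two calls is missed.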
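To make both outcomes of the guard concrete, here is a minimal sketch of the predicate on its own. The helper names `key` and `is_boundary_with_len_guard` and the ids "call_xyz" and "c_9" are made up for illustration; only the comparison itself comes from the line quoted above.

```rust
/// Builds the `{id}/{index}` key; a missing id or index falls back to its default.
fn key(id: Option<&str>, index: Option<u64>) -> String {
    format!("{}/{}", id.unwrap_or_default(), index.unwrap_or_default())
}

/// Boundary predicate with the length guard from commit 7d33a2c.
fn is_boundary_with_len_guard(maybe_call_id: &str, call_id: &str) -> bool {
    maybe_call_id != call_id && maybe_call_id.len() >= call_id.len()
}

fn main() {
    let current = key(Some("call_abc"), Some(0));

    // Continuation chunk without id: key is "/0", shorter than the stored key.
    let continuation = key(None, Some(0));
    // New call whose id has the same length as the previous one.
    let same_length = key(Some("call_xyz"), Some(1));
    // New call whose id is shorter than the previous one (hypothetical id).
    let shorter = key(Some("c_9"), Some(1));

    for candidate in [&continuation, &same_length, &shorter] {
        println!(
            "{:?} vs {:?}: boundary = {}",
            candidate,
            current,
            is_boundary_with_len_guard(candidate, &current)
        );
    }
}
```

The continuation key is correctly ignored, but so is the shorter new-call key, which is the parallel-calls regression: the guard cannot tell "no id at all" apart from "a new, shorter id".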

My previous fix attempt (unpublished)

Removing the len() guard, so that the check is again just if maybe_call_id != call_id {, fixed the parallel-calls regression but fully re-exposed the continuation-chunk boundary bug, which is what produces the {} arguments observed here.

Fix

Only advance the boundary when id is present in the chunk — i.e. only on the first chunk of a new call. Continuation chunks have no id and must pass through without touching the boundary logic:

// `id` is only present in the first chunk for a given tool call;
// continuation chunks omit it.  Only advance the boundary when `id`
// is present so that continuation chunks never trigger a spurious
// flush.  Using `{id}/{index}` retains uniqueness for providers
// that reuse `index: 0` across every call.
if let Some(call_id_str) = id {
    let maybe_call_id = format!("{}/{}", call_id_str, index.unwrap_or_default());
    if maybe_call_id != call_id {
        // ... flush previous call, reset accumulators, update call_id
    }
}

This preserves the {id}/{index} uniqueness from 7d33a2c, fixes the continuation-chunk regression without the fragile len() guard, and handles each of the following scenarios correctly:

| Scenario | Before fix | After fix |
| --- | --- | --- |
| Single-chunk call (Grok-4) | ✓ | ✓ |
| Multi-chunk call, id absent in continuations (GLM-5) | ✗ args lost | ✓ |
| Parallel calls, provider reuses index: 0 | | ✓ |
| Parallel calls, standard incrementing index | | ✓ |
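The proposed check can be exercised against two of these scenarios with a reduced model of the accumulator. As before, this is a sketch under assumptions rather than the actual patch: `Delta`, `first`, `cont`, `flush`, `dispatch_fixed` and the sample streams are hypothetical, the ids "call_a" and "call_b" and the parallel-call queries are invented, and the flush behaviour is inferred from this report.

```rust
/// Hypothetical, reduced view of one streamed tool-call delta.
struct Delta<'a> {
    id: Option<&'a str>,
    index: Option<u64>,
    name: Option<&'a str>,
    arguments: &'a str,
}

/// First chunk of a call: carries id and name, no argument text yet.
fn first<'a>(id: &'a str, index: u64, name: &'a str) -> Delta<'a> {
    Delta { id: Some(id), index: Some(index), name: Some(name), arguments: "" }
}

/// Continuation chunk: argument fragment only, id omitted.
fn cont(index: u64, arguments: &str) -> Delta<'_> {
    Delta { id: None, index: Some(index), name: None, arguments }
}

/// Emits the pending call (if any), normalising empty arguments to `{}`.
fn flush(name: &mut String, arguments: &mut String, out: &mut Vec<(String, String)>) {
    if !name.is_empty() {
        let args = if arguments.is_empty() { "{}".to_string() } else { std::mem::take(arguments) };
        out.push((std::mem::take(name), args));
    }
    name.clear();
    arguments.clear();
}

/// Boundary logic as proposed: only a chunk that carries an id may start a new call.
fn dispatch_fixed(deltas: &[Delta<'_>]) -> Vec<(String, String)> {
    let (mut call_id, mut name, mut arguments) = (String::new(), String::new(), String::new());
    let mut dispatched = Vec::new();
    for d in deltas {
        if let Some(id) = d.id {
            let maybe_call_id = format!("{}/{}", id, d.index.unwrap_or_default());
            if maybe_call_id != call_id {
                flush(&mut name, &mut arguments, &mut dispatched);
                call_id = maybe_call_id;
            }
        }
        if let Some(n) = d.name {
            name.push_str(n);
        }
        arguments.push_str(d.arguments);
    }
    flush(&mut name, &mut arguments, &mut dispatched); // end of stream
    dispatched
}

/// The GLM-5 stream from the Logs section.
fn glm5_chunks() -> Vec<Delta<'static>> {
    vec![
        first("call_95df1cc8dabc4b959bbbc431", 0, "web_search"),
        cont(0, ""),
        cont(0, r#"{"max_results": 10, "query": "ESP32 command line development best practices""#),
        cont(0, "}"),
    ]
}

/// Two calls from a provider that reuses index 0 (made-up ids and queries).
fn parallel_reused_index() -> Vec<Delta<'static>> {
    vec![
        first("call_a", 0, "web_search"),
        cont(0, r#"{"query": "first"}"#),
        first("call_b", 0, "web_search"),
        cont(0, r#"{"query": "second"}"#),
    ]
}

fn main() {
    println!("{:?}", dispatch_fixed(&glm5_chunks()));
    println!("{:?}", dispatch_fixed(&parallel_reused_index()));
}
```

Because continuation chunks never reach the comparison, the GLM-5 stream accumulates into one complete JSON string, while the change of id still separates the two calls that share index 0.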

Configuration

Relevant excerpt from aichat --info:

model                   openrouter:z-ai/glm-5

Environment

  • OS: Linux (Void / Debian)
  • aichat version: main branch

Additional context

The bug affects all OpenAI-compatible clients that delegate to openai_chat_completions_streaming (including azure_openai, openai_compatible). Any provider whose streaming implementation follows the spec strictly — emitting id only on the first delta chunk — will trigger it.
