llm-raw-stream-capture-20260617-170424-7573dfc7.json
Summary
Streaming responses that emit multiple parallel tool calls collapse into a single tool call. Only the last tool call survives in the final ToolCallComplete delta; every earlier call is silently overwritten, and its accumulated argument fragments are lost.
A single tool call per turn works correctly. The bug only manifests when a model returns 2+ parallel tool calls in one assistant turn, which is the standard behavior of OpenAI-compatible providers for parallel function calling.
Root cause
In CompletionsConversionTrait, both convertStreamToToolCalls() and yieldToolCallDeltas() iterate delta.tool_calls using the PHP array key as the tool-call slot:
foreach ($data['choices'][0]['delta']['tool_calls'] as $i => $toolCall) {
    if (isset($toolCall['id'])) {
        // initialize tool call
        $toolCalls[$i] = [
            'id' => $toolCall['id'],
            'function' => $toolCall['function'],
        ];
        continue;
    }

    // add arguments delta to tool call
    if (isset($toolCall['function']['arguments'])) {
        if (!isset($toolCalls[$i]['function']['arguments'])) {
            $toolCalls[$i]['function']['arguments'] = '';
        }

        $toolCalls[$i]['function']['arguments'] .= $toolCall['function']['arguments'];
    }
}
The same $i array key is used in yieldToolCallDeltas():
foreach ($data['choices'][0]['delta']['tool_calls'] ?? [] as $i => $toolCall) {
    if (isset($toolCall['id'])) {
        yield new ToolCallStart($toolCall['id'], $toolCall['function']['name']);
    } elseif (isset($toolCall['function']['arguments'])) {
        yield new ToolInputDelta($toolCalls[$i]['id'] ?? '', $toolCalls[$i]['function']['name'] ?? '', $toolCall['function']['arguments']);
    }
}
The problem: in an OpenAI-compatible stream, each chunk carries a single-element tool_calls array, so the PHP array key $i is always 0, regardless of which tool call the chunk belongs to. The real tool-call position is carried in the tool_calls[].index field, which is never read.
Consequences:
- Every tool call is written to $toolCalls[0], overwriting the previous one.
- Argument-only chunks for any tool are appended to $toolCalls[0] (the last surviving call).
- At finish_reason: tool_calls, only $toolCalls[0] (the last call) survives.
Reproduction
A minimal OpenAI-compatible SSE stream for two parallel tool calls. Note each data: chunk's delta.tool_calls is a single-element array with the real position in index:
data: {"choices":[{"index":0,"delta":{"role":"assistant","tool_calls":[{"index":0,"id":"call_a","type":"function","function":{"name":"get_weather","arguments":""}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"city\":"}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\"Paris\"}"}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":1,"id":"call_b","type":"function","function":{"name":"get_time","arguments":""}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":1,"function":{"arguments":"{\"tz\":"}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":1,"function":{"arguments":"\"CET\"}"}}]}}]}
data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}
data: [DONE]
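To make the mismatch between array key and provider index visible, the following sketch decodes a shortened version of the stream above and records both values for every tool-call delta. `parseSseChunks()` is a hypothetical helper written for this illustration, not part of the library, and it assumes each event fits on a single `data: ` line.

```php
<?php
// Hypothetical helper: decode the "data: " lines of an SSE body into arrays.
function parseSseChunks(string $body): array
{
    $chunks = [];
    foreach (preg_split('/\R/', $body) as $line) {
        if (!str_starts_with($line, 'data: ')) {
            continue; // skip blank lines and anything that is not a data line
        }
        $payload = substr($line, 6);
        if ('[DONE]' === $payload) {
            break; // end-of-stream sentinel
        }
        $chunks[] = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);
    }

    return $chunks;
}

// Shortened stream: only the two "start" chunks of the reproduction.
$body = <<<'SSE'
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_a","type":"function","function":{"name":"get_weather","arguments":""}}]}}]}
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":1,"id":"call_b","type":"function","function":{"name":"get_time","arguments":""}}]}}]}
data: [DONE]
SSE;

// Collect [PHP array key, provider index] for every tool-call delta.
$pairs = [];
foreach (parseSseChunks($body) as $chunk) {
    foreach ($chunk['choices'][0]['delta']['tool_calls'] ?? [] as $key => $toolCall) {
        $pairs[] = [$key, $toolCall['index']];
    }
}

var_dump($pairs); // the array key is 0 in both pairs; the index is 0, then 1
```

The array key never changes because each chunk's tool_calls array has exactly one element; only the index field distinguishes the two calls.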
Trace through convertStreamToToolCalls() (PHP array key $i in parentheses):
| chunk (index) | $i | branch | resulting $toolCalls |
| --- | --- | --- | --- |
| index:0, id:call_a | 0 | id set | $toolCalls[0] = call_a |
| index:0, args | 0 | args | call_a += {"city": |
| index:0, args | 0 | args | call_a += "Paris"} |
| index:1, id:call_b | 0 | id set | $toolCalls[0] = call_b (overwrites call_a) |
| index:1, args | 0 | args | call_b += {"tz": |
| index:1, args | 0 | args | call_b += "CET"} |
| finish: tool_calls | - | - | ToolCallComplete([call_b]) |
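The trace can be replayed outside the library with a small standalone script. This is a sketch, not the trait itself: `accumulateByArrayKey()` is a stand-in that copies the accumulation logic quoted under "Root cause", and `makeChunk()` is a hypothetical helper that wraps one tool-call delta in the chunk envelope.

```php
<?php
// Stand-in for the quoted loop: slots are keyed by the PHP array key $i.
function accumulateByArrayKey(array $chunks): array
{
    $toolCalls = [];
    foreach ($chunks as $data) {
        foreach ($data['choices'][0]['delta']['tool_calls'] ?? [] as $i => $toolCall) {
            if (isset($toolCall['id'])) {
                $toolCalls[$i] = ['id' => $toolCall['id'], 'function' => $toolCall['function']];
                continue;
            }
            if (isset($toolCall['function']['arguments'])) {
                $toolCalls[$i]['function']['arguments'] ??= '';
                $toolCalls[$i]['function']['arguments'] .= $toolCall['function']['arguments'];
            }
        }
    }

    return $toolCalls;
}

// Hypothetical helper: wrap a single tool-call delta in a stream chunk.
function makeChunk(array $toolCall): array
{
    return ['choices' => [['index' => 0, 'delta' => ['tool_calls' => [$toolCall]]]]];
}

// The six tool-call chunks from the reproduction stream.
$chunks = [
    makeChunk(['index' => 0, 'id' => 'call_a', 'function' => ['name' => 'get_weather', 'arguments' => '']]),
    makeChunk(['index' => 0, 'function' => ['arguments' => '{"city":']]),
    makeChunk(['index' => 0, 'function' => ['arguments' => '"Paris"}']]),
    makeChunk(['index' => 1, 'id' => 'call_b', 'function' => ['name' => 'get_time', 'arguments' => '']]),
    makeChunk(['index' => 1, 'function' => ['arguments' => '{"tz":']]),
    makeChunk(['index' => 1, 'function' => ['arguments' => '"CET"}']]),
];

$result = accumulateByArrayKey($chunks);

// One entry remains: call_b with arguments {"tz":"CET"}; call_a is gone.
var_dump($result);
```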
Expected vs actual
- Expected: ToolCallComplete with two tool calls: call_a (get_weather, {city: "Paris"}) and call_b (get_time, {tz: "CET"}). ToolCallStart/ToolInputDelta deltas are emitted for both.
- Actual: ToolCallComplete with one tool call, call_b only. call_a and all of its argument fragments are gone.
Suggested fix
Key the accumulator by the provider-supplied index instead of the PHP array key, e.g.:
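foreach ($data['choices'][0]['delta']['tool_calls'] as $i => $toolCall) {
    $index = $toolCall['index'] ?? $i; // prefer provider index, fall back to array key
    if (isset($toolCall['id'])) {
        $toolCalls[$index] = [
            'id' => $toolCall['id'],
            'function' => $toolCall['function'],
        ];
        continue;
    }
    // append arguments to the slot selected by the provider index
    if (isset($toolCall['function']['arguments'])) {
        $toolCalls[$index]['function']['arguments'] ??= '';
        $toolCalls[$index]['function']['arguments'] .= $toolCall['function']['arguments'];
    }
}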
…and the same index-based lookup in yieldToolCallDeltas().
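For the delta side, an index-based lookup could look roughly like the sketch below. It is an illustration under stated assumptions, not the trait's code: events are plain arrays standing in for ToolCallStart and ToolInputDelta, the generator keeps its own lookup table rather than sharing the accumulator, and `yieldDeltasByIndex()` and `makeChunk()` are hypothetical names. The input interleaves fragments of both calls (also an assumption about provider behavior) so that a wrong attribution would be visible.

```php
<?php
// Sketch: emit start/input events, resolving id and name via the provider index.
function yieldDeltasByIndex(iterable $chunks): \Generator
{
    $known = []; // provider index => ['id' => ..., 'name' => ...]
    foreach ($chunks as $data) {
        foreach ($data['choices'][0]['delta']['tool_calls'] ?? [] as $i => $toolCall) {
            $index = $toolCall['index'] ?? $i; // prefer provider index
            if (isset($toolCall['id'])) {
                $known[$index] = [
                    'id' => $toolCall['id'],
                    'name' => $toolCall['function']['name'] ?? '',
                ];
                yield ['type' => 'start'] + $known[$index];
            } elseif (isset($toolCall['function']['arguments'])) {
                yield [
                    'type' => 'input_delta',
                    'id' => $known[$index]['id'] ?? '',
                    'name' => $known[$index]['name'] ?? '',
                    'delta' => $toolCall['function']['arguments'],
                ];
            }
        }
    }
}

// Hypothetical helper: wrap a single tool-call delta in a stream chunk.
function makeChunk(array $toolCall): array
{
    return ['choices' => [['index' => 0, 'delta' => ['tool_calls' => [$toolCall]]]]];
}

$chunks = [
    makeChunk(['index' => 0, 'id' => 'call_a', 'function' => ['name' => 'get_weather', 'arguments' => '']]),
    makeChunk(['index' => 1, 'id' => 'call_b', 'function' => ['name' => 'get_time', 'arguments' => '']]),
    makeChunk(['index' => 0, 'function' => ['arguments' => '{"city":"Paris"}']]),
    makeChunk(['index' => 1, 'function' => ['arguments' => '{"tz":"CET"}']]),
];

// false: discard generator keys so events are numbered sequentially.
$events = iterator_to_array(yieldDeltasByIndex($chunks), false);
```

Each input delta is attributed to the call whose index it carries; with array-key lookup, the third event would have been attributed to call_b.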
This keeps single-tool-call behavior identical (index: 0) while correctly separating parallel tool calls. There are also a couple of related robustness gaps worth noting in the same area:
- The terminating finish_reason: tool_calls chunk often arrives with id/type/function.name set to null on its trailing argument fragment. isset($toolCall['id']) is false for null, so the chunk falls through to the argument branch. That works, but only by accident of the (buggy) index handling. Keying on index makes this robust.
- Argument fragments that arrive before the tool call's id/name start chunk (i.e. an index with no initialized entry yet) are currently dropped; a keyed accumulator could buffer them.
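Both gaps can be covered by an accumulator that creates a slot the first time an index is seen and merges fields into it, instead of replacing the slot on the start chunk. The following is a minimal sketch under that assumption; `accumulateToolCalls()` and `makeChunk()` are hypothetical names, and the merge-instead-of-overwrite behavior is a design choice of this example, not the library's current behavior.

```php
<?php
// Sketch: index-keyed accumulator tolerant of null ids and early fragments.
function accumulateToolCalls(iterable $chunks): array
{
    $toolCalls = [];
    foreach ($chunks as $data) {
        foreach ($data['choices'][0]['delta']['tool_calls'] ?? [] as $i => $toolCall) {
            $index = $toolCall['index'] ?? $i;
            // Create the slot on first sight, even if no id has arrived yet.
            $toolCalls[$index] ??= ['id' => null, 'function' => ['name' => null, 'arguments' => '']];
            // Null or missing fields never replace values already collected.
            $toolCalls[$index]['id'] = $toolCall['id'] ?? $toolCalls[$index]['id'];
            $toolCalls[$index]['function']['name'] = $toolCall['function']['name']
                ?? $toolCalls[$index]['function']['name'];
            // Fragments are appended in arrival order.
            $toolCalls[$index]['function']['arguments'] .= $toolCall['function']['arguments'] ?? '';
        }
    }
    ksort($toolCalls); // order calls by provider index

    return array_values($toolCalls);
}

// Hypothetical helper: wrap a single tool-call delta in a stream chunk.
function makeChunk(array $toolCall): array
{
    return ['choices' => [['index' => 0, 'delta' => ['tool_calls' => [$toolCall]]]]];
}

$chunks = [
    // Fragment for index 1 arrives before its start chunk.
    makeChunk(['index' => 1, 'function' => ['arguments' => '{"tz":']]),
    makeChunk(['index' => 0, 'id' => 'call_a', 'function' => ['name' => 'get_weather', 'arguments' => '{"city":"Paris"}']]),
    makeChunk(['index' => 1, 'id' => 'call_b', 'function' => ['name' => 'get_time', 'arguments' => '']]),
    // Trailing fragment with id, type and name explicitly null.
    makeChunk(['index' => 1, 'id' => null, 'type' => null, 'function' => ['name' => null, 'arguments' => '"CET"}']]),
];

$calls = accumulateToolCalls($chunks);
```

One trade-off of merging: a provider that legitimately re-sends a start chunk for an index would have its arguments appended rather than reset, so a real fix may want to decide that case explicitly.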
Environment
- Reproduced against main (225fb4a4) in src/platform/src/Bridge/Generic/Completions/CompletionsConversionTrait.php.
- Affects every bridge reusing the CompletionsConversionTrait against OpenAI-compatible providers (OpenAI, OpenAI-compatible / "generic", DeepSeek, Mistral, Cerebras, Scaleway, Docker Model Runner, etc.).
- Also affects non-streaming? No: convertChoice() operates on the fully assembled choice.message.tool_calls and is not affected. Streaming only.
I'll attach a captured real-provider event/delta trace as a comment for additional evidence.