Summary
When streaming a text completion, the provider's finish_reason (stop, length, content_filter) is never surfaced to the consumer of StreamResult. The stream simply ends after the last TextDelta, so a downstream consumer cannot tell why a streamed completion ended — most importantly, it cannot detect a truncated response where the model hit the output token limit (finish_reason: length).
The raw SSE response does carry this information; it is read internally and then discarded.
Root cause
In CompletionsConversionTrait::convertStream(), finish_reason is consumed only as a boolean for the incomplete-stream guard. The actual value is never emitted as a delta or attached to any result metadata (the excerpt below starts at line 43 of CompletionsConversionTrait.php at 225fb4a):
$sawFinishReason = false;
foreach ($result->getDataStream() as $data) {
    // ...
    // A non-null finish_reason on the leading choice marks the terminal content chunk.
    // It is null on every non-final chunk, and a trailing usage-only chunk has choices: [].
    if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
        $sawFinishReason = true; // <-- value is reduced to a bool and then dropped
    }
    // ...
    yield new TextDelta($data['choices'][0]['delta']['content']); // <-- last thing emitted for a text completion
}
// ...
if ($sawChunk && !$sawFinishReason) {
    throw new IncompleteStreamException('...');
}
The only terminal deltas yielded are TextDelta (text), ThinkingComplete (reasoning), ToolCallComplete (tools), and TokenUsage. None of them carry the finish reason for text completions.
For tool calls, finish_reason is at least consulted to emit ToolCallComplete (lines 171 to 173 of the same file at 225fb4a):
protected function isToolCallsStreamFinished(array $data): bool
{
    return isset($data['choices'][0]['finish_reason']) && 'tool_calls' === $data['choices'][0]['finish_reason'];
}
…but for stop, length, and content_filter there is no equivalent terminal signal. The reason is simply lost.
Reproduction
A minimal OpenAI-compatible SSE stream for a text completion that hits the output token limit:
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"}}]}
data: {"choices":[{"index":0,"delta":{"content":" world"}}]}
data: {"choices":[{"index":0,"delta":{},"finish_reason":"length"}]}
data: [DONE]
A consumer iterating convertStream() receives exactly:
TextDelta("Hello")
TextDelta(" world")
…and then the generator ends. There is no delta or metadata indicating finish_reason: length. The consumer cannot distinguish this truncated response from a clean stop.
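The loss can be illustrated without the library. The following is a minimal sketch, not the real implementation: TextDeltaStub and convertStreamSketch() are hypothetical stand-ins that copy only the logic quoted under "Root cause", and the guard that skips chunks without content is an assumption about the elided parts of the loop.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the library's TextDelta; only what the sketch needs.
final class TextDeltaStub
{
    public function __construct(public readonly string $text)
    {
    }
}

/**
 * Mimics the current logic: finish_reason is reduced to a bool and never yielded.
 *
 * @param iterable<array<string, mixed>> $chunks decoded SSE payloads
 */
function convertStreamSketch(iterable $chunks): \Generator
{
    $sawChunk = false;
    $sawFinishReason = false;

    foreach ($chunks as $data) {
        $sawChunk = true;

        if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
            $sawFinishReason = true; // the value itself is dropped here
        }

        // Assumption: chunks without text content yield nothing.
        $content = $data['choices'][0]['delta']['content'] ?? null;
        if (is_string($content) && '' !== $content) {
            yield new TextDeltaStub($content);
        }
    }

    if ($sawChunk && !$sawFinishReason) {
        throw new \RuntimeException('Stream ended without a finish_reason.');
    }
}

// The three data chunks from the reproduction, already JSON-decoded.
$chunks = [
    ['choices' => [['index' => 0, 'delta' => ['role' => 'assistant', 'content' => 'Hello']]]],
    ['choices' => [['index' => 0, 'delta' => ['content' => ' world']]]],
    ['choices' => [['index' => 0, 'delta' => [], 'finish_reason' => 'length']]],
];

$received = [];
foreach (convertStreamSketch($chunks) as $delta) {
    $received[] = $delta::class.'("'.$delta->text.'")';
}

echo implode("\n", $received), "\n";
```

The loop collects two entries and nothing else, so the "length" value seen on the third chunk never reaches the consumer.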
Expected vs actual
- Expected: consumers can read the streamed finish reason and detect length (truncation / max output tokens), content_filter, and stop.
- Actual: the finish reason is dropped for every non-tool_calls completion; consumers must guess or report null.
Impact
Downstream agents/adapters that need to produce an accurate stop reason (e.g. to decide whether to continue generation after a length truncation, or to flag filtered content) have no signal in streaming mode. Notably, the non-streaming path (convertChoice()) does branch on finish_reason (stop/length → TextResult, tool_calls → ToolCallResult), so streaming is strictly less informative than non-streaming here — yet even non-streaming does not attach the raw reason to the result object.
Suggested fix
Capture the finish_reason value (not just a bool) and emit it as terminal stream metadata for all streamed completions (text, tool calls, content_filter). A MetadataDelta already exists and is documented for exactly this case — "structured metadata that only becomes available during or at the end of a stream":
$finishReason = null;
foreach ($result->getDataStream() as $data) {
    // ...
    if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
        $finishReason = $data['choices'][0]['finish_reason'];
    }
    // ...existing yields...
}
if ($finishReason !== null) {
    yield new MetadataDelta('finish_reason', $finishReason);
}
This keeps ToolCallComplete behavior intact, preserves the IncompleteStreamException guard ($finishReason === null ⇔ current !$sawFinishReason), and finally lets streaming consumers tell stop from length/content_filter.
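From the consumer side, the proposed change might be used roughly as follows. This is a sketch under stated assumptions: TextPart, MetadataPart, convertStreamWithReason() and collectStream() are hypothetical names, the key/value shape of MetadataPart only mirrors the constructor call suggested above, and the real TextDelta and MetadataDelta APIs may differ. The incomplete-stream guard is left out for brevity.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins; the real TextDelta/MetadataDelta signatures may differ.
final class TextPart
{
    public function __construct(public readonly string $text)
    {
    }
}

final class MetadataPart
{
    public function __construct(public readonly string $key, public readonly mixed $value)
    {
    }
}

/** Keeps the finish_reason value and yields it once after the last chunk. */
function convertStreamWithReason(iterable $chunks): \Generator
{
    $finishReason = null;

    foreach ($chunks as $data) {
        // Null on non-final chunks and absent on usage-only chunks, so keep the last seen value.
        $finishReason = $data['choices'][0]['finish_reason'] ?? $finishReason;

        $content = $data['choices'][0]['delta']['content'] ?? null;
        if (is_string($content) && '' !== $content) {
            yield new TextPart($content);
        }
    }

    if (null !== $finishReason) {
        yield new MetadataPart('finish_reason', $finishReason);
    }
}

/** @return array{text: string, finishReason: ?string, truncated: bool} */
function collectStream(iterable $deltas): array
{
    $text = '';
    $finishReason = null;

    foreach ($deltas as $delta) {
        if ($delta instanceof TextPart) {
            $text .= $delta->text;
        } elseif ($delta instanceof MetadataPart && 'finish_reason' === $delta->key) {
            $finishReason = $delta->value;
        }
    }

    return ['text' => $text, 'finishReason' => $finishReason, 'truncated' => 'length' === $finishReason];
}

$chunks = [
    ['choices' => [['index' => 0, 'delta' => ['role' => 'assistant', 'content' => 'Hello']]]],
    ['choices' => [['index' => 0, 'delta' => ['content' => ' world']]]],
    ['choices' => [['index' => 0, 'delta' => [], 'finish_reason' => 'length']]],
    ['choices' => []], // trailing usage-only chunk
];

$result = collectStream(convertStreamWithReason($chunks));
var_dump($result['truncated']); // bool(true)
```

With the reason available, an agent could request a continuation when truncated is true and treat content_filter as a distinct outcome instead of guessing.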
Environment
- Reproduced against main (225fb4a4) in src/platform/src/Bridge/Generic/Completions/CompletionsConversionTrait.php (convertStream()).
- Affects every bridge reusing CompletionsConversionTrait against OpenAI-compatible providers (OpenAI, OpenAI-compatible / "generic", DeepSeek, Mistral, Cerebras, Scaleway, Docker Model Runner, etc.).
- Streaming only (for the missing-delta symptom); non-streaming branches on the reason but does not expose it either.