[Platform][Generic] Streamed finish_reason (stop/length/content_filter) is never surfaced to consumers #2194

Description

@ineersa

Summary

When streaming a text completion, the provider's finish_reason (stop, length, content_filter) is never surfaced to the consumer of StreamResult. The stream simply ends after the last TextDelta, so a downstream consumer cannot tell why a streamed completion ended — most importantly, it cannot detect a truncated response where the model hit the output token limit (finish_reason: length).

The raw SSE response does carry this information; it is read internally and then discarded.

Root cause

In CompletionsConversionTrait::convertStream(), finish_reason is consumed only as a boolean for the incomplete-stream guard. The actual value is never emitted as a delta or attached to any result metadata:

$sawFinishReason = false;

foreach ($result->getDataStream() as $data) {
    // ...
    // A non-null finish_reason on the leading choice marks the terminal content chunk.
    // It is null on every non-final chunk, and a trailing usage-only chunk has choices: [].
    if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
        $sawFinishReason = true;   // <-- value is reduced to a bool and then dropped
    }
    // ...
    yield new TextDelta($data['choices'][0]['delta']['content']);  // <-- last thing emitted for a text completion
}

// ...
if ($sawChunk && !$sawFinishReason) {
    throw new IncompleteStreamException('...');
}

The only terminal deltas yielded are TextDelta (text), ThinkingComplete (reasoning), ToolCallComplete (tools), and TokenUsage. None of them carry the finish reason for text completions.

For tool calls, finish_reason is at least consulted to emit ToolCallComplete:

protected function isToolCallsStreamFinished(array $data): bool
{
    return isset($data['choices'][0]['finish_reason']) && 'tool_calls' === $data['choices'][0]['finish_reason'];
}

…but for stop, length, and content_filter there is no equivalent terminal signal. The reason is simply lost.

Reproduction

A minimal OpenAI-compatible SSE stream for a text completion that hits the output token limit:

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"}}]}

data: {"choices":[{"index":0,"delta":{"content":" world"}}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"length"}]}

data: [DONE]

A consumer iterating convertStream() receives exactly:

  • TextDelta("Hello")
  • TextDelta(" world")

…and then the generator ends. There is no delta or metadata indicating finish_reason: length. The consumer cannot distinguish this truncated response from a clean stop.
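The reproduction can be approximated without the library. The sketch below is a simplified model of the loop shown under "Root cause", not the actual trait: TextDeltaStub and convertStreamSketch() are hypothetical stand-ins, and a plain RuntimeException replaces IncompleteStreamException. It feeds the SSE payload above through the loop and collects what a consumer would see.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the library's TextDelta; only the text payload matters here.
final class TextDeltaStub
{
    public function __construct(public readonly string $text)
    {
    }
}

/**
 * Simplified model of the loop described in this issue: finish_reason only flips a flag.
 *
 * @param iterable<array<string, mixed>> $chunks decoded SSE "data:" payloads
 */
function convertStreamSketch(iterable $chunks): \Generator
{
    $sawChunk = false;
    $sawFinishReason = false;

    foreach ($chunks as $data) {
        $sawChunk = true;

        if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
            $sawFinishReason = true; // the actual string ("length") is not kept
        }

        $content = $data['choices'][0]['delta']['content'] ?? null;
        if (\is_string($content) && '' !== $content) {
            yield new TextDeltaStub($content);
        }
    }

    if ($sawChunk && !$sawFinishReason) {
        throw new \RuntimeException('Stream ended without a finish_reason.');
    }
}

$sse = <<<'SSE'
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"}}]}

data: {"choices":[{"index":0,"delta":{"content":" world"}}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"length"}]}

data: [DONE]
SSE;

// Decode every "data:" line except the [DONE] sentinel.
$chunks = [];
foreach (preg_split('/\R/', $sse) as $line) {
    if (!str_starts_with($line, 'data: ') || 'data: [DONE]' === $line) {
        continue;
    }
    $chunks[] = json_decode(substr($line, 6), true, 512, \JSON_THROW_ON_ERROR);
}

// Collect what the consumer observes: text only, no trace of "length".
$received = [];
foreach (convertStreamSketch($chunks) as $delta) {
    $received[] = $delta->text;
}
```

After the loop, $received holds only the two text fragments. Nothing in it reflects that the third chunk carried finish_reason: length.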

Expected vs actual

  • Expected: consumers can read the streamed finish reason and detect length (truncation / max output tokens), content_filter, and stop.
  • Actual: the finish reason is dropped for every non-tool_calls completion; consumers must guess or report null.

Impact

Downstream agents/adapters that need to produce an accurate stop reason (e.g. to decide whether to continue generation after a length truncation, or to flag filtered content) have no signal in streaming mode. Notably, the non-streaming path (convertChoice()) does branch on finish_reason (stop/length → TextResult, tool_calls → ToolCallResult), so streaming is strictly less informative than non-streaming here. Even so, the non-streaming path does not attach the raw reason to the result object either.

Suggested fix

Capture the finish_reason value (not just a bool) and emit it as terminal stream metadata for all streamed completions (text, tool calls, content_filter). A MetadataDelta already exists and is documented for exactly this case — "structured metadata that only becomes available during or at the end of a stream":

$finishReason = null;

foreach ($result->getDataStream() as $data) {
    // ...
    if (null !== ($data['choices'][0]['finish_reason'] ?? null)) {
        $finishReason = $data['choices'][0]['finish_reason'];
    }
    // ...existing yields...
}

if (null !== $finishReason) {
    yield new MetadataDelta('finish_reason', $finishReason);
}

This keeps ToolCallComplete behavior intact, preserves the IncompleteStreamException guard ($finishReason === null ⇔ current !$sawFinishReason), and finally lets streaming consumers tell stop from length/content_filter.
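For illustration, here is how a consumer might use such a terminal delta. This is a sketch under assumptions, not the library's API: StubTextDelta, StubMetadataDelta, and convertStreamWithFinishReason() are hypothetical stand-ins, the key/value constructor simply mirrors the snippet above, and a plain RuntimeException replaces IncompleteStreamException.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins; the real TextDelta and MetadataDelta may expose different accessors.
final class StubTextDelta
{
    public function __construct(public readonly string $text)
    {
    }
}

final class StubMetadataDelta
{
    public function __construct(
        public readonly string $key,
        public readonly mixed $value,
    ) {
    }
}

/**
 * Simplified conversion loop that keeps the finish_reason value instead of a bool.
 *
 * @param iterable<array<string, mixed>> $chunks decoded SSE "data:" payloads
 */
function convertStreamWithFinishReason(iterable $chunks): \Generator
{
    $sawChunk = false;
    $finishReason = null;

    foreach ($chunks as $data) {
        $sawChunk = true;

        // Null on non-final chunks and absent on usage-only chunks: keep the last seen value.
        $finishReason = $data['choices'][0]['finish_reason'] ?? $finishReason;

        $content = $data['choices'][0]['delta']['content'] ?? null;
        if (\is_string($content) && '' !== $content) {
            yield new StubTextDelta($content);
        }
    }

    // Same incomplete-stream guard as before, expressed on the captured value.
    if ($sawChunk && null === $finishReason) {
        throw new \RuntimeException('Stream ended without a finish_reason.');
    }

    if (null !== $finishReason) {
        yield new StubMetadataDelta('finish_reason', $finishReason);
    }
}

// The decoded chunks from the reproduction in this issue.
$chunks = [
    ['choices' => [['index' => 0, 'delta' => ['role' => 'assistant', 'content' => 'Hello']]]],
    ['choices' => [['index' => 0, 'delta' => ['content' => ' world']]]],
    ['choices' => [['index' => 0, 'delta' => [], 'finish_reason' => 'length']]],
];

$text = '';
$finishReason = null;
foreach (convertStreamWithFinishReason($chunks) as $delta) {
    if ($delta instanceof StubTextDelta) {
        $text .= $delta->text;
    } elseif ($delta instanceof StubMetadataDelta && 'finish_reason' === $delta->key) {
        $finishReason = $delta->value;
    }
}

// A consumer can now decide whether to continue generation after a truncation.
$truncated = 'length' === $finishReason;
```

Emitting the reason as the last delta means consumers that ignore unknown delta types keep working, while those that care can branch on it once the text has been fully received.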

Environment

  • Reproduced against main (225fb4a4) in src/platform/src/Bridge/Generic/Completions/CompletionsConversionTrait.php (convertStream()).
  • Affects every bridge reusing CompletionsConversionTrait against OpenAI-compatible providers (OpenAI, OpenAI-compatible / "generic", DeepSeek, Mistral, Cerebras, Scaleway, Docker Model Runner, etc.).
  • Streaming only (for the missing-delta symptom); non-streaming branches on the reason but does not expose it either.
