
[ChatOpenAI] tool_calls empty after concat() when streaming with Responses API and reasoning output #9816

@satotek

Description

Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code reproduces the bug:

```typescript
import { ChatOpenAI } from '@langchain/openai';
import { tool } from '@langchain/core/tools';
import { z } from 'zod';
import type { AIMessageChunk } from '@langchain/core/messages';

const getWeather = tool(
  async ({ city }: { city: string }) => `${city} is sunny, 25°C`,
  {
    name: 'get_weather',
    description: 'Get weather for a city',
    schema: z.object({ city: z.string().describe('City name') }),
  },
);

const getTime = tool(
  async ({ timezone }: { timezone: string }) => `Current time in ${timezone}: 14:30`,
  {
    name: 'get_time',
    description: 'Get current time for a timezone',
    schema: z.object({ timezone: z.string().describe('Timezone name') }),
  },
);

// Bug occurs when model returns reasoning chunks (e.g., gpt-5-mini, gpt-5, gpt-5.2-chat)
const model = new ChatOpenAI({
  model: 'gpt-5-mini',
  streaming: true,
  useResponsesApi: true,
});

const modelWithTools = model.bindTools([getWeather, getTime], {
  tool_choice: 'auto',
  parallel_tool_calls: true,
});

const response = await modelWithTools.stream(
  'What is the weather in Tokyo and the current time in JST?'
);

const chunks: AIMessageChunk[] = [];
for await (const chunk of response) {
  console.log('Chunk tool_calls:', chunk.tool_calls);
  chunks.push(chunk);
}

const finalMessage = chunks.reduce((acc, chunk) => (acc ? acc.concat(chunk) : chunk));
console.log('Final tool_calls:', finalMessage.tool_calls);
// Expected: [{name: 'get_weather', ...}, {name: 'get_time', ...}]
// Actual: []
```

Error Message and Stack Trace (if applicable)

No error is thrown. The tool_calls array is silently empty after concat(), which causes LangGraph agents to hang indefinitely waiting for tool results.

Description

What I'm trying to do:
Use ChatOpenAI with useResponsesApi: true and parallel_tool_calls: true to stream responses with multiple tool calls.

Expected behavior:
After concatenating all streamed chunks, finalMessage.tool_calls should contain all tool calls (e.g., get_weather and get_time).

Actual behavior:

  • Individual chunks show tool_calls with data
  • After AIMessageChunk.concat(), tool_calls becomes an empty array []
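A quick way to see the failure is to compare the parsed tool_calls with the raw tool_call_chunks on the final message. This is a sketch against the repro above (tool_call_chunks is the unparsed counterpart that concat() accumulates):

```typescript
// After the reduce() in the repro: the raw chunks still hold the streamed
// names/arguments, but the parsed tool_calls array comes back empty.
console.log(finalMessage.tool_call_chunks); // includes id-less argument chunks
console.log(finalMessage.tool_calls);       // [] (the bug)
```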

Affected Models

The bug occurs when the model returns reasoning chunks (type: "reasoning") in the Responses API stream.

Confirmed affected (tested):

  • gpt-5
  • gpt-5-mini
  • gpt-5.2-chat

Not affected (no reasoning output):

  • gpt-4o
  • gpt-5.1
  • gpt-5.2

Test Results

Without Patch (Current Behavior)

| Model | Responses API / parallel | Responses API / non-parallel | Chat Completions / parallel | Chat Completions / non-parallel |
| --- | --- | --- | --- | --- |
| gpt-4o | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.1 | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.2 | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.2-chat | ❌ BUG | ✅ OK | ✅ OK | ✅ OK |
| gpt-5-mini | ❌ BUG | ✅ OK | ✅ OK | ✅ OK |
| gpt-5 | ❌ BUG | ✅ OK | ✅ OK | ✅ OK |

With Proposed Patch

| Model | Responses API / parallel | Responses API / non-parallel | Chat Completions / parallel | Chat Completions / non-parallel |
| --- | --- | --- | --- | --- |
| gpt-4o | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.1 | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.2 | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5.2-chat | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5-mini | ✅ OK | ✅ OK | ✅ OK | ✅ OK |
| gpt-5 | ✅ OK | ✅ OK | ✅ OK | ✅ OK |

Root Cause Analysis

When does the bug occur?

The bug occurs when the Responses API stream includes reasoning chunks (type: "reasoning") before function_call outputs. This shifts the output_index of subsequent items.

Normal event stream (no reasoning output):

```
response.output_item.added (function_call, output_index: 0)
response.function_call_arguments.delta (output_index: 0)
response.output_item.done (function_call)
```

Event stream with reasoning output:

```
response.output_item.added (reasoning, output_index: 0)  ← reasoning chunk first!
response.output_item.added (function_call, output_index: 1)
response.function_call_arguments.delta (output_index: 1)
response.output_item.done (function_call)
```

The Bug

In @langchain/openai/dist/converters/responses.js, the response.function_call_arguments.delta event handler creates tool_call_chunks without the id field:

```js
// Line ~385 in responses.js
else if (event.type === "response.function_call_arguments.delta" || event.type === "response.custom_tool_call_input.delta")
  tool_call_chunks.push({
    type: "tool_call_chunk",
    args: event.delta,
    index: event.output_index
    // NO id field!
  });
```

When AIMessageChunk.concat() merges chunks, it uses id to match and merge tool call chunks. Because the delta events produce chunks without an id, the merge fails and the final message's tool_calls array ends up empty.
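To make the mismatch concrete, here is a sketch of the two chunk shapes involved (field values are illustrative, not captured from a real stream):

```typescript
// Chunk from the response.output_item.added handler:
// carries name and id, but (typically) empty args at this point.
const addedChunk = {
  type: 'tool_call_chunk',
  name: 'get_weather',
  id: 'call_abc123', // illustrative call_id
  args: '',
  index: 1, // shifted to 1 because the reasoning item occupies index 0
};

// Chunk from the response.function_call_arguments.delta handler:
// args only, no id.
const deltaChunk = {
  type: 'tool_call_chunk',
  args: '{"city":"Tokyo"}',
  index: 1,
};
```

An id-based merge has nothing that ties the second shape back to the first.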


Proposed Fix

Change the function_call handler from response.output_item.added to response.output_item.done, and skip response.function_call_arguments.delta events for function_call:

Change 1: Use response.output_item.done instead of response.output_item.added

```diff
- else if (event.type === "response.output_item.added" && event.item.type === "function_call") {
+ else if (event.type === "response.output_item.done" && event.item.type === "function_call") {
    tool_call_chunks.push({
      type: "tool_call_chunk",
      name: event.item.name,
      args: event.item.arguments,
      id: event.item.call_id,
      index: event.output_index
    });
    additional_kwargs[_FUNCTION_CALL_IDS_MAP_KEY] = { [event.item.call_id]: event.item.id };
  }
```

Change 2: Skip response.function_call_arguments.delta events

```diff
- } else if (event.type === "response.function_call_arguments.delta" || event.type === "response.custom_tool_call_input.delta") tool_call_chunks.push({
+ } else if (event.type === "response.custom_tool_call_input.delta") tool_call_chunks.push({
    type: "tool_call_chunk",
    args: event.delta,
    index: event.output_index
  });
```

Why this works:

  • response.output_item.done contains the complete item, including call_id (which becomes the chunk's id)
  • Skipping the incremental delta events avoids emitting chunks without an id
  • One consequence: function_call arguments no longer stream incrementally; they arrive in full when the item completes
  • Tested against all models listed above: the patch fixes every affected model and does not change behavior for models that return no reasoning chunks
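Until a fix lands, a possible user-side workaround (my own sketch, not part of the proposed patch) is to rebuild the tool calls from the raw tool_call_chunks, grouping by index instead of id:

```typescript
import type { AIMessageChunk } from '@langchain/core/messages';

// Hypothetical helper: reassemble tool calls from raw chunks, keyed by the
// chunk index. Assumes the first chunk for each index carries name and id,
// and that later chunks for the same index only append argument text.
function toolCallsFromChunks(message: AIMessageChunk) {
  const byIndex = new Map<number, { name?: string; id?: string; args: string }>();
  for (const chunk of message.tool_call_chunks ?? []) {
    const entry = byIndex.get(chunk.index ?? 0) ?? { args: '' };
    entry.name ??= chunk.name ?? undefined;
    entry.id ??= chunk.id ?? undefined;
    entry.args += chunk.args ?? '';
    byIndex.set(chunk.index ?? 0, entry);
  }
  return [...byIndex.values()]
    .filter((entry) => entry.name !== undefined)
    .map((entry) => ({
      name: entry.name!,
      id: entry.id,
      args: JSON.parse(entry.args || '{}'),
    }));
}

// Usage with the repro above: toolCallsFromChunks(finalMessage)
// should recover both get_weather and get_time.
```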

System Info

  • pnpm: 10.25.0
  • Node.js: v24.12.0
  • Platform: Ubuntu 24.04.1
  • @langchain/openai: 1.2.2
  • @langchain/core: 1.1.13
