Description
Git provider
GitHub Cloud
System Info
- python: 3.13.9
- model: akila-agent-gpt-5-mini
- SDK: openai-agents-python 0.5.x (Agents/Python 0.5.1 in the User-Agent)
- Model: GPT‑5‑mini (Azure deployment, reasoning enabled)
- API: Responses API (POST /responses), invoked via the Agents SDK Runner.run_streamed
- Usage pattern: Agents SDK with:
  - Agent(...)
  - Runner.run_streamed(starting_agent=agent, input=user_message, context=..., session=session, ...)
- Session: custom history/session implementation (but the same issue reproduces even with a pure in-memory LimitedSession equivalent)
- Reasoning enabled via extra_body={"reasoning": {"effort": ""}}
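For reference, a minimal sketch of the call pattern above. The Azure client wiring is omitted, and the agent name, effort value, and SQLiteSession are illustrative stand-ins for my real setup (custom Redis-backed session, MCP tools, local data_query):

```python
# Minimal sketch of the call pattern; names and values here are stand-ins.
from agents import Agent, ModelSettings, Runner, SQLiteSession

agent = Agent(
    name="akila-agent",
    model="akila-agent-gpt-5-mini",  # Azure deployment name (client wiring omitted)
    model_settings=ModelSettings(
        extra_body={"reasoning": {"effort": "medium"}},  # effort value assumed
    ),
    # tools=[...],  # real setup: MCP tools + a local data_query tool
)

async def ask(question: str, session_id: str) -> str:
    session = SQLiteSession(session_id)  # stand-in for the custom session
    result = Runner.run_streamed(
        starting_agent=agent,
        input=question,
        session=session,
    )
    async for _event in result.stream_events():
        pass  # consume streaming events
    return result.final_output
```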
Bug details
When using the OpenAI Agents Python SDK with a reasoning-capable model (GPT‑5‑mini via Azure), I intermittently get the following error (low probability: roughly 1 failure in 100 requests):
openai.BadRequestError: Error code: 400 - {
"error": {
"message": "Item with id 'rs_0ed518991eb0c6d5006968b233a1688194bddad2466d867a7b' not found.",
"type": "invalid_request_error",
"param": "input",
"code": null
}
}
However, in the request body sent to /responses, this rs_... id is clearly present as a type: "reasoning" item in the input array.
In the logs I can see that other rs_... ids for the same conversation appear multiple times across the multi-step reasoning flow (both at creation and in later replays), but this specific rs_0ed5189...233a168...a7b appears only in the final /responses input, with no earlier “source” occurrence, which makes it look like a dangling reasoning item.
I suspect a bug in how the Agents SDK / Responses API replays or reconstructs reasoning items across turns/sessions, causing an “orphan” reasoning id to be sent in input without a corresponding valid definition in the server-side item graph.
Reproduction scenario
This is not a single-user interactive chat. Instead:
- I have an Excel file with ~60 questions;
- I start 3 threads / processes to run through the same list in parallel;
- Each question uses a different session_id (no shared session between questions);
- For each question, I call the same Agents SDK pipeline:
  - Agent + reasoning + tools (MCP tools + local data_query), with session-based history;
- Most requests succeed; occasionally 1–2 requests fail with the "Item with id 'rs_...' not found" error above.
Even when I switch to a pure in-memory session implementation (no Redis), this intermittent error still happens, which suggests the issue is not in my Redis code but somewhere in how history + reasoning items are composed by the SDK / Responses backend.
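A sketch of the harness (the file name, single-column layout, and the ask() helper from the earlier sketch are assumptions about my setup, not SDK code):

```python
# Repro harness sketch: ~60 questions from an Excel file, 3 worker threads,
# and a fresh session_id per question.
import asyncio
import uuid
from concurrent.futures import ThreadPoolExecutor

from openpyxl import load_workbook

rows = load_workbook("questions.xlsx").active.iter_rows(values_only=True)
questions = [row[0] for row in rows if row and row[0]]

def run_one(question: str) -> str:
    session_id = f"q-{uuid.uuid4().hex}"  # distinct session per question
    return asyncio.run(ask(question, session_id))  # ask() from the sketch above

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_one, questions))
# Out of ~60 runs, 1-2 intermittently fail with the 400 "Item ... not found" error.
```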
The error from the API is:
Item with id 'rs_0ed518991eb0c6d5006968b233a1688194bddad2466d867a7b' not found.
param: "input"
type: "invalid_request_error"
Observed pattern
- For other reasoning ids (e.g., rs_...6dcf00, rs_...daf42, rs_...311a448...), I can see:
  - They are first created in earlier steps;
  - Then they reappear in later /responses calls as part of the replayed history.
- For the specific failing id rs_0ed518991eb0c6d5006968b233a1688194bddad2466d867a7b:
  - It appears only in the final /responses input payload that fails;
  - There is no earlier log record of it being created by a successful response event.
- This suggests that, from the server’s perspective, this is a dangling reasoning item:
  - The client (SDK) sends it as a reasoning item in input,
  - But the server’s internal item graph has no record of a valid “source” for this id,
  - Hence the “Item with id 'rs_...' not found” error.
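This pattern comes from a simple scan of my own request/response logs, roughly like the following (the log file name and format are assumptions about my logging, not SDK output):

```python
# Scan the logs for every rs_... id and record the line numbers where it appears.
import re
from collections import defaultdict

occurrences: dict[str, list[int]] = defaultdict(list)
with open("responses_api.log") as f:
    for lineno, line in enumerate(f, 1):
        for rs_id in set(re.findall(r"rs_[0-9a-f]{40,}", line)):
            occurrences[rs_id].append(lineno)

for rs_id, lines in occurrences.items():
    # Healthy ids appear first in a response event (creation) and again in later
    # request inputs; the failing id appears exactly once, in the failing request.
    print(rs_id, "seen on lines", lines)
```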
The issue happens intermittently, more often when:
- Running multiple threads / users over the same question list (many concurrent sessions hitting the same agent logic);
- Each question uses a distinct session_id , so there is no intentional cross-session sharing.
What I’m asking for
- Diagnosis
  - Is this a known issue in how the Agents SDK / Responses API reconstructs reasoning items across turns/sessions?
  - Under what circumstances could a reasoning id appear in input without a corresponding server-side definition?
- Mitigation or guidance
  - Is there a recommended way (within the Agents SDK) to avoid producing such dangling reasoning items?
  - Are there any configuration options to:
    - disable internal use of previous_response_id, or
    - avoid replaying reasoning items in history (see the wrapper sketch at the end of this section), or
    - otherwise prevent invalid reasoning references, without fully turning off reasoning capabilities?
- Long term
  - Are there plans to make the API more robust here (e.g., either not generating these dangling references, or providing clearer diagnostics)?
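For the "avoid replaying reasoning items" option, this is the kind of workaround I have in mind. It is only a sketch, built on the assumption that reasoning items can be safely dropped from replayed history (which is exactly what I would like guidance on); NoReasoningReplaySession is a hypothetical wrapper, not existing SDK code:

```python
# Hypothetical wrapper: delegate to an inner Session but filter reasoning items
# on read, so no rs_... id is ever echoed back in a later request's input.
from agents.memory import Session  # protocol the wrapper mirrors

class NoReasoningReplaySession:
    """Delegates to an inner Session, dropping reasoning items from replay."""

    def __init__(self, inner: Session):
        self.inner = inner
        self.session_id = inner.session_id

    async def get_items(self, limit: int | None = None) -> list:
        items = await self.inner.get_items(limit)
        return [item for item in items if item.get("type") != "reasoning"]

    async def add_items(self, items: list) -> None:
        await self.inner.add_items(items)

    async def pop_item(self):
        return await self.inner.pop_item()

    async def clear_session(self) -> None:
        await self.inner.clear_session()
```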