[Bug] Model Reasoning Timeout and Process Absence #6078
-
📦 Deployment: Vercel
📌 Version: latest
💻 OS: Windows
🌐 Browser: Edge
🐛 Description: Deep reasoning models (such as Gemini 2.0 Flash Thinking, o1, etc.) do not show their reasoning process when tackling complex problems (like math questions) that require extended thinking, unlike DeepSeek-R1. And because of the long processing time (around a minute or more), they often return an error such as "No server response received." The reasoning process is visible when using these same models on the OpenAI or Gemini official websites.
📷 Reproduction steps: The test question was a math problem from China's 2025 postgraduate entrance exam. In testing, Gemini 2.0 Flash Thinking took 1-2 minutes to answer it, while o1 and DeepSeek-R1 required over ten minutes, sometimes timing out even on the official websites.
🚦 Expected result: We hope to see support for displaying the reasoning process of more deep reasoning models. Even if the full thinking process isn't shown, please extend the server wait timeout.
📝 Additional information: No response
Replies: 11 comments 1 reply
-
👀 @git268 Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible.
-
I found a similar closed issue related to the Gemini 2.0 Flash Thinking model. That issue was resolved by a pull request that added support for the Gemini flash thinking model [1]. Regarding your current issue, the lobe-chat framework does have a mechanism for displaying reasoning processes: models with reasoning capabilities are marked with a purple tag and a tooltip with a specific icon. However, I couldn't find specific information on extending server response times for these models or on handling long processing times. It might be beneficial to explore the integration settings with external AI model APIs, as the framework uses a streaming approach to handle responses, which can affect response times [4][5][6].

To continue talking to Dosu, mention @dosu.
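To illustrate the streaming point above: when the client consumes the response as a stream, tokens arrive as they are produced, but a long silent reasoning phase can still trip platform-level timeouts. Below is a minimal, hypothetical Python sketch (not lobe-chat's actual code; the URL and payload shapes are placeholders) of consuming an SSE-style completion stream with a generous read timeout:

```python
import httpx


def stream_completion(url: str, payload: dict, api_key: str) -> str:
    """Collect the data lines of a streamed SSE completion into one string."""
    headers = {"Authorization": f"Bearer {api_key}"}
    chunks = []
    # A long read timeout matters here: reasoning models can stay silent
    # for a while before the first token, even though the stream is open.
    with httpx.stream("POST", url, json=payload, headers=headers,
                      timeout=httpx.Timeout(300.0)) as response:
        for line in response.iter_lines():
            if line.startswith("data: ") and line != "data: [DONE]":
                chunks.append(line[len("data: "):])
    return "".join(chunks)
```

Even with streaming in place, a proxy or hosting platform that enforces a hard response deadline (see the Vercel discussion further down) can still cut the connection before the first token arrives.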
-
In the API, gemini 2.0 flash thinking does support enabling CoT and streaming back the chain of thought, but Google's own SDK currently doesn't support enabling it. Let's wait a while and see.
-
https://ai.google.dev/gemini-api/docs/thinking
According to the docs, even switching to the new google-genai SDK won't surface the thinking process at the moment; you only see an empty thought field. I've tried many questions here under Python, and response.thought has always been None.
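For reference, roughly what was tested, as a hedged sketch: the model name and field access follow the docs linked above, and the exact SDK surface is an assumption that may have changed between releases.

```python
# Hypothetical reproduction with the google-genai SDK: the thinking
# model answers, but no thought content comes back.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key
response = client.models.generate_content(
    model="gemini-2.0-flash-thinking-exp",  # assumed model id
    contents="Prove that the square root of 2 is irrational.",
)
for part in response.candidates[0].content.parts:
    # `thought` is expected to flag reasoning parts; in practice it
    # stays None, so only the final answer is ever visible.
    print(part.thought, part.text)
```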
-
What about GPT's o1 and o3 series, then? With o3-mini as provided by Microsoft/GitHub or by OpenAI, I get no thinking process either.
-
@git268 According to [some accounts](https://x.com/ai_FOR_SUCCCESS/status/1885935217745228014), even the so-called thinking process displayed on the OpenAI website is a reprocessed product. I haven't seen any method for chain-of-thought output in the OpenAI docs; we'll have to wait for them to ship this feature on the API side.
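Consistent with that, here is a hedged sketch of what the API currently returns for these models (using the OpenAI Python SDK; the model name and prompt are illustrative): only the final answer plus a count of reasoning tokens, never the reasoning text itself.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
completion = client.chat.completions.create(
    model="o1-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Which is larger, 9.11 or 9.9?"}],
)
print(completion.choices[0].message.content)  # final answer only
# The reasoning happens server-side; only its size is reported back:
print(completion.usage.completion_tokens_details.reasoning_tokens)
```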
-
-
![Image](https://github.com/user-attachments/assets/faa245d3-912a-4279-831b-b8958657458F)
-
As for the response timeout issue, it should be solvable by switching deployment methods. Vercel caps the maximum wait time at 25s.
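For example, self-hosting sidesteps the platform cap entirely. A minimal sketch, assuming lobe-chat's published Docker image and its default port; the environment variable value is a placeholder:

```bash
# Run lobe-chat on your own machine/server instead of Vercel, so no
# platform-imposed 25s response limit applies to long reasoning runs.
docker run -d --name lobe-chat -p 3210:3210 \
  -e OPENAI_API_KEY=sk-... \
  lobehub/lobe-chat
```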