Enhance streaming with first-chunk timeout and auto-retry logic #9
Merged
Wrap the streaming response iterator to enforce a configurable first_chunk_timeout (default 15s). If the server does not send any data within that window, raise FirstChunkTimeoutError so callers can distinguish a stalled stream from an ordinary connection timeout.
Catch FirstChunkTimeoutError during provider streaming and retry once before propagating. This handles transient cold-start delays from upstream LLM endpoints without failing the entire turn.
Remove the complex HA built-in intent pipeline with its multi-layer response classification. Now only Chinese input is matched against local yaml-driven intents; everything else falls through to LLM. Eliminates ~110 lines of branching logic that caused false positives.
…lures Remove blocking API validation at startup. Wrap intent sync, intent handler setup, service registration and each platform forward in independent try/except so a single subsystem failure no longer prevents the entire integration from loading. Also apply the same per-platform isolation to unload. Bump version to v2026.04.13.
…cal match" This reverts commit c21e727.
Re-add __init__.py with isolated try/except for intent sync, intent handlers, services and per-platform forward. Single subsystem failure no longer blocks the entire integration from loading.
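The first two commits above (first-chunk timeout, then a single retry) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes the stream is an asyncio async iterator, and the helper names with_first_chunk_timeout, collect_with_retry and open_stream are hypothetical. Only FirstChunkTimeoutError and first_chunk_timeout come from the commit messages.

```python
import asyncio
from typing import AsyncIterator, Callable, List, TypeVar

T = TypeVar("T")


class FirstChunkTimeoutError(Exception):
    """No chunk arrived within the first-chunk window."""


async def with_first_chunk_timeout(
    stream: AsyncIterator[T], first_chunk_timeout: float = 15.0
) -> AsyncIterator[T]:
    """Hypothetical wrapper: time-limit only the wait for the first chunk."""
    iterator = stream.__aiter__()
    try:
        first = await asyncio.wait_for(
            iterator.__anext__(), timeout=first_chunk_timeout
        )
    except asyncio.TimeoutError as err:
        raise FirstChunkTimeoutError(
            f"no data within {first_chunk_timeout}s"
        ) from err
    except StopAsyncIteration:
        return  # the stream ended without producing any chunk
    yield first
    # After the first chunk, later chunks are not subject to this timeout.
    async for chunk in iterator:
        yield chunk


async def collect_with_retry(
    open_stream: Callable[[], AsyncIterator[T]],
    first_chunk_timeout: float = 15.0,
    max_retries: int = 1,
) -> List[T]:
    """Hypothetical caller: reopen the stream once if the first chunk stalls."""
    for attempt in range(max_retries + 1):
        try:
            wrapped = with_first_chunk_timeout(open_stream(), first_chunk_timeout)
            return [chunk async for chunk in wrapped]
        except FirstChunkTimeoutError:
            if attempt == max_retries:
                raise  # retry budget exhausted, propagate to the caller
    raise AssertionError("unreachable")
```

Because the timeout can only fire before anything has been yielded, retrying under this design never replays chunks the caller has already seen.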
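1. First-chunk timeout detection for streaming responses [http.py]
Adds the FirstChunkTimeoutError exception type and a first-chunk timeout mechanism. For LLM streaming requests, a separate timeout window (15 seconds by default) applies until the first data chunk is received. If the upstream service returns no data within that window, FirstChunkTimeoutError is raised. It is kept distinct from an ordinary connection timeout so that upper layers can retry specifically for this case.

2. Automatic retry on first-chunk timeout for streaming calls [entity.py]
_async_run_provider_stream catches FirstChunkTimeoutError and retries once automatically. This covers upstream LLM cold starts and load-balancer scheduling delays, so a single stalled first chunk no longer raises an error and aborts the whole conversation turn. Retries are limited to one; if the retry also times out, the exception propagates as usual.

3. Fault-tolerant startup and per-module isolation [__init__.py]
Remove blocking API key validation at startup: async_setup_entry no longer calls validate_input, so network fluctuations or a temporarily unreachable API no longer prevent the entire integration from loading (ConfigEntryNotReady). Each entity instead validates on demand when it is actually used.
Independent try/except per subsystem: intent list sync, intent handler registration and service registration each catch their own exceptions. A failure in any one of them only logs a warning and does not block the steps that follow.
Per-platform loading: the batch call async_forward_entry_setups(entry, PLATFORMS) is replaced by loading each platform independently, so the failure of a single platform (e.g. sensor, tts, stt, conversation) does not affect the others.
Unload is isolated the same way: platforms are unloaded one by one, so an exception while unloading one platform does not leave the whole integration partially loaded.
Version bumped to v2026.04.13.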
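The isolation pattern described in section 3 could look roughly like the sketch below. It is deliberately framework-free so it runs on its own: run_isolated and setup_all are hypothetical names, and each step stands in for one subsystem or platform. In the real integration, a per-platform step would presumably wrap a call such as async_forward_entry_setups(entry, [platform]), but that wiring is an assumption and is not shown here.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict

_LOGGER = logging.getLogger(__name__)

# A step is any zero-argument coroutine function (hypothetical shape).
Step = Callable[[], Awaitable[None]]


async def run_isolated(name: str, step: Step) -> bool:
    """Run one setup step; log a warning on failure instead of raising."""
    try:
        await step()
    except Exception:  # broad on purpose: one failure must not stop the rest
        _LOGGER.warning("Setup step %s failed; continuing", name, exc_info=True)
        return False
    return True


async def setup_all(steps: Dict[str, Step]) -> Dict[str, bool]:
    """Run every step in order and report which ones succeeded."""
    results: Dict[str, bool] = {}
    for name, step in steps.items():
        results[name] = await run_isolated(name, step)
    return results
```

The same helper can be reused for unload by passing per-platform unload steps. Returning a per-step result rather than a single boolean lets the caller decide whether a partial load is acceptable.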