perf(cli): Optimize serve daemon startup by doudouOUC · Pull Request #5785 · QwenLM/qwen-code

doudouOUC · 2026-06-23T17:31:36Z

What this PR does

This PR makes qwen serve reach the HTTP listener earlier by adding a slim serve fast path, deferring interactive UI, React/Ink, full settings, web-shell, and ACP runtime work until after the listener is ready or until the relevant mode needs it. It also adds daemon startup observability so /daemon/status and stderr report process-to-listen timing, runQwenServe-to-listen timing, and ACP preheat state without changing the stdout listening contract.

Why it's needed

Cold-start latency for qwen serve was dominated by modules that are not needed before the daemon can bind a port, including interactive UI and full runtime setup. Moving those imports and runtime work off the pre-listen path improves daemon responsiveness while keeping the deeper first-session ACP initialization out of scope. The added timing fields make future regressions diagnosable from both local logs and /daemon/status.

Reviewer Test Plan

How to verify

Run cd packages/cli && npx vitest run src/serve/fast-path.test.ts src/serve/run-qwen-serve.test.ts src/serve/daemon-status.test.ts src/commands/serve.test.ts src/serve/daemon-logger.test.ts and expect all 137 tests to pass.
Run cd packages/cli && npx prettier --check src/serve/daemon-logger.ts src/serve/run-qwen-serve.ts src/serve/fast-path-settings.ts src/serve/run-qwen-serve.test.ts src/serve/fast-path.test.ts src/gemini.tsx src/ui/startInteractiveUI.tsx src/config/settings.ts src/config/environment.ts src/config/default-theme-names.ts and expect Prettier to report that all matched files use the configured style.
Run git diff --cached --check before commit, or git diff HEAD^..HEAD --check after commit, and expect no whitespace errors.
Optionally run node scripts/dev.js serve --port 0 --hostname 127.0.0.1 --no-open --no-web repeatedly and confirm the stdout line remains qwen serve listening on ... while stderr includes startup timing.

Evidence (Before & After)

Earlier local baseline for external command to listener ready was approximately 3.55s p50. After this change, a local 5-run sample of node scripts/dev.js serve --port 0 --hostname 127.0.0.1 --no-open --no-web produced runs [1068, 1246, 1092, 2364, 2729]ms, p50 1246ms; daemon stderr reported process-to-listen p50 around 383ms and runQwenServe-to-listen p50 around 68ms.
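The p50 quoted above is the median of the five runs. A minimal sketch of that arithmetic follows; `median` is a hypothetical helper written for illustration, not code from the PR.

```typescript
// Hypothetical helper (not from the PR): median of a numeric sample.
function median(samples: readonly number[]): number {
  if (samples.length === 0) throw new Error('empty sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd length: middle element; even length: mean of the two middle elements.
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// The five external-command-to-listener timings reported above, in ms.
const runsMs = [1068, 1246, 1092, 2364, 2729];
const p50Ms = median(runsMs);
const meanMs = runsMs.reduce((sum, ms) => sum + ms, 0) / runsMs.length;
```

With two of the five runs above 2.3 s, the mean (1699.8 ms) sits well above the median (1246 ms), which is why the median is the more robust summary for a sample this small.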

Tested on

OS	Status
🍏 macOS	✅ tested
🪟 Windows	⚠️ not tested
🐧 Linux	⚠️ not tested

Environment (optional)

macOS Darwin 25.4.0 arm64, Node v22.22.3, npm 10.9.8.

Risk & Scope

Main risk or tradeoff: the serve path now has a lightweight bootstrap app that delegates to the full runtime after listen, so the most important risk is preserving compatibility for embedded runQwenServe callers and for settings/runtime directory resolution.
Not validated / out of scope: full npm run typecheck is still blocked by an existing unrelated BaseTextInput.tsx issue where ink/dom and ink/components/CursorContext cannot be resolved and cursorCtx is inferred as unknown; this PR does not fix that TUI build blocker. First POST /session ACP deep optimization is also out of scope.
Breaking changes / migration notes: no intentional breaking changes. The stdout listening line is preserved; new startup data is additive in stderr and /daemon/status.

Linked Issues

N/A

Slim the qwen serve startup path by deferring interactive UI, runtime, web-shell, and settings-heavy imports until after the listener is ready. Add daemon startup timing, preheat status reporting, fast-path settings/env loading, and regression coverage for import boundaries and runtime directory behavior. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

Resolve runQwenServe conflicts after upstream serve and LSP changes while preserving the fast-path runtime split. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

qwen-code-ci-bot · 2026-06-23T23:07:03Z

Thanks for the PR!

Template looks good ✓ — all required sections present with bilingual content.

On direction: clearly aligned. Serve daemon cold-start latency is a real user problem: every extra second before the listener binds is pure overhead for CI/CD pipelines, scripting workflows, and IDE integrations that spawn qwen serve. Claude Code's own CHANGELOG shows multiple startup optimization fixes (120ms regression fix, startup-blocking settings fetch), confirming both projects invest here. The reported p50 improvement from ~3.55s to ~1.25s (a 65% reduction on dev.js) and our collaborator's built-dist verification showing a 4.2× speedup (1708ms → 409ms p50) are substantial and user-visible.

On approach: the scope is large (36 files, +6329/-1443) but after examining the diff, the changes form a cohesive unit — every file change serves the goal of breaking the import chain so the fast path avoids loading React/Ink, yargs, the full settings stack, and ACP runtime before the listener binds. The three-pillar structure (fast-path-argv for arg detection/parsing, fast-path-settings for lightweight env/settings loading, fast-path for orchestration) is clean. The deferred runtime pattern in runQwenServe with resolveOnListen is a well-designed extension point.

The fast-path arg parser is a parallel implementation of the yargs-based serve command parsing. Any new --flag added to the serve command now needs updating in both the yargs handler and the fast-path parser. This is a known tradeoff for the performance win — yargs is heavy — and the fallback-to-full-parser pattern means unrecognized flags just degrade to the normal path. The fast-path-settings.ts module (~721 lines) similarly reimplements portions of the environment bootstrapping and trust checking logic. The security-sensitive isWorkspaceTrustedFastPath correctly mirrors the full implementation's trust resolution, including symlink-aware path comparison and the system/user/workspace settings merge order.
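The fallback-to-full-parser pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `parseServeFastPath`, `ServeFastPathOptions`, and the exact set of recognized flags are hypothetical and do not reproduce the PR's fast-path-argv module. The four flags used are the ones that appear in the test-plan command.

```typescript
// Hypothetical sketch: names and the flag set are assumptions,
// not the PR's fast-path-argv implementation.
interface ServeFastPathOptions {
  port?: number;
  hostname?: string;
  open: boolean;
  web: boolean;
}

// Returns parsed options, or null to signal "fall back to the full parser".
function parseServeFastPath(
  argv: readonly string[],
): ServeFastPathOptions | null {
  if (argv[0] !== 'serve') return null;
  const options: ServeFastPathOptions = { open: true, web: true };
  for (let i = 1; i < argv.length; i++) {
    switch (argv[i]) {
      case '--port': {
        const value = argv[++i];
        if (value === undefined || !/^\d+$/.test(value)) return null;
        options.port = Number(value);
        break;
      }
      case '--hostname': {
        const value = argv[++i];
        if (value === undefined || value.startsWith('--')) return null;
        options.hostname = value;
        break;
      }
      case '--no-open':
        options.open = false;
        break;
      case '--no-web':
        options.web = false;
        break;
      default:
        // Unknown flag or positional: let the full yargs-based parser decide.
        return null;
    }
  }
  return options;
}

const parsed = parseServeFastPath([
  'serve', '--port', '0', '--hostname', '127.0.0.1', '--no-open', '--no-web',
]);
const fallback = parseServeFastPath(['serve', '--some-new-flag']);
```

Returning null for anything unrecognized keeps the slim parser conservative: a flag added only to the yargs handler costs startup time rather than correctness.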

Moving on to code review. 🔍

— Qwen Code · qwen3.7-max

wenshao · 2026-06-23T23:16:07Z

This PR currently has merge conflicts against main — GitHub reports it as CONFLICTING / DIRTY, so it can't be merged as-is. Could you please rebase onto the latest main, resolve the conflicts, and push the update? Once it's mergeable again we can move forward. Thanks!

wenshao

Follow-up review at 4fc1429 (model: claude-opus-4-8). I re-checked the 11 earlier /review findings against current HEAD: the Critical (fast path not observing runtimeReady) and the bridge-proxy prefer-const lint are confirmed fixed, and the env-key / qwen-dir duplication is genuinely deduped. 3 new suggestions inline below; no blockers.

wenshao

No new high-confidence findings on re-review at 41e268e. Downgraded from Approve to Comment: CI failing (review-pr, Test (windows-latest, Node 22.x), Test (ubuntu-latest, Node 22.x), CodeQL). All 15 R2 findings are low-confidence (terminal-only, not posted as inline comments). R1 inline comments (11) are on stale commits. 125/125 PR tests pass locally; full CLI suite 9,369/9,369 passes.

— qwen3.7-max via Qwen Code /review

wenshao · 2026-06-24T00:41:45Z

🔬 Local build & runtime verification — PR #5785 @ `4fc1429`

Built real CLIs from source for both this PR head and origin/main, then measured qwen serve cold-start head-to-head and exercised the daemon end-to-end. This is a runtime/perf verification (complementary to the earlier code reviews), intended as a merge reference.

Environment: macOS arm64 (Darwin 25.5.0), Node v22.22.2. Each worktree: clean npm ci + node scripts/build.js --cli-only (both exit 0). Latency measured on the tsc-built packages/cli/dist/index.js with a fresh temp workspace per run, 2 warm-up + 12 measured runs, sequential on a quiet CPU.

TL;DR — ✅ the startup-perf claim reproduces, no runtime regressions found.

1. Cold-start A/B (built `dist/index.js`, 12 runs each)

Metric	`main` (`a234860`)	PR-5785 (`4fc1429`)	Δ
external cmd → `listening` p50	1708 ms	409 ms	−76% · 4.2× faster
external min / max	1695 / 2041 ms	399 / 425 ms
self-reported `processToListenMs` p50	(field absent)	112 ms	new
self-reported `runQwenServeToListenMs` p50	(field absent)	34 ms	new

Variance is tiny on both sides (PR 399–425 ms). The ratio is even larger than the PR's own dev.js figures because the dist run removes the constant tsx transpile overhead that diluted them — same direction, stronger signal.
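For reference, the Δ column follows directly from the two p50 values. A short arithmetic sketch (variable names are illustrative only):

```typescript
// p50 values from the A/B table above, in milliseconds.
const mainP50Ms = 1708;
const prP50Ms = 409;

// Speedup factor: how many times faster the PR build reaches `listening`.
const speedup = mainP50Ms / prP50Ms;

// Relative reduction in time-to-listen, as a percentage.
const reductionPct = (1 - prP50Ms / mainP50Ms) * 100;

console.log(`${speedup.toFixed(1)}x faster, ${reductionPct.toFixed(0)}% lower`);
```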

2. Functional equivalence (PR head)

stdout contract preserved: qwen serve listening on http://127.0.0.1:<port> (mode=http-bridge, workspace=…)
New observability works — GET /daemon/status now returns a startup block:
```
"startup": { "processStartedAt": "…", "listenerReadyAt": "…",
             "processToListenMs": 159, "runQwenServeToListenMs": 46,
             "preheat": { "status": "running" } }
```
and stderr emits startup timing: processToListenMs=… runQwenServeToListenMs=…. The preheat state confirms ACP runtime is deferred and reported, not blocking listen.
GET /capabilities → HTTP 200.

3. Tests / format

Test-plan suites (fast-path, run-qwen-serve, daemon-status, commands/serve, daemon-logger): 146/146 passed (plan cited 137; head has more, all green).
prettier --check on the listed files: all match. git diff --check: no whitespace errors.

4. Notes for merge

Typecheck blocker is pre-existing & out of scope — confirmed: BaseTextInput.tsx (ink/dom, CursorContext) is identical on main and not touched by this PR. Not a regression.
CI was still pending at verification time (macOS/ubuntu/windows tests + CodeQL running) — let it go green before merge.
Methodology caveat: I measured the tsc dist (unbundled), not the esbuild production bundle — bundling is blocked in this env for both branches alike by @qwen-code/channel-qqbot (not a workspace package; only a lazy dynamic import, never hit by serve) and missing web-templates generated files. The A/B is apples-to-apples; the shipped single-file bundle should be at least as fast.

Verdict: runtime verification PASS — the ~4× faster pre-listen path reproduces cleanly, the new /daemon/status startup telemetry is correct, the stdout listening contract is intact, and the test/format gates are green. The only merge gates are CI completion and the documented pre-existing TUI typecheck blocker.

Verified locally by building real CLIs from source on both branches; A/B on built dist, by claude-opus-4-8 via Claude Code.

wenshao · 2026-06-24T00:42:52Z

@qwen-code /triage

qwen-code-ci-bot · 2026-06-24T00:56:06Z

Code Review

Independent proposal (before reading diff): To optimize qwen serve cold-start, I would have: (1) made the CLI entry point detect serve in argv before importing the full yargs/React/Ink stack, (2) created a lightweight settings loader that reads only the fields the serve path needs without pulling in the full settings module graph, (3) deferred the ACP runtime initialization until after the HTTP listener binds, and (4) added timing instrumentation so regressions are diagnosable. This is essentially what the PR does.

Comparison with the diff: The PR's approach matches and exceeds this proposal. The three-pillar design (fast-path-argv → fast-path-settings → fast-path) cleanly separates concerns. The resolveOnListen extension in runQwenServe is a well-designed deferred-runtime pattern that keeps the API backward-compatible for embedded callers. The index.ts refactor defers all heavy imports (gemini.tsx, @qwen-code/qwen-code-core) via dynamic import() behind the fast-path check — this is the key architectural change that makes the speedup possible.
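The deferral described above relies on heavy modules being loaded through dynamic import() only when a code path first needs them. A minimal sketch of a memoized lazy loader is shown here; `lazy` and `loadCore` are hypothetical names, and the stand-in async function takes the place of what in a real entry point would be a dynamic import such as `() => import('@qwen-code/qwen-code-core')`.

```typescript
// Hypothetical helper (not from the PR): run `load` at most once, on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= load());
}

// Stand-in for a heavy module so the sketch stays self-contained.
let loadCount = 0;
const loadCore = lazy(async () => {
  loadCount += 1;
  return { name: 'core' };
});

// Nothing is loaded until the first call.
const loadsBeforeUse = loadCount;

// Both calls share one in-flight promise, so the module loads once.
const first = loadCore();
const second = loadCore();
const sharedPromise = first === second;
```

Caching the promise rather than the resolved value means concurrent callers during startup do not trigger duplicate loads.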

Reuse check: The parallel arg parser in fast-path.ts and settings loader in fast-path-settings.ts are inherent to the approach — you can't import the yargs-based serve command or the full settings loader without pulling in the heavy module graph. The storage-paths-lite.ts module is small (~80 lines) and correctly reimplements only the path resolution logic from the full module. No reusable alternative exists.

Security review (isWorkspaceTrustedFastPath): Correctly mirrors the full implementation's trust resolution: reads trusted folders from the same path, handles trust/parent/do_not_trust levels, uses symlink-aware path comparison with realpathSync fallback, and applies the system/user/workspace merge order for the trust-check settings. No bypass vectors found.
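To make "symlink-aware path comparison with realpathSync fallback" concrete, here is one way such a check could look. It is a sketch under assumptions, not the PR's isWorkspaceTrustedFastPath: `canonicalize` and `isWithin` are hypothetical helpers, and the fallback strategy (resolve the deepest existing ancestor) is one possible choice.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: resolve symlinks on the deepest existing ancestor and
// re-append the non-existent tail, so missing paths still compare sensibly.
function canonicalize(p: string): string {
  const absolute = path.resolve(p);
  let current = absolute;
  const tail: string[] = [];
  for (;;) {
    try {
      return path.join(fs.realpathSync(current), ...tail);
    } catch {
      const parent = path.dirname(current);
      if (parent === current) return absolute; // reached the filesystem root
      tail.unshift(path.basename(current));
      current = parent;
    }
  }
}

// True when `child` is `parent` itself or is located underneath it.
function isWithin(child: string, parent: string): boolean {
  const rel = path.relative(canonicalize(parent), canonicalize(child));
  if (rel === '') return true;
  const escapes = rel === '..' || rel.startsWith('..' + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

const workspace = process.cwd();
const insideWorkspace = isWithin(path.join(workspace, 'missing', 'file.ts'), workspace);
const outsideWorkspace = isWithin(workspace, path.join(workspace, 'child'));
```

Comparing on `'..' + path.sep` rather than a bare `'..'` prefix avoids misclassifying a directory whose name merely starts with two dots.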

No critical blockers found. The architecture is sound, the security-sensitive trust checking is correctly reimplemented, and the fallback-to-full-parser pattern provides a safe degradation path.

Minor: fast-path.test.ts has 3 Prettier formatting issues (function signature wrapping, unnecessary parens, array element wrapping). Not a blocker but should be cleaned up before merge — CI's Prettier gate will catch this.

Tests / Format

 ✓ src/serve/fast-path.test.ts (89 tests) 7641ms
 ✓ src/serve/run-qwen-serve.test.ts (29 tests) 5247ms
 ✓ src/commands/serve.test.ts (9 tests) 672ms
 ✓ src/serve/daemon-logger.test.ts (21 tests) 192ms
 ✓ src/serve/daemon-status.test.ts (5 tests) 13ms

 Test Files  5 passed (5)
      Tests  153 passed (153)

git diff HEAD~1..HEAD --check: no whitespace errors.
prettier --check: 1 file with 3 minor formatting issues in fast-path.test.ts.

Real-Scenario Testing (tmux)

PR branch — `qwen serve --port 0 --hostname 127.0.0.1 --no-open --no-web`

$ npm run dev -- serve --port 0 --hostname 127.0.0.1 --no-open --no-web

> @qwen-code/qwen-code@0.19.1 dev
> node scripts/dev.js serve --port 0 --hostname 127.0.0.1 --no-open --no-web

qwen serve listening on http://127.0.0.1:38997 (mode=http-bridge, workspace=/home/runner/work/qwen-code/qwen-code/.qwen/worktrees/triage)

stderr (new observability):

qwen serve: daemon log → /home/runner/.qwen/debug/daemon/serve-5902-918fec44.log
qwen serve: bound to workspace "/home/runner/work/qwen-code/qwen-code/.qwen/worktrees/triage"
qwen serve: startup timing: processToListenMs=421 runQwenServeToListenMs=57
qwen serve: bearer auth disabled (loopback default). Set QWEN_SERVER_TOKEN to enable.
qwen serve: session reaper started (interval 60000ms, idle threshold 1800000ms)
qwen serve: /acp WebSocket transport enabled on /acp

`/daemon/status` response (new `startup` block)

{
  "v": 1,
  "status": "ok",
  "daemon": {
    "pid": 5902,
    "mode": "http-bridge",
    "startup": {
      "processStartedAt": "2026-06-24T00:54:18.852Z",
      "listenerReadyAt": "2026-06-24T00:54:19.273Z",
      "processToListenMs": 421,
      "runQwenServeToListenMs": 57,
      "preheat": { "status": "succeeded", "durationMs": 3050 }
    }
  }
}
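The timing field in this response can be cross-checked against its own timestamps. The sketch below types the startup block as inferred from the sample above (which fields are optional is an assumption) and derives the elapsed time from the two ISO timestamps.

```typescript
// Shape inferred from the sample response; optionality is an assumption.
interface DaemonStartup {
  processStartedAt: string;
  listenerReadyAt: string;
  processToListenMs: number;
  runQwenServeToListenMs: number;
  preheat: { status: string; durationMs?: number };
}

// Values copied from the /daemon/status response above.
const startup: DaemonStartup = {
  processStartedAt: '2026-06-24T00:54:18.852Z',
  listenerReadyAt: '2026-06-24T00:54:19.273Z',
  processToListenMs: 421,
  runQwenServeToListenMs: 57,
  preheat: { status: 'succeeded', durationMs: 3050 },
};

// Elapsed wall-clock time between process start and listener ready.
const derivedMs =
  Date.parse(startup.listenerReadyAt) - Date.parse(startup.processStartedAt);
const consistent = derivedMs === startup.processToListenMs;
```

Here the derived value is 421 ms, matching the reported processToListenMs, while the 3050 ms preheat duration shows how much work now happens after the listener is already up.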

`/capabilities` response

{ "v": 1, "protocolVersions": { "current": "v1", "supported": ["v1"] }, ... }

Verified: stdout contract preserved ✓, new startup timing in stderr ✓, /daemon/status startup block correct ✓, ACP preheat deferred and reported ✓.

(Note: main branch comparison was not possible in this CI environment due to missing dependencies — the collaborator's 12-run A/B on built dist already provides the authoritative comparison: 1708ms → 409ms p50, 4.2× faster.)

— Qwen Code · qwen3.7-max

qwen-code-ci-bot · 2026-06-24T00:56:36Z

Stepping back to look at the whole picture: this PR does exactly what it promises. The motivation is clear (serve daemon cold-start is a real pain point), the implementation is well-architected (three-pillar fast-path with graceful fallback), and the results are reproducible — our own tmux test confirmed the listener comes up in ~421ms process-to-listen with the stdout contract fully preserved, and a collaborator's rigorous 12-run A/B on built dist showed 4.2× faster.

Going back to my independent proposal: the PR's approach matches it point-for-point and adds thoughtful details I wouldn't have thought of — the resolveOnListen deferred-runtime pattern for backward compatibility with embedded runQwenServe callers, the daemon_runtime_starting/daemon_runtime_failed states in /daemon/status for the brief window between listener-ready and runtime-mounted, and the startup timing fields that make future regressions diagnosable without code changes.
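The window between listener-ready and runtime-mounted can be pictured as a small routing decision in the bootstrap app. The sketch below is an assumption-heavy illustration: the two daemon_runtime_* state names come from the text above, but the `ready` state, the set of paths the bootstrap app answers itself, and the `decide` function are hypothetical.

```typescript
// 'ready' and the routing rules are assumptions for illustration.
type RuntimeState =
  | 'daemon_runtime_starting'
  | 'daemon_runtime_failed'
  | 'ready';

// Paths the bootstrap app answers itself (assumed set).
const BOOTSTRAP_PATHS = new Set(['/health', '/daemon/status']);

type BootstrapDecision =
  | { kind: 'delegate' }
  | { kind: 'respond'; status: number; body: { status: string } };

function decide(state: RuntimeState, pathname: string): BootstrapDecision {
  // Once the full runtime is mounted, every request goes to it.
  if (state === 'ready') return { kind: 'delegate' };
  // Before that, only lightweight endpoints answer successfully.
  if (BOOTSTRAP_PATHS.has(pathname)) {
    return { kind: 'respond', status: 200, body: { status: state } };
  }
  // Everything else is temporarily unavailable.
  return { kind: 'respond', status: 503, body: { status: state } };
}

const duringStartup = decide('daemon_runtime_starting', '/session');
const afterMount = decide('ready', '/session');
```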

Is the code straightforward? Yes — each pillar has a clear single responsibility, the fast-path falls back to the full path on any unrecognized input, and there's no speculative abstraction. The parallel arg parser and settings loader are inherent to the approach (you can't import the heavy modules without defeating the purpose), and the ~721-line fast-path-settings.ts is the minimum viable reimplementation of the security-sensitive trust checking and environment bootstrapping.

Is every change necessary? Yes: all 36 files serve the goal. The UI/startup refactor files (gemini.tsx, startInteractiveUI.tsx, theme files) are needed to break the import chain. The test files provide thorough coverage (153 tests across 5 files, all passing). The only nit is cosmetic: 3 Prettier formatting spots in fast-path.test.ts that CI will catch.

The one thing worth flagging for future maintenance: the dual arg parser means any new --flag on the serve command needs two edits. The fallback-to-full-parser pattern mitigates this (unrecognized flags just degrade gracefully), but a comment in commands/serve.ts pointing to fast-path.ts would help future contributors. Not a blocker.

Verdict: approve. The PR ships a real, measurable user improvement with a clean architecture, thorough tests, and no security concerns.

— Qwen Code · qwen3.7-max

qwen-code-ci-bot

LGTM, looks ready to ship. ✅

wenshao · 2026-06-24T00:59:59Z

+function parsePositiveIntegerEnv(raw: string | undefined): number | undefined {
+  if (raw === undefined || raw === '') return undefined;
+  const trimmed = raw?.trim();
+  if (!trimmed || !/^\d+$/.test(trimmed)) return Number.NaN;


[Critical] parsePositiveIntegerEnv returns Number.NaN (not undefined) for malformed env values such as "abc", "-1", or "1.5". The callers use ??= to fill unset options (e.g., options.rateLimitPrompt ??= parsePositiveIntegerEnv(env['QWEN_SERVE_RATE_LIMIT_PROMPT'])). When the option is unset, ??= assigns whatever the helper returns, and because NaN is neither null nor undefined, no later nullish default replaces it. Downstream, getRateLimitValidationError rejects the value with "must be a positive integer" and the process exits via process.exit(1), even when the operator expects fallback to the built-in default.

Suggested change

-  if (!trimmed || !/^\d+$/.test(trimmed)) return Number.NaN;
+  if (!trimmed || !/^\d+$/.test(trimmed)) return undefined;

And similarly on line 116:

Suggested change

-  if (!trimmed || !/^\d+$/.test(trimmed)) return Number.NaN;
+  return Number.isSafeInteger(parsed) && parsed > 0 ? parsed : undefined;

— qwen3.7-max via Qwen Code /review
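The interaction between NaN and ??= in the finding above can be reproduced in isolation. The sketch below is a simplified re-creation, not the PR's code: both parser functions, `DEFAULT_LIMIT`, and its value are hypothetical.

```typescript
// Simplified re-creation of the reported pattern; not the PR's code.
function parseEnvReturningNaN(raw: string | undefined): number | undefined {
  if (raw === undefined || raw === '') return undefined;
  const trimmed = raw.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : Number.NaN;
}

// Variant following the suggested change: malformed input yields undefined.
function parseEnvReturningUndefined(raw: string | undefined): number | undefined {
  if (raw === undefined || raw === '') return undefined;
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) return undefined;
  const parsed = Number(trimmed);
  return Number.isSafeInteger(parsed) && parsed > 0 ? parsed : undefined;
}

const DEFAULT_LIMIT = 60; // hypothetical built-in default

// NaN is not nullish, so the second ??= never applies the default.
const before: { rateLimitPrompt?: number } = {};
before.rateLimitPrompt ??= parseEnvReturningNaN('abc');
before.rateLimitPrompt ??= DEFAULT_LIMIT;

// undefined is nullish, so the default is applied as the operator expects.
const after: { rateLimitPrompt?: number } = {};
after.rateLimitPrompt ??= parseEnvReturningUndefined('abc');
after.rateLimitPrompt ??= DEFAULT_LIMIT;
```

After this runs, `before.rateLimitPrompt` is NaN and would fail a positive-integer validation, while `after.rateLimitPrompt` holds the default.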

qwen-code-ci-bot · 2026-06-24T01:13:37Z

Qwen Code review did not complete successfully: Qwen review timed out after 85 minutes. See workflow logs.

wenshao · 2026-06-24T02:43:12Z

Build failure root cause — it's a `main` regression (SDK bundle over its size cap), not this PR

The failing Test legs on all three platforms (run 28069937514) die at the Install dependencies step. npm ci runs the prepare/build lifecycle (husky && npm run build && npm run bundle), and the build throws:

Error: Browser daemon SDK bundle is 128651 bytes; expected <= 128000
npm error workspace @qwen-code/sdk@0.1.8
npm error command sh -c node scripts/build.js

packages/sdk-typescript/scripts/build.js enforces a hard cap MAX_DAEMON_BROWSER_BUNDLE_BYTES = 125 * 1024 = 128000. The browser daemon bundle is now 128651 bytes — 651 over the limit, so the build aborts on every platform and npm ci fails.
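The arithmetic behind the failure is simple to check. The sketch below mirrors the error message from the log, but `checkBundleSize` is a hypothetical function; the actual contents of build.js are not shown in this thread.

```typescript
// Cap value quoted above: 125 KiB expressed in bytes.
const MAX_DAEMON_BROWSER_BUNDLE_BYTES = 125 * 1024;

// Hypothetical check: returns an error message, or null when within the cap.
function checkBundleSize(actualBytes: number, maxBytes: number): string | null {
  return actualBytes > maxBytes
    ? `Browser daemon SDK bundle is ${actualBytes} bytes; expected <= ${maxBytes}`
    : null;
}

const bundleBytes = 128651;
const failure = checkBundleSize(bundleBytes, MAX_DAEMON_BROWSER_BUNDLE_BYTES);
const overBy = bundleBytes - MAX_DAEMON_BROWSER_BUNDLE_BYTES;

// A cap of 126 KiB (129024 bytes) would accept the current bundle.
const withRaisedCap = checkBundleSize(bundleBytes, 126 * 1024);
```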

Why this is not your PR:

This PR doesn't touch packages/sdk-typescript at all — your 37 changed files are all under packages/cli/... + docs.
The byte-for-byte identical 128651 bytes failure also hits unrelated PRs — e.g. fix(cli): default to virtualized terminal history #5738 (CLI virtualized terminal history, likewise no SDK changes) produces the exact same number. Identical bundle size across unrelated PRs ⇒ it comes from main, not from either diff.
Historically this cap has been bumped in lockstep with bundle growth: 114 → 116 → 118 → 119 → 122 → 124 → 125 KiB across fix(serve): Add prompt queue backpressure #5033, feat(sdk,serve): DaemonTransport abstraction + ACP standard compliance #5040, feat(daemon): deliver web-shell mid-turn messages into the running turn #5175, feat(web-shell): add extension management #5398, feat(web-shell): support daemon session branching #5613, refactor(web-shell): restructure chat UI #5775. This time the bundle crossed 128000 but MAX_DAEMON_BROWSER_BUNDLE_BYTES was not bumped — so main's own build is red (a recent main run also fails the build step, e.g. 28066172473, there on a separate tsc --build error).

The fix belongs on main, not in this PR: either trim the daemon browser bundle back under 128000, or bump MAX_DAEMON_BROWSER_BUNDLE_BYTES (e.g. to 126 * 1024) in packages/sdk-typescript/scripts/build.js. Once main is green again, rebasing this PR will clear the failure — there is nothing to change here for it.

wenshao · 2026-06-24T03:25:27Z

+  }
+}
+
+export function setUpCloudShellEnvironment(envFilePath: string | null): void {


[Suggestion] setUpCloudShellEnvironment (the single-file-path variant) appears to be dead code. It is exported and re-exported from settings.ts, but grep across the entire packages/ tree finds zero call sites. The active variant is setUpCloudShellEnvironmentFromFiles (multi-file). Consider removing this dead export or marking it @deprecated if backward compatibility is needed for external consumers.

— qwen3.7-max via Qwen Code /review

qwen-code-ci-bot · 2026-06-24T04:12:46Z

Qwen Code review did not complete successfully: Qwen review timed out after 85 minutes. See workflow logs.

wenshao

Follow-up review of the latest changes (HEAD 095188d0) with claude-opus-4-8. The earlier Critical findings look addressed in 97bb0d29/095188d0 (fast path now awaits runtimeReady via waitForServeRuntimeOrExit; --experimental-lsp parsed; prefer-const lint clear; path-helper dedup into path-comparison.ts; trust-rule caching). Locally ESLint, tsc, and the serve/config test suites are all green here — the red Test legs appear to be the known unrelated main SDK-bundle-size regression rather than this PR. Two minor follow-up notes inline.

— claude-opus-4-8[1m] via Qwen Code /qreview

wenshao

No review findings on this incremental commit. Downgraded from Approve to Comment: CI failing (Test (macos-latest, Node 22.x), Test (windows-latest, Node 22.x), Test (ubuntu-latest, Node 22.x)).

R5 (095188d) addresses 4 prior R4 suggestions:

failRuntimeStartup now writes to daemonLog.error() (observability)
resolveRuntimeStartupTimeoutMs env var parsing is now tested (6 parametrized cases)
Bootstrap catch-all 503 handler for unknown routes is now tested
Rate limit env var activation and discard paths are now tested
Trust-rule evaluation refactored to pre-compute path variants (CachedTrustRule), avoiding redundant getPathComparisonVariants/fs.realpathSync calls per check

Two low-confidence suggestions for human review (terminal-only):

- buildTrustedFolderRules TRUST_PARENT branch has no fast-path-specific test (logic unchanged from pre-existing code tested in trustedFolders.test.ts)
- resolveRuntimeStartupTimeoutMs direct-override branches (positive, zero, negative, Infinity, NaN) are untested; only the env-var fallback path is covered
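
To make the untested branches concrete, the sketch below shows one way a direct-override table could look. Everything in it is an assumption for illustration: `resolveTimeoutMs`, `parseEnvMs`, the fall-through order (valid override, then env value, then default), and the 120s default are not taken from the PR's implementation of resolveRuntimeStartupTimeoutMs.

```typescript
// Assumed default, mirroring the 120s startup timeout mentioned in review.
const DEFAULT_TIMEOUT_MS = 120_000;

// Hypothetical env parser: digits only, positive safe integer, else undefined.
function parseEnvMs(raw: string | undefined): number | undefined {
  const trimmed = raw?.trim();
  if (!trimmed || !/^\d+$/.test(trimmed)) return undefined;
  const parsed = Number(trimmed);
  return Number.isSafeInteger(parsed) && parsed > 0 ? parsed : undefined;
}

// Hypothetical resolver: a finite positive override wins; anything else
// falls back to the env value, then to the default.
function resolveTimeoutMs(
  override: number | undefined,
  envValue: string | undefined,
): number {
  if (override !== undefined && Number.isFinite(override) && override > 0) {
    return override;
  }
  return parseEnvMs(envValue) ?? DEFAULT_TIMEOUT_MS;
}

// One row per direct-override branch named in the review note:
// [override, expected result when no env value is set].
const overrideCases: Array<[number, number]> = [
  [5_000, 5_000], // positive
  [0, DEFAULT_TIMEOUT_MS], // zero
  [-1, DEFAULT_TIMEOUT_MS], // negative
  [Number.POSITIVE_INFINITY, DEFAULT_TIMEOUT_MS], // Infinity
  [Number.NaN, DEFAULT_TIMEOUT_MS], // NaN
];
```

In a Vitest suite, a table like `overrideCases` could be passed to `it.each` so that each branch gets its own named case.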

— qwen3.7-max via Qwen Code /review

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

qwen-code-ci-bot

CI failures (Test on all platforms, CodeQL) are pre-existing — root cause is the SDK bundle exceeding its 128KB size cap on main, unrelated to this PR. ESLint and TypeScript pass cleanly; all 165 PR-specific tests pass.

— qwen3.7-max via Qwen Code /review

qwen-code-ci-bot · 2026-06-24T06:06:04Z

-            }
-            forceFlushMetrics()
+            (coreRuntimePromise
+              ? coreRuntimePromise.then((core) => core.forceFlushMetrics())


[Critical] The shutdown chain gates the force-close timer (line 2196) and all subsequent cleanup on coreRuntimePromise.then(core => core.forceFlushMetrics()). If the dynamic import('@qwen-code/qwen-code-core') stalls — realistic on cold NFS, partial deploys, or ESM resolution issues — this promise never resolves, the .catch().then() chain blocks, and the force-close timer is never armed. The daemon becomes a zombie: TCP listener still open, bootstrap /health returning 200, but every real route returning 503. Signal handlers (SIGINT/SIGTERM) enter this same blocked close promise, so they also hang.

The runtimeStartupTimeoutMs (120s) correctly rejects runtimeStarting, triggering waitForServeRuntimeOrExit → handle.close() — but handle.close() enters this same blocked chain, so the exit never happens.

Suggested change

-              ? coreRuntimePromise.then((core) => core.forceFlushMetrics())
+// Fire-and-forget: never block shutdown on core being loaded.
+void (coreRuntimePromise
+  ? coreRuntimePromise.then((core) => core.forceFlushMetrics())
+  : Promise.resolve()
+).catch((flushErr) => {
+  daemonLog.warn(
+    `pre-shutdown metrics flush failed: ${
+      flushErr instanceof Error ? flushErr.message : String(flushErr)
+    }`,
+  );
+});
+// Proceed immediately to runtime wait + cleanup + force timer.
+Promise.resolve()
+  .then(async () => {
+    await waitForRuntimeStartingForShutdown(runtimeStarting, daemonLog);

— qwen3.7-max via Qwen Code /review
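
The failure mode described in this comment can be reproduced in isolation. The following is a minimal sketch, not the daemon's shutdown code: `shutdownWithDeadline`, its parameters, and the `ShutdownResult` type are hypothetical. It shows the general pattern of detaching a flush that may never settle and arming the force deadline before awaiting cleanup.

```typescript
type ShutdownResult = "clean" | "forced";

// Hypothetical helper: the flush is detached, and the deadline is armed
// before any await, so a stalled promise cannot prevent the process from exiting.
async function shutdownWithDeadline(
  flush: Promise<unknown>,
  cleanup: () => Promise<void>,
  forceAfterMs: number,
): Promise<ShutdownResult> {
  // Fire-and-forget: failures are logged, the result is never awaited.
  void flush.catch((err) => {
    console.warn(
      `pre-shutdown flush failed: ${
        err instanceof Error ? err.message : String(err)
      }`,
    );
  });

  let timer: ReturnType<typeof setTimeout> | undefined;
  const forced = new Promise<ShutdownResult>((resolve) => {
    timer = setTimeout(() => resolve("forced"), forceAfterMs);
  });

  try {
    // Whichever settles first decides the outcome.
    return await Promise.race([
      cleanup().then((): ShutdownResult => "clean"),
      forced,
    ]);
  } finally {
    // Clear the deadline so a clean shutdown does not keep the event loop alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Called with a flush promise that never settles and a cleanup that completes, the helper resolves to "clean"; called with a cleanup that never completes, it resolves to "forced" once the deadline passes.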

doudouOUC and others added 3 commits June 24, 2026 01:30

chore: merge main into daemon startup PR

14473c4

Resolve runQwenServe conflicts after upstream serve and LSP changes while preserving the fast-path runtime split. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

test(cli): fix Windows serve fast-path CI

b40d4d8

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao requested changes Jun 23, 2026

Comment thread packages/cli/src/serve/fast-path.ts

Comment thread packages/cli/src/serve/fast-path-settings.ts Outdated

Comment thread packages/cli/src/serve/run-qwen-serve.ts

Comment thread packages/cli/src/serve/run-qwen-serve.ts

Comment thread packages/cli/index.ts

fix(cli): handle serve fast path runtime failures

5c17147

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao reviewed Jun 23, 2026

doudouOUC and others added 2 commits June 24, 2026 06:44

test(cli): cover serve startup edge cases

cf33bba

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

test(cli): cover resolve-on-listen failure cleanup

41e268e

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao marked this pull request as ready for review June 23, 2026 23:03

test(cli): satisfy bridge proxy lint

4a10488

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

merge: sync main into daemon startup fast path

891209d

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

Copilot AI review requested due to automatic review settings June 23, 2026 23:15

Copilot AI reviewed Jun 23, 2026

docs(cli): note serve fast path parser sync

4fc1429

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao reviewed Jun 23, 2026

Comment thread packages/cli/src/serve/fast-path-settings.ts Outdated

Comment thread packages/cli/src/serve/fast-path.ts Outdated

Comment thread packages/cli/src/serve/run-qwen-serve.ts

wenshao reviewed Jun 23, 2026

wenshao approved these changes Jun 23, 2026

wenshao requested changes Jun 23, 2026

Comment thread packages/cli/src/serve/run-qwen-serve.test.ts Outdated

Comment thread packages/cli/src/serve/fast-path-settings.ts

Comment thread packages/cli/src/serve/fast-path-settings.ts Outdated

Comment thread packages/cli/src/serve/fast-path.ts

doudouOUC requested review from qwen-code-dev-bot and tanzhenxin June 23, 2026 23:44

wenshao requested changes Jun 23, 2026

Comment thread packages/cli/src/serve/fast-path-settings.ts Outdated

Comment thread packages/cli/src/serve/fast-path.ts

doudouOUC and others added 2 commits June 24, 2026 08:05

fix(cli): address serve fast path review feedback

0f59c13

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

test(cli): guard serve fast path option parity

278581e

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

doudouOUC requested a review from wenshao June 24, 2026 00:33

wenshao approved these changes Jun 24, 2026

qwen-code-ci-bot previously approved these changes Jun 24, 2026

wenshao requested changes Jun 24, 2026

doudouOUC and others added 2 commits June 24, 2026 09:54

codex: address PR review feedback (QwenLM#5785)

97bb0d2

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

merge: sync origin/main into daemon startup fast path

79fb09b

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

doudouOUC dismissed qwen-code-ci-bot’s stale review via 79fb09b June 24, 2026 01:59

This was referenced Jun 24, 2026

fix(cli): default to virtualized terminal history #5738

Open

fix(cli): restore saved custom model IDs when re-entering the auth wizard #5654

Merged

wenshao reviewed Jun 24, 2026

codex: address PR review feedback (QwenLM#5785)

095188d

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao reviewed Jun 24, 2026

Comment thread packages/cli/src/serve/fast-path-settings.ts

Comment thread packages/cli/src/serve/fast-path.ts

wenshao reviewed Jun 24, 2026

codex: address PR review feedback (QwenLM#5785)

677d6a9

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

wenshao approved these changes Jun 24, 2026

qwen-code-ci-bot requested changes Jun 24, 2026

Conversation

doudouOUC commented Jun 23, 2026

What this PR does

Why it's needed

Reviewer Test Plan

How to verify

Evidence (Before & After)

Tested on

Environment (optional)

Risk & Scope

Linked Issues

qwen-code-ci-bot commented Jun 23, 2026 (edited)

Copilot AI left a comment

wenshao commented Jun 23, 2026

wenshao left a comment

wenshao left a comment

wenshao commented Jun 24, 2026

🔬 Local build & runtime verification — PR #5785 @ 4fc1429

TL;DR — ✅ the startup-perf claim reproduces, no runtime regressions found.

1. Cold-start A/B (built dist/index.js, 12 runs each)

2. Functional equivalence (PR head)

3. Tests / format

4. Notes for merge

wenshao commented Jun 24, 2026

qwen-code-ci-bot commented Jun 24, 2026

Code Review

Tests / Format

Real-Scenario Testing (tmux)

PR branch — qwen serve --port 0 --hostname 127.0.0.1 --no-open --no-web

/daemon/status response (new startup block)

/capabilities response

Build failure root cause — it's a `main` regression (SDK bundle over its size cap), not this PR
