Perf worker #6861
Pull request overview
Introduces a configurable, long-lived worker pool for file parsing / HTML→Markdown / text chunking to improve performance and stability under concurrency, alongside related env/template/documentation updates.
Changes:
- Add per-task timeout and per-worker recycle thresholds to the worker pool, and switch key worker entrypoints to an id-based request/response protocol.
- Add new env knobs for worker pool sizing/timeouts and reorganize .env.template / docker-compose env blocks accordingly.
- Add Vitest coverage for worker dispatch + a “real spawn” readFile integration test (build-artifact dependent).
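The id-based request/response protocol with a per-task timeout can be sketched roughly as follows. This is only an illustration: the message shapes, the `WorkerLike` interface, and the `Dispatcher` class are hypothetical names, not the PR's actual `WorkerPool` API. The idea is that each request carries an `id`, each reply echoes it, and a timer rejects the task if no reply arrives in time.

```typescript
// Hypothetical message shapes; only the presence of an `id` comes from the PR summary.
type WorkerRequest<T> = { id: string; data: T };
type WorkerResponse<R> =
  | { id: string; type: 'success'; data: R }
  | { id: string; type: 'error'; message: string };

// Minimal stand-in for a worker handle (assumption, not Node's Worker class).
interface WorkerLike<T, R> {
  postMessage(msg: WorkerRequest<T>): void;
  onMessage(handler: (msg: WorkerResponse<R>) => void): void;
}

type Pending<R> = {
  resolve: (value: R) => void;
  reject: (error: Error) => void;
  timer: ReturnType<typeof setTimeout>;
};

class Dispatcher<T, R> {
  private pending = new Map<string, Pending<R>>();
  private nextId = 0;
  private worker: WorkerLike<T, R>;
  private timeoutMs: number;

  constructor(worker: WorkerLike<T, R>, timeoutMs: number) {
    this.worker = worker;
    this.timeoutMs = timeoutMs;
    worker.onMessage((msg) => {
      const task = this.pending.get(msg.id);
      if (!task) return; // Reply for a task that already timed out: ignore it.
      clearTimeout(task.timer);
      this.pending.delete(msg.id);
      if (msg.type === 'success') task.resolve(msg.data);
      else task.reject(new Error(msg.message));
    });
  }

  // Send one task and resolve with the reply that carries the same id.
  run(data: T): Promise<R> {
    const id = String(this.nextId++);
    return new Promise<R>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.pending.delete(id);
        reject(new Error(`Worker task ${id} timed out after ${this.timeoutMs}ms`));
      }, this.timeoutMs);
      this.pending.set(id, { resolve, reject, timer });
      this.worker.postMessage({ id, data });
    });
  }
}
```

Matching replies by `id` lets one long-lived worker serve several in-flight tasks without mixing up their results, which is what makes pooling safe under concurrency.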
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| projects/app/.env.template | Adds worker concurrency/timeout envs and reorganizes template sections. |
| packages/service/worker/utils.ts | Enhances WorkerPool with task timeout + maxTasksPerWorker recycling; extends controller props. |
| packages/service/worker/text2Chunks/index.ts | Updates worker thread protocol to include id and explicit success/error messages. |
| packages/service/worker/readFile/index.ts | Updates readFile worker to id protocol and simplifies message handling. |
| packages/service/worker/htmlStr2Md/index.ts | Updates htmlStr2Md worker to id protocol with try/catch response. |
| packages/service/worker/function.ts | Switches to pooled worker controller usage and adds env-configured pool sizing/timeouts. |
| packages/service/test/worker/readFile/integration.test.ts | Adds build-artifact-based real worker spawn integration tests for readFile pool behavior. |
| packages/service/test/worker/function.test.ts | Adds unit tests for worker function dispatch/config wiring and SharedArrayBuffer behavior. |
| packages/service/env.ts | Adds/reshapes env schema, including new worker pool envs and many defaults. |
| packages/service/common/string/utils.ts | Switches htmlToMarkdown to use worker pool controller with env-configured sizing/timeouts. |
| packages/global/common/system/types/index.ts | Minor comment update on tokenWorkers constraints. |
| document/data/doc-last-modified.json | Updates doc timestamps for newly/edited docs. |
| document/content/self-host/upgrading/4-15/4150.mdx | Documents new worker pool env variables and release notes line edits. |
| deploy/templates/docker-compose.prod.yml | Updates SYNC_INDEX format and removes some env entries (now defaulted elsewhere) + template reference comment. |
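The maxTasksPerWorker recycling mentioned for packages/service/worker/utils.ts could look something like the sketch below. `RecyclingSlot`, `Recyclable`, `acquire`, and `release` are hypothetical names chosen for illustration; the real WorkerPool internals are not shown in this thread. The sketch tracks how many tasks a worker has run and swaps in a fresh worker once the threshold is reached.

```typescript
// Stand-in for a worker thread; only `terminate` is needed for this sketch.
interface Recyclable {
  terminate(): void;
}

class RecyclingSlot<W extends Recyclable> {
  private worker: W;
  private tasksRun = 0;
  private readonly create: () => W;
  private readonly maxTasksPerWorker: number;

  constructor(create: () => W, maxTasksPerWorker: number) {
    this.create = create;
    this.maxTasksPerWorker = maxTasksPerWorker;
    this.worker = create();
  }

  // Borrow the current worker for one task and count the use.
  acquire(): W {
    this.tasksRun += 1;
    return this.worker;
  }

  // Call when the task finishes; replaces the worker once the threshold is reached.
  release(): void {
    if (this.tasksRun >= this.maxTasksPerWorker) {
      this.worker.terminate();
      this.worker = this.create();
      this.tasksRun = 0;
    }
  }
}
```

Recycling after a fixed number of tasks is a common way to bound memory growth in long-lived workers, at the cost of an occasional respawn.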
c121914yu
left a comment
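PR Review: Perf worker
📋 Requirements understanding
This PR mainly switches file parsing, HTML-to-Markdown conversion, and text chunking to a reused WorkerPool, and adds configurable worker counts and timeouts. It also extends packages/service/env.ts to manage more environment variables through a schema, and adjusts the deployment templates and docs.
🧪 Logic verification
I focused on verifying the following paths:
- Concurrent worker reuse for file parsing: the new mock unit tests cover the SharedArrayBuffer wrapping in readRawContentFromBuffer and the pool configuration; the targeted tests pass.
- HTML to Markdown: the call has moved from runWorker to getWorkerController; the mock unit tests cover null and normal HTML; the targeted tests pass.
- Real readFile worker: the new integration tests depend on projects/app/worker/readFile.js; the current worktree has not been built, so that group was skipped.
- Environment variable compatibility: the schema/default changes in env.ts are not fully synced to runtime code that still reads process.env directly, which causes regressions.
⚠️ Issue summary
🔴 Critical issues (2, must fix)
- FILE_TOKEN_KEY must not have a public default value; otherwise instances without a configured key will sign and verify file tokens with a fixed JWT secret.
- After the SYNC_INDEX template was changed to a boolean, the existing Mongo initialization still only recognizes the string '0', so a user who sets false still gets index syncing.
🟡 Suggested improvements (0)
🟢 Optional optimizations (0)
🚀 Review conclusion
Changes required. Because I am the PR author's account, GitHub does not allow me to request changes, so I am submitting the blocking feedback as a comment review.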
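The SYNC_INDEX regression can be illustrated with a small sketch. Both functions are hypothetical: `legacySyncEnabled` models the "only the string '0' disables syncing" behavior described in the review, and `parseBoolEnv` is one possible stricter parser, not the schema actually used in packages/service/env.ts.

```typescript
// Assumed shape of the legacy check: anything other than '0' counts as enabled.
function legacySyncEnabled(raw: string | undefined): boolean {
  return raw !== '0';
}

// Hypothetical stricter parser that understands both numeric and boolean spellings.
function parseBoolEnv(raw: string | undefined, defaultValue: boolean): boolean {
  if (raw === undefined || raw.trim() === '') return defaultValue;
  const value = raw.trim().toLowerCase();
  if (['0', 'false', 'no', 'off'].includes(value)) return false;
  if (['1', 'true', 'yes', 'on'].includes(value)) return true;
  // Reject unknown spellings instead of silently guessing.
  throw new Error(`Invalid boolean env value: ${raw}`);
}
```

With the legacy check, `SYNC_INDEX=false` is still treated as enabled, which is the regression called out in the review; the stricter parser returns `false` for that value.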
c121914yu
left a comment
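PR Review: Perf worker (re-review)
The two blocking points from the previous round are themselves fixed:
- FILE_TOKEN_KEY no longer has a public default value and is now required.
- The main app's Mongo index sync now reads env.SYNC_INDEX, so SYNC_INDEX=false is no longer treated as enabled.
The re-review still found 2 issues that need further work:
🔴 projects/marketplace/src/service/mongo/index.ts imports @fastgpt/service/env in order to read SYNC_INDEX, which drags the main app's required secrets into marketplace. Flagged in the line-level comment.
🔴 packages/service/common/secret/constants.ts is still:
export const AES256_SECRET_KEY = process.env.AES256_SECRET_KEY || 'fastgptkey';
Although AES256_SECRET_KEY is now required in env.ts, the actual encryption/decryption path still keeps the public default. Any path that only goes through the secret utilities, without first triggering env.ts validation, can still encrypt and decrypt with a fixed key. I suggest reading from env.AES256_SECRET_KEY so that missing configuration fails closed.
Verification:
- corepack pnpm -C packages/service exec vitest run -c vitest.config.ts test/common/s3/token.test.ts test/common/secret/aes256gcm.test.ts test/worker/function.test.ts passed, 38 passed.
- env -u FILE_TOKEN_KEY -u AES256_SECRET_KEY corepack pnpm --filter @fastgpt/marketplace build was blocked by another TypeScript error on the current branch: 'none' in reasoningEffortList does not match ChatCompletionReasoningEffort, so the build never reached env validation.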
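A minimal sketch of what "fail closed" means for a secret lookup, under stated assumptions: `requireSecret` and `EnvSource` are hypothetical names, and the actual fix suggested in the review is to read `env.AES256_SECRET_KEY` from the validated env module rather than to add a helper like this one.

```typescript
type EnvSource = Record<string, string | undefined>;

// Fail open (the pattern being criticized): a missing key silently becomes a shared default.
function secretWithFallback(source: EnvSource, name: string, fallback: string): string {
  return source[name] || fallback;
}

// Fail closed: a missing or empty key throws, so no crypto runs with a well-known secret.
function requireSecret(source: EnvSource, name: string): string {
  const value = source[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}
```

In practice the first argument would typically be `process.env`; passing it in explicitly keeps the sketch easy to test.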
import type { Model, Schema } from 'mongoose';
import { Mongoose } from 'mongoose';
import { getLogger, LogCategories } from '../logger';
import { env } from '@fastgpt/service/env';
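🔴 @fastgpt/service/env should not be imported directly here. env.ts now strictly validates the main app's FILE_TOKEN_KEY / AES256_SECRET_KEY, but marketplace is a standalone project: projects/marketplace/.env.template only declares S3_PREFIX, AUTH_TOKEN, and MONGODB_URI, and the Dockerfile does not inject these FastGPT app secrets either. As a result, marketplace is forced to configure the main app's file/AES secrets just to read SYNC_INDEX, and building or running this module fails because of unrelated missing env values. I suggest parsing SYNC_INDEX locally in marketplace, or extracting a shared bool/env helper that does not depend on the main app's secrets.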
In CI, … failed. Key stack from the failure log: … Need to …
c121914yu
left a comment
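I went through another re-review; the earlier core issues have been fixed:
- After FILE_TOKEN_KEY / AES256_SECRET_KEY became required, the test configuration now covers the service/app/admin/root vitest configs.
- The actual AES256_SECRET_KEY encryption/decryption path now goes through env.AES256_SECRET_KEY, and the old fastgptkey fallback has been removed.
- Marketplace no longer imports the full service env just for SYNC_INDEX.
- The Next production build phase skips env validation via NEXT_PHASE=phase-production-build; at runtime the secrets are still strictly validated.
One deployment-side issue still needs fixing: the Helm chart's default Secret has not yet been given AES256_SECRET_KEY. deploy/helm/fastgpt/templates/secret-env.yaml currently only has FILE_TOKEN_KEY: "filetoken", but in this PR packages/service/env.ts already requires AES256_SECRET_KEY. A FastGPT Pod installed from the Helm chart will fail at startup with Invalid environment variables. Please check: AES256_SECRET_KEY.
I suggest adding AES256_SECRET_KEY to deploy/helm/fastgpt/templates/secret-env.yaml, ideally kept consistent with the compose/template docs, so that the Helm path is not broken by this tightening of the configuration.
Verification:
corepack pnpm -C packages/service exec vitest run -c vitest.config.ts test/common/s3/token.test.ts test/common/secret/aes256gcm.test.ts test/worker/function.test.ts
# 3 files / 38 tests passed
With NEXT_PHASE=phase-production-build and no FILE_TOKEN_KEY/AES256_SECRET_KEY, importing packages/service/env succeeds.
In a normal runtime without FILE_TOKEN_KEY/AES256_SECRET_KEY, it still fails as expected.
Some GitHub checks are still pending; I did not wait for all of them to finish.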
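The build-phase behavior described above can be sketched as follows. This is an assumption-laden illustration, not the code in packages/service/env.ts: `validateEnv`, `EnvSource`, and `REQUIRED_KEYS` are hypothetical names, and only the `NEXT_PHASE=phase-production-build` check, the two key names, and the error text come from this thread.

```typescript
type EnvSource = Record<string, string | undefined>;

// Keys this thread says are required at runtime.
const REQUIRED_KEYS = ['FILE_TOKEN_KEY', 'AES256_SECRET_KEY'];

function validateEnv(source: EnvSource): { skipped: boolean } {
  // During `next build` the secrets are not needed, so validation is skipped.
  if (source.NEXT_PHASE === 'phase-production-build') {
    return { skipped: true };
  }
  // At runtime, any missing or empty required key is a hard failure.
  const missing = REQUIRED_KEYS.filter((key) => !source[key]);
  if (missing.length > 0) {
    throw new Error(`Invalid environment variables. Please check: ${missing.join(', ')}`);
  }
  return { skipped: false };
}
```

Under this sketch, a Helm deployment that sets only FILE_TOKEN_KEY would pass the build phase but throw at Pod startup, which matches the failure mode described for the chart's default Secret.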