【Hackathon 10th Spring No.46】Add Windows platform guards for Python runtime (Part 3/3)#7503
Thanks for your contribution!
PaddlePaddle-bot
left a comment
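🤖 AI Code Review | 2026-04-20 14:06 CST

## 📋 Review Summary

PR overview: Adds Windows platform compatibility guards to the FastDeploy Python runtime modules, replacing POSIX-only primitives (/dev/shm, os.setsid, os.killpg, fork).
Scope of changes: engine/, cache_manager/, inter_communicator/, worker/, eplb/
Impact tags: Engine KVCache

### 📝 PR Convention Check

The [Build] tag in the PR title is not in the official tag list. Since this PR mainly adapts the runtime for Windows platform compatibility, [Feature] or [Others] is recommended.

Suggested title (can be copied directly):
- [Feature] Add Windows Python runtime guards for platform compatibility

### Issues

| Severity | File | Summary |
|------|------|------|
| 🟡 Suggestion | Multiple files | /dev/shm replacement is incomplete; about 8 files are missed |
| 🟡 Suggestion | common_engine.py | The _shm_base ternary expression is repeated 10+ times; extract a shared function |
| 🟡 Suggestion | common_engine.py | On Windows, p.terminate() combined with shell=True cannot kill the child process tree |
| ❓ Question | engine.py | The spawn context requires all arguments to be picklable; compatibility needs confirming |

### Overall Assessment

The change heads in the right direction: the guard logic is clear and does not affect Linux behaviour. The main problems are incomplete coverage (about 8 files still hard-code /dev/shm) and a fair amount of duplicated code that would be better extracted into a shared utility function. Recommend merging once the missed files are covered.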
| address = (pod_ip, engine_worker_queue_port) | ||
| else: | ||
| address = f"/dev/shm/fd_task_queue_{engine_worker_queue_port}.sock" | ||
| _shm_base = "/dev/shm" if sys.platform != "win32" else tempfile.gettempdir() |
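🟡 Suggestion: /dev/shm replacement is incomplete
A repository-wide search shows that the following files still hard-code /dev/shm and are not covered by this PR:
- fastdeploy/entrypoints/openai/utils.py (2 occurrences: L117, L143)
- fastdeploy/cache_manager/multimodal_cache_manager.py (1 occurrence: L130)
- fastdeploy/input/multimodal_processor.py
- fastdeploy/input/ernie4_5_vl_processor/process.py
- fastdeploy/input/qwen_vl_processor/process.py
- fastdeploy/input/qwen3_vl_processor/process.py (2 occurrences)
- fastdeploy/input/paddleocr_vl_processor/process.py

Please cover these in this PR or a follow-up; otherwise execution on Windows will still fail when it reaches these paths.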
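
A repository-wide search like the one described above could be reproduced with a short script. This is only a sketch: the function name `find_hardcoded` is hypothetical, and it assumes a plain substring match over `.py` files is sufficient to locate remaining hard-coded paths.

```python
import os


def find_hardcoded(root, needle="/dev/shm"):
    """Return (path, line_number, line) for every .py line containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if not fname.endswith(".py"):
                continue  # only Python sources are of interest here
            path = os.path.join(dirpath, fname)
            # errors="ignore" keeps the scan going on files with odd encodings
            with open(path, encoding="utf-8", errors="ignore") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if needle in line:
                        hits.append((path, lineno, line.strip()))
    return hits
```

Calling `find_hardcoded("fastdeploy")` from the repository root would list the candidate locations; each hit still needs a manual check, since a match inside a comment or docstring is harmless.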
| else: | ||
| address = f"/dev/shm/fd_task_queue_{self.cfg.parallel_config.local_engine_worker_queue_port}.sock" | ||
| # Shared-memory base: /dev/shm on Linux, tempdir on Windows | ||
| _shm_base = "/dev/shm" if sys.platform != "win32" else tempfile.gettempdir() |
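🟡 Suggestion: the _shm_base ternary expression is repeated 10+ times in this PR; extract it into a shared utility
A constant or function could be defined in fastdeploy/utils/ or fastdeploy/envs.py, for example:

    # fastdeploy/utils/platform_compat.py
    import sys
    import tempfile

    # Shared-memory base: /dev/shm on Linux, the temp directory on Windows
    SHM_BASE = "/dev/shm" if sys.platform != "win32" else tempfile.gettempdir()

Each call site can then simply use from fastdeploy.utils.platform_compat import SHM_BASE, which lowers maintenance cost and makes future extension (such as supporting a custom path) easier.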
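
The "custom path" extension mentioned in this suggestion might look like the following sketch. It is not part of the PR: the environment variable name `FD_SHM_BASE` and the helper names `get_shm_base` and `shm_path` are hypothetical, chosen only for illustration.

```python
import os
import sys
import tempfile


def get_shm_base(env_var="FD_SHM_BASE"):
    """Return the directory used for shared-memory-style files.

    env_var is a hypothetical override; when it is unset, fall back to
    /dev/shm on POSIX platforms and the temp directory on Windows.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    if sys.platform == "win32":
        return tempfile.gettempdir()
    return "/dev/shm"


def shm_path(name):
    """Join a file name onto the platform-appropriate base directory."""
    return os.path.join(get_shm_base(), name)
```

A call site would then build its socket address as `shm_path(f"fd_task_queue_{port}.sock")` instead of repeating the ternary. A function rather than a module-level constant is used here so that the override is read at call time, which may or may not be the desired behaviour.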
| pgid = os.getpgid(self.worker_proc.pid) | ||
| os.killpg(pgid, signal.SIGTERM) | ||
| else: | ||
| self.worker_proc.terminate() |
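🟡 Suggestion: on Windows, p.terminate() combined with shell=True may fail to kill the actual child process
When a subprocess is started with shell=True, subprocess.Popen actually creates cmd.exe (Windows) or /bin/sh (Linux) as an intermediate process. On Windows, p.terminate() terminates only cmd.exe and not its child process tree, which can leave orphaned processes.
Consider using subprocess.Popen with creationflags=subprocess.CREATE_NEW_PROCESS_GROUP together with os.kill(p.pid, signal.CTRL_BREAK_EVENT), or use taskkill /F /T /PID to terminate the whole process tree:

    if sys.platform == "win32":
        # /T kills the process and every child it started; /F forces termination
        subprocess.call(["taskkill", "/F", "/T", "/PID", str(p.pid)])
    else:
        pgid = os.getpgid(p.pid)
        os.killpg(pgid, signal.SIGTERM)

| ) | |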
| ctx = multiprocessing.get_context("fork") | ||
| # Windows: "spawn" required since fork is unavailable | ||
| ctx = multiprocessing.get_context("spawn" if sys.platform == "win32" else "fork") |
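❓ Question: the spawn context requires every argument in Process(target=..., args=(...)) to be picklable
Unlike fork, spawn does not inherit the parent process's memory; it re-imports the module in a new process and unpickles the arguments. Please confirm that the start_data_parallel_service function and the cfg object passed to it (even though it has been deep-copied) will not fail to pickle, for example because they contain unpicklable locks, file handles, or a CUDA context.
If Windows ever actually runs this path, this could raise a PicklingError when spawn starts the process.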
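
One way to answer this question before running on Windows is to try pickling each attribute of the config object on Linux. The sketch below is an assumption-laden illustration: `find_unpicklable_attrs` and `DemoConfig` are hypothetical names, and the real cfg object may need a deeper, recursive check than this one-level scan.

```python
import pickle
import threading


def find_unpicklable_attrs(obj):
    """Map attribute name -> exception type name for attributes that fail to pickle."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickling errors vary (TypeError, PicklingError, ...)
            failures[name] = type(exc).__name__
    return failures


class DemoConfig:
    """Hypothetical stand-in for a config object passed to a spawned process."""

    def __init__(self):
        self.port = 8000               # plain data pickles fine
        self.lock = threading.Lock()   # locks cannot be pickled
```

Running `find_unpicklable_attrs(DemoConfig())` reports the `lock` attribute, which is the kind of field that would break `multiprocessing.get_context("spawn")` while passing unnoticed under `"fork"`.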
Motivation

Windows lacks POSIX primitives (os.setsid, os.killpg, os.fork, /dev/shm). This PR adds platform-conditional guards so that FastDeploy's Python runtime modules work on both Linux and Windows without breaking existing Linux behaviour.

Modifications

- Replace /dev/shm with tempfile.gettempdir() on Windows via a _shm_base helper variable in cache_messager.py, common_engine.py, engine.py, async_expert_loader.py, fmq.py, zmq_client.py, zmq_server.py, worker_process.py
- Guard os.killpg(os.getpgid(...)) with a sys.platform check, falling back to p.terminate() on Windows, in common_engine.py, engine.py, expert_service.py
- Skip preexec_fn=os.setsid on Windows via conditional kwargs in common_engine.py, engine.py, prefix_cache_manager.py
- Use "spawn" instead of "fork" on Windows in engine.py

Usage or Command

No API changes. Guards activate automatically when sys.platform == "win32".

Accuracy Tests

No behavioural change on Linux: all guards are behind sys.platform == "win32" checks.

Checklist