[CI]【Hackathon 10th Spring No.39】fused_moe_marlin_backend.py unit test#7494
bobby-cloudforge wants to merge 1 commit into PaddlePaddle:develop
Thanks for your contribution!
8de7a8d to
c849133
PaddlePaddle-bot
left a comment
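🤖 AI Code Review
2026-04-20 05:01 CST

## 📋 Review Summary

PR overview: Adds unit tests for fused_moe_marlin_backend.py, covering the pure functions (get_scale_perms, marlin_permute_scales, etc.) and the create_weights / process_loaded_weights / apply methods of MarlinWeightOnlyMoEMethod.
Change scope: tests/layers/ (new test file)
Impact tags: CI, OP

### Issues

| Severity | File | Summary |
|------|------|------|
| 🟡 Suggestion | test_fused_moe_marlin_backend.py:38 | With the _NEED_STUB mechanism, per-test mocks may not take effect when the GPU ops are available |
| 🟡 Suggestion | test_fused_moe_marlin_backend.py:283 | test_apply_topk and test_apply_noaux_tc duplicate a large amount of mock setup; consider extracting a shared fixture |

### Overall assessment

The test file is clearly structured. _DummyLayer accurately covers every layer attribute that MarlinWeightOnlyMoEMethod actually accesses, and the pure-function tests (TestPureFunctions) verify the real computation logic. The main areas for improvement are the heavily duplicated mock setup between the two apply tests and the robustness of the stub mechanism in environments where a GPU is available.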
    _GPU_OPS = "fastdeploy.model_executor.ops.gpu"
    _DEEP_GEMM = f"{_GPU_OPS}.deep_gemm"

    _NEED_STUB = _GPU_OPS not in sys.modules
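🟡 Suggestion: When the GPU ops are available (_NEED_STUB = False), the mb module is imported at top level against the real module, so the gpu_ops reference it holds internally points at the real module object. The test methods, however, call patch.object(_gpu_ops_stub, "gptq_marlin_repack", ...), which modifies the stub object rather than the real module. The mock therefore has no effect, and the tests may call the real GPU operators directly and fail.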
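The failure mode described above can be reproduced without Paddle or a GPU. The following is a minimal sketch that assumes hypothetical stand-in modules (`real_ops`, `backend`, `stub_ops`, none of which exist in FastDeploy); it only illustrates that `patch.object` affects the object it is given, not another object that happens to play the same role.

```python
import types
from unittest.mock import patch

# Hypothetical stand-ins: real_ops plays the role of the real GPU ops module,
# backend plays the role of the module under test that imported it.
real_ops = types.ModuleType("real_ops")
real_ops.repack = lambda: "real kernel"

backend = types.ModuleType("backend")
backend.gpu_ops = real_ops  # reference captured at import time
backend.run = lambda: backend.gpu_ops.repack()

# A separate stub object that the backend never imported.
stub_ops = types.ModuleType("stub_ops")
stub_ops.repack = lambda: "stub"

# Patching the stub leaves the backend's call path untouched.
with patch.object(stub_ops, "repack", lambda: "mocked"):
    result_wrong_target = backend.run()

# Patching the object the backend actually holds takes effect.
with patch.object(backend.gpu_ops, "repack", lambda: "mocked"):
    result_right_target = backend.run()

print(result_wrong_target, "|", result_right_target)
```

In this sketch the first call still reaches the "real" function, which mirrors why the per-test mocks would be bypassed when the stub and the imported module are different objects.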
Suggested improvement: in the _NEED_STUB = False branch, point _gpu_ops_stub at the real module so that the per-test patch.object calls always act on the correct target:
    if _NEED_STUB:
        # ... existing logic ...
    else:
        from fastdeploy.model_executor.layers.moe import fused_moe_marlin_backend as mb
        _gpu_ops_stub = sys.modules[_GPU_OPS]  # keep the patch.object target consistent

Diff context for the next comment:

        paddle.zeros([g.shape[0], k], "int64"),
    )

    with (
🟡 Suggestion: test_apply_topk (L219-260) and test_apply_noaux_tc (L283-320) share five nearly identical patch setups (sys.modules, gptq_marlin_repack, MoeWna16MarlinGemmApi, tritonmoe_preprocess_func, swiglu), so the code is heavily duplicated.
Sibling test files in the same directory (test_fused_moe_cutlass_backend.py, test_fused_moe_triton_backend.py) manage their mocks with the pytest monkeypatch fixture. Consider extracting the shared mocks into a pytest fixture or a contextmanager helper, for example:
    from contextlib import contextmanager

    @contextmanager
    def _mock_gpu_ops(extra_modules=None):
        modules = {_GPU_OPS: _gpu_ops_stub, _DEEP_GEMM: _deep_gemm_stub}
        if extra_modules:
            modules.update(extra_modules)
        with (
            patch.dict(sys.modules, modules, clear=False),
            patch.object(_gpu_ops_stub, "gptq_marlin_repack",
                         lambda w, p, sk, sn, nb: paddle.zeros([sk // 16, sn * (nb // 2)], dtype=w.dtype)),
            patch.object(mb, "MoeWna16MarlinGemmApi",
                         lambda *_a, **kw: (paddle.zeros([kw["size_m"], kw["size_n"]], "float32"),)),
            patch.object(mb, "tritonmoe_preprocess_func",
                         lambda ids, ne, bm: (paddle.zeros([4], "int32"), paddle.zeros([1], "int32"), paddle.to_tensor([4], "int32"))),
            patch("paddle.incubate.nn.functional.swiglu", lambda x: x[..., : x.shape[-1] // 2], create=True),
        ):
            yield

With this helper, each test method only needs to add the mocks specific to it (such as moe_topk_select or _moe_stub), which improves both readability and maintainability.
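How a test would layer its own mock on top of such a shared helper can be sketched in isolation. The example below is an assumption-laden illustration: `backend`, `gemm`, `topk`, and `common_mocks` are hypothetical names standing in for the real module and operators, and only the standard library is used.

```python
import types
from contextlib import contextmanager
from unittest.mock import patch

# Hypothetical stand-in for the backend module; names are illustrative only.
backend = types.ModuleType("backend")
backend.gemm = lambda x: ("real gemm", x)
backend.topk = lambda x: ("real topk", x)
backend.apply = lambda x: (backend.topk(x)[0], backend.gemm(x)[0])


@contextmanager
def common_mocks():
    """Patches shared by every apply-style test."""
    with patch.object(backend, "gemm", lambda x: ("fake gemm", x)):
        yield


def run_topk_case():
    # The test adds only the mock that is specific to it.
    with common_mocks(), patch.object(backend, "topk", lambda x: ("fake topk", x)):
        return backend.apply(1)


def run_default_case():
    # A test with no extra mocks still gets the shared ones.
    with common_mocks():
        return backend.apply(1)


topk_result = run_topk_case()
default_result = run_default_case()
restored = backend.apply(1)  # all patches are undone after the with blocks
```

The shared patches live in one place, and each case states only what differs, which is the maintainability gain the review comment is pointing at.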
Motivation
No.39: unit test coverage for the module fastdeploy/model_executor/layers/moe/fused_moe_marlin_backend.py
Modifications
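Adds the unit test file tests/layers/test_fused_moe_marlin_backend.py
develop branch: 0% coverage, 115 missed lines (17-361)
This PR: 100% coverage, 0 missed lines
Increase in covered lines: 115 - 0 = 115 → rounded to 100 → estimated contribution 0.1⭐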
Usage or Command
Accuracy Tests
N/A
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.