[Feature]【Hackathon 10th Spring No.47】Add MiniMax-M1 integration tests and multi-GPU support by bobby-cloudforge · Pull Request #7511 · PaddlePaddle/FastDeploy

bobby-cloudforge · 2026-04-20T06:56:12Z

Motivation

Companion to the MiniMax-M1 model PR — adds integration tests and multi-GPU validation infrastructure for Hackathon 10th Spring No.47.

Modifications

Integration Tests (`tests/model_executor/test_minimax_m1_integration.py`)

End-to-end construction + forward pass tests with full model config
Multi-layer interaction tests (linear + full attention)
Weight loading validation (v0 and v1 paths)

Multi-GPU Validation Script (`scripts/validate_minimax_m1_multigpu.sh`)

Automated tensor-parallel validation script for 2/4/8 GPU configurations
Includes correctness checks and basic throughput measurement

Test Infrastructure (`tests/model_executor/conftest.py`)

Shared fixtures for model executor tests
Config builder helpers for MiniMax-M1 test variants

Model Base Extension (`fastdeploy/model_executor/models/model_base.py`)

Minor extension to support MiniMax-M1 linear attention state management

Usage or Command

# Run integration tests
pytest tests/model_executor/test_minimax_m1_integration.py -v

# Multi-GPU validation (requires 8 GPUs)
bash scripts/validate_minimax_m1_multigpu.sh

Accuracy Tests

Integration tests verify:

Model construction with correct layer type dispatch (linear vs full attention)
Forward pass shape correctness through mixed attention pipeline
Weight loading key mapping for both v0 and v1 loaders
DeepNorm scaling coefficients applied correctly

All tests use monkeypatch.setattr + real objects (no MagicMock).

Checklist

Integration tests for mixed attention pipeline
Multi-GPU validation script
Shared test fixtures
Pre-commit hooks passing


paddle-bot · 2026-04-20T06:56:42Z

Thanks for your contribution!

- scripts/validate_minimax_m1_multigpu.sh: fix Tier 2 RESPONSE not reaching Python (use env var instead of stdin); pipe $MODELS via stdin in Tier 1 to avoid triple-quote injection; use jq in send_chat for safe JSON
- model_base.py: warn on architecture registration overwrite
- lightning_attn.py: use None + conditional add instead of int 0 accumulator
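
The `lightning_attn.py` item can be illustrated with a minimal sketch. It assumes plain Python lists in place of tensors, and the function name `accumulate_blocks` is hypothetical, not taken from the PR. The commit message does not state the motivation; one plausible reading is that a `None` start avoids returning a bare integer `0` when no block contributes.

```python
from typing import Iterable, List, Optional


def accumulate_blocks(blocks: Iterable[List[float]]) -> Optional[List[float]]:
    """Element-wise sum of equally sized blocks, starting from None.

    Hypothetical stand-in for a per-block tensor accumulation loop.
    """
    total: Optional[List[float]] = None
    for block in blocks:
        if total is None:
            # First contribution: copy it instead of adding to an int 0.
            total = list(block)
        else:
            # Later contributions: element-wise add to the running total.
            total = [a + b for a, b in zip(total, block)]
    return total
```

With this pattern an empty input yields `None`, which the caller can test for explicitly, rather than an integer whose type differs from every non-empty result.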

PaddlePaddle-bot

🤖 AI Code Review | 2026-04-20 17:05:25

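📋 Review Summary

PR overview: Adds a complete model implementation, a Lightning Attention Triton kernel, integration tests, a multi-GPU validation script, and documentation for the MiniMax-M1 (456B MoE) model
Change scope: model_executor/models/, model_executor/ops/triton_ops/, tests/, scripts/, docs/
Impact tags: Models OP Docs/CI

📝 PR Convention Check

The [Feature] tag in the PR title is valid, and the description includes Motivation/Modifications with sufficient content, so the PR meets the conventions.

Issues

Severity	File	Summary
🔴 Bug	`scripts/validate_minimax_m1_multigpu.sh:213`	The Tier 2 Python heredoc calls `sys.exit(1)` without importing the `sys` module
🟡 Suggestion	`fastdeploy/model_executor/models/model_base.py:312`	`logging.warning()` should be replaced with the project logging framework
🟡 Suggestion	`fastdeploy/model_executor/models/minimax_m1.py:376`	The `_kv_history` instance variable stores KV state, which risks cache pollution under concurrent requests (already marked with a TODO)

Overall Assessment

The overall implementation quality is high. The model architecture (hybrid linear/full attention + MoE) reuses FastDeploy's existing layer abstractions, weight loading supports both the HF v0 and v1 paths, and test coverage is thorough (including a pure-Python reference implementation that verifies Lightning Attention correctness). The main problem is the missing sys import in the validation script, which prevents the Tier 2 test from exiting with the correct error message.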

PaddlePaddle-bot · 2026-04-20T09:09:31Z

+resp = json.loads(os.environ["RESPONSE"])
+if "choices" not in resp or len(resp["choices"]) == 0:
+    print(f"❌ Tier 2 FAIL: No choices in response: {resp}")
+    sys.exit(1)


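🔴 Bug: sys.exit(1) is called but the sys module is not imported.

The Python heredoc at line 208 only has import json, os and lacks import sys. The sys.exit(1) calls on lines 213 and 221 will raise NameError: name 'sys' is not defined, so Tier 2 validation cannot print the correct error message when inference fails. The Python heredoc still exits with a non-zero status because of the uncaught exception, but the reported error is a NameError rather than the inference-failure message, which is misleading.

Suggested fix for line 208: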

import json, os, sys
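
The check discussed in this comment can be sketched as a standalone function. This is a minimal sketch, assuming the response arrives as a JSON string in the `RESPONSE` environment variable as in the diff; the names `check_tier2_response` and `main` and the return-code convention are hypothetical and not taken from the script.

```python
import json
import os
import sys


def check_tier2_response(raw: str) -> int:
    """Return 0 if the response contains at least one choice, else 1."""
    try:
        resp = json.loads(raw)
    except json.JSONDecodeError as exc:
        print(f"Tier 2 FAIL: response is not valid JSON: {exc}")
        return 1
    if not isinstance(resp, dict) or not resp.get("choices"):
        print(f"Tier 2 FAIL: no choices in response: {resp}")
        return 1
    return 0


def main() -> None:
    # sys.exit resolves only because `sys` is imported at the top;
    # without that import this line raises NameError instead.
    sys.exit(check_tier2_response(os.environ.get("RESPONSE", "")))
```

Returning a status code from the check and calling `sys.exit` in one place keeps the validation logic testable without terminating the interpreter.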

PaddlePaddle-bot · 2026-04-20T09:09:32Z

        def _register(model_cls):
            # Traditional registration for ModelForCasualLM subclasses
            cls._arch_to_model_cls[model_cls.name()] = model_cls
+            if architecture:


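🟡 Suggestion: This code uses the standard library's logging.warning(), whereas the rest of the project (including minimax_m1.py, added in this PR) consistently uses paddleformers.utils.log.logger.

Consider switching to the project logging framework for consistency:

from paddleformers.utils.log import logger
# ...
logger.warning("Overwriting model registration for architecture '%s'", architecture)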
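
The overwrite warning can be sketched in isolation as follows. This is a minimal sketch under stated assumptions: a standard-library logger stands in for the project logger so the example runs without `paddleformers`, and the class `ModelRegistrySketch`, its `register` method, and the choice to return the previous entry are hypothetical rather than FastDeploy's actual API.

```python
import logging
from typing import Dict, Optional

# Stand-in for the project logger; the real code would import it
# from paddleformers.utils.log as the review suggests.
logger = logging.getLogger("model_registry_sketch")


class ModelRegistrySketch:
    """Hypothetical registry keyed by architecture name."""

    _arch_to_model_cls: Dict[str, type] = {}

    @classmethod
    def register(cls, architecture: str, model_cls: type) -> Optional[type]:
        """Register model_cls and return the entry it replaced, if any."""
        previous = cls._arch_to_model_cls.get(architecture)
        if previous is not None and previous is not model_cls:
            # Warn only when a different class is being replaced.
            logger.warning(
                "Overwriting model registration for architecture '%s'",
                architecture,
            )
        cls._arch_to_model_cls[architecture] = model_cls
        return previous
```

Routing the message through a single module-level `logger` means swapping the stand-in for the project logger is a one-line change.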

PaddlePaddle-bot · 2026-04-20T09:09:32Z

+                dtype=q.dtype,
+            )
+
+        # Apply lightning attention (returns 4D kv_history, not 5D concat)


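🟡 Suggestion: _kv_history uses an instance variable to store the recurrent KV state of linear attention, which causes cross-request cache pollution when serving concurrent requests.

The current check self._kv_history.shape[0] != batch_size only compares the batch dimension: if different requests happen to have the same batch_size, the residual state from the previous request is still reused. The code already carries a TODO about migrating to a slot-based cache, and the documentation notes the known limitation, both of which are good. Consider stating the priority and timeline of the follow-up migration in the PR description or in an issue.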
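
The pollution scenario can be reproduced with a toy model. This is a minimal sketch, assuming a running float sum as a stand-in for the recurrent KV state; the classes `InstanceStateSketch` and `SlotStateSketch` and the use of request ids as slot keys are hypothetical and do not reflect the PR's code or FastDeploy's cache design.

```python
from typing import Dict, List, Optional


class InstanceStateSketch:
    """Mimics the reviewed pattern: state is reset only when batch size changes."""

    def __init__(self) -> None:
        self._kv_history: Optional[List[float]] = None

    def step(self, batch: List[float]) -> List[float]:
        if self._kv_history is None or len(self._kv_history) != len(batch):
            self._kv_history = [0.0] * len(batch)
        # Recurrent update: new state = old state + current contribution.
        self._kv_history = [h + x for h, x in zip(self._kv_history, batch)]
        return list(self._kv_history)


class SlotStateSketch:
    """Keys state by request id so equal batch sizes cannot share state."""

    def __init__(self) -> None:
        self._slots: Dict[str, float] = {}

    def step(self, request_ids: List[str], batch: List[float]) -> List[float]:
        out = []
        for rid, x in zip(request_ids, batch):
            self._slots[rid] = self._slots.get(rid, 0.0) + x
            out.append(self._slots[rid])
        return out

    def release(self, request_id: str) -> None:
        # Drop the state once the request finishes.
        self._slots.pop(request_id, None)
```

Calling `InstanceStateSketch.step` twice with unrelated batches of the same size carries the first batch's state into the second, while `SlotStateSketch` starts each new request id from zero.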

[Feature]【Hackathon 10th Spring No.47】Add MiniMax-M1 integration test…

89bce87

…s and multi-GPU support

bobby-cloudforge had a problem deploying to Metax_ci April 20, 2026 06:56 — with GitHub Actions Error

paddle-bot bot added the contributor External developers label Apr 20, 2026

bobby-cloudforge had a problem deploying to Metax_ci April 20, 2026 08:20 — with GitHub Actions Failure

PaddlePaddle-bot reviewed Apr 20, 2026

bobby-cloudforge wants to merge 2 commits into PaddlePaddle:develop from CloudForge-Solutions:task/h10-047-minimax-m1-integration1
