Commit bbe9bc0

wangxiyuan and GDzhu01 authored
[v0.13.0][Feature] Add DeepSeek v4 initial support (#8648)
### What this PR does / why we need it?

**DeepSeek V4 Support**: Adds support for DeepSeek V4 by introducing new operators and infrastructure, including the Compressor operator and its associated tiling logic.

Note: this PR targets v0.13.0. After it merges, a special release, v0.13.0rc3, will be published. The vLLM Ascend team will also begin rebasing this work onto the main branch soon.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- CI passed
- Ran an e2e test with https://modelscope.cn/models/deepseek-ai/DeepSeek-V4-Flash

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: GDzhu01 <809721801@qq.com>
Signed-off-by: LookAround0301 <lixushi@huawei.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
Signed-off-by: WithHades<244036962@qq.com>
Signed-off-by: zhangsicheng5 <zhangsicheng5@huawei.com>
Signed-off-by: coder-fny <985619145@qq.com>
Signed-off-by: slippersss <slippersss@126.com>
Signed-off-by: yiz-liu <liu_yizhou@outlook.com>
Signed-off-by: maoxx241 <maomaoyu870@gmail.com>
Signed-off-by: zhaozx-cn <zhaozx2116@163.com>
Signed-off-by: wxh571001500 <571001500@qq.com>
Signed-off-by: lcfenglinwan <lcfenglin@qq.com>
Signed-off-by: zhenwenqi_2024 <zhenwenqi_2022@qq.com>
Signed-off-by: anon189Ty <Stari_Falcon@outlook.com>
Signed-off-by: monologue815 <monologue815@qq.com>
Signed-off-by: Liexss <924834690@qq.com>
Signed-off-by: pinfa <1819563383@qq.com>
Signed-off-by: weinachuan<1173732899@qq.com>
Signed-off-by: chenchris2 <1349418798@qq.com>
Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Signed-off-by: wxsIcey <1790571317@qq.com>
Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: GDzhu01 <809721801@qq.com>
1 parent 2e5f72f commit bbe9bc0

900 files changed

Lines changed: 136446 additions & 1786 deletions

.github/workflows/_e2e_test.yaml

Lines changed: 3 additions & 0 deletions
@@ -84,6 +84,7 @@ jobs:
  VLLM_WORKER_MULTIPROC_METHOD: spawn
  if: ${{ inputs.type == 'light' }}
  run: |
+ . vllm_ascend/_cann_ops_custom/vendors/custom_transformer/bin/set_env.bash
  pytest -sv --durations=0 tests/e2e/singlecard/test_aclgraph_accuracy.py::test_piecewise_res_consistency
  pytest -sv --durations=0 tests/e2e/singlecard/test_quantization.py::test_qwen3_w8a8_quant

@@ -93,6 +94,7 @@ jobs:
  PYTORCH_NPU_ALLOC_CONF: max_split_size_mb:256
  if: ${{ inputs.type == 'full' }}
  run: |
+ . vllm_ascend/_cann_ops_custom/vendors/custom_transformer/bin/set_env.bash
  # We found that if running aclgraph tests in batch, it will cause AclmdlRICaptureBegin error. So we run
  # the test separately.
  # basic
@@ -194,6 +196,7 @@ jobs:
  VLLM_WORKER_MULTIPROC_METHOD: spawn
  if: ${{ inputs.type == 'light' }}
  run: |
+ . vllm_ascend/_cann_ops_custom/vendors/custom_transformer/bin/set_env.bash
  pytest -sv --durations=0 tests/e2e/multicard/2-cards/test_qwen3_moe.py::test_qwen3_moe_distributed_mp_tp2_ep

  - name: Run vllm-project/vllm-ascend test (full)

.github/workflows/_unit_test.yaml

Lines changed: 6 additions & 1 deletion
@@ -80,7 +80,12 @@ jobs:
  --ignore tests/ut/kv_connector/test_remote_decode_lifecycle.py \
  --ignore tests/ut/core/test_scheduler_dynamic_batch.py \
  --ignore tests/ut/kv_connector/test_mooncake_connector.py \
- --ignore tests/ut/worker/test_worker_v1.py
+ --ignore tests/ut/worker/test_worker_v1.py \
+ --ignore tests/ut/distributed/test_parallel_state.py \
+ --ignore tests/ut/kv_connector/test_mooncake_layerwise_connector.py \
+ --ignore tests/ut/quantization/test_utils.py \
+ --ignore tests/ut/spec_decode/test_mtp_proposer.py \
+ --ignore tests/ut/test_platform.py

  - name: Upload coverage to Codecov
  # only upload coverage when commits merged

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -210,3 +210,5 @@ kernel_meta/
  # generated by CANN
  fusion_result.json
  csrc/output/
+ csrc/build_out
+ csrc/third_party

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ default_install_hook_types:
  default_stages:
  - pre-commit # Run locally
  - manual # Run in CI
- exclude: 'examples/.*' # Exclude examples from all hooks by default
+ exclude: 'examples/.*|csrc/.*' # Exclude examples from all hooks by default
  repos:
  - repo: https://github.com/codespell-project/codespell
  rev: v2.4.1
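
Reviewer note: the widened `exclude` pattern above can be checked with a small Python sketch. It assumes pre-commit applies `exclude` as a Python regular expression using `re.search` against repo-relative paths. The sample paths and the `is_excluded` helper are hypothetical and not part of this commit.

```python
import re

# Pattern copied from the updated .pre-commit-config.yaml.
EXCLUDE = re.compile(r"examples/.*|csrc/.*")


def is_excluded(path: str) -> bool:
    """Return True if a repo-relative path would be skipped by all hooks.

    Assumption: matching is done with re.search, so the pattern is unanchored.
    """
    return EXCLUDE.search(path) is not None


# Hypothetical paths, for illustration only.
sample_paths = [
    "csrc/build_out/op.cpp",
    "examples/run.py",
    "vllm_ascend/worker.py",
    "tests/examples/x.py",
]

for path in sample_paths:
    print(f"{path}: excluded={is_excluded(path)}")
```

Because the pattern is unanchored, a nested path such as `tests/examples/x.py` would also be skipped under this assumption. Anchoring with `^(examples|csrc)/` would restrict the match to the top-level directories, if that is the intent.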
