[fix] resolve negative dimension when creating tensor in multi-layer MTP #22666

Open
fromck wants to merge 1 commit into sgl-project:main from fromck:dev

fromck commented Apr 13, 2026

Motivation

Launching the MiMo-V2-Flash model on an H20 fails while creating a tensor with a negative dimension. The server arguments and the resulting error are shown below.

export MODEL_PATH=/ssd3/models/MiMo-V2-Flash
export PORT=8889
export HOST_IP=0.0.0.0

SGLANG_ENABLE_SPEC_V2=1 python3 -u -m sglang.launch_server \
    --model-path $MODEL_PATH \
    --served-model-name MiMo-V2-Flash \
    --page-size 64 \
    --max-total-tokens 798720 \
    --decode-log-interval 60 \
    --disable-radix-cache \
    --host $HOST_IP \
    --port $PORT \
    --trust-remote-code \
    --tp-size 8 \
    --dp-size 2 \
    --enable-dp-attention \
    --max-running-requests 64 \
    --chunked-prefill-size 4096 \
    --disable-overlap-schedule \
    --attention-backend fa3 \
    --cuda-graph-max-bs 64 \
    --mem-fraction-static 0.9 \
    --disable-cuda-graph \
    --init-expert-location trivial \
    --ep-num-redundant-experts 0 \
    --eplb-algorithm deepseek \
    --ep-dispatch-algorithm static \
    --ep-size 8 \
    --speculative-algorithm EAGLE \
    --speculative-num-steps 3 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 4
[Image: log1]

Modifications

spec_info.num_tokens_per_req defaults to -1. If it is left at that default, the negative value ends up in a tensor shape and tensor creation fails, so this PR sets it to the correct value first.
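To illustrate the failure mode, here is a minimal Python sketch of how a -1 sentinel can leak into a buffer shape and how setting the field beforehand avoids it. It is not the code changed in this PR: `SpecInfo`, `buffer_shape`, and `ensure_num_tokens_per_req` are hypothetical names, and the value 4 is only borrowed from `--speculative-num-draft-tokens 4` in the launch command as an example, not as the value the fix actually uses.

```python
from dataclasses import dataclass


@dataclass
class SpecInfo:
    # Hypothetical stand-in for the real spec_info object. Only the -1
    # default of num_tokens_per_req comes from the PR description.
    num_tokens_per_req: int = -1


def buffer_shape(batch_size: int, spec_info: SpecInfo) -> tuple:
    """Shape of a flat per-token buffer: batch_size * tokens per request."""
    num_tokens = batch_size * spec_info.num_tokens_per_req
    if num_tokens < 0:
        # A tensor library would reject this shape; fail early with context.
        raise ValueError(
            f"negative dimension {num_tokens}: num_tokens_per_req is "
            f"{spec_info.num_tokens_per_req}"
        )
    return (num_tokens,)


def ensure_num_tokens_per_req(spec_info: SpecInfo, tokens_per_req: int) -> SpecInfo:
    """Replace the -1 sentinel with a caller-supplied value (assumed correct)."""
    if spec_info.num_tokens_per_req < 0:
        spec_info.num_tokens_per_req = tokens_per_req
    return spec_info


info = SpecInfo()
try:
    buffer_shape(8, info)  # sentinel still in place, so this raises
except ValueError as exc:
    print(exc)

ensure_num_tokens_per_req(info, 4)  # 4 is an illustrative value only
print(buffer_shape(8, info))  # (32,)
```

The explicit check in `buffer_shape` is a design choice for the sketch: it reports which field was unset instead of leaving the tensor library to raise a generic negative-dimension error.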

Accuracy Tests

[Image: log2]

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

…rker v2

Signed-off-by: fromck <1783281729@qq.com>
