Open
Description
Checklist
- I searched related issues but found no solution.
- The bug persists in the latest version.
- Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
- If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- Please use English. Otherwise, it will be closed.
Describe the bug
With PD disaggregation enabled (prefill and decode servers both launched with --pipeline-parallel-size 2 and the nixl transfer backend), a chat completion request sent through the router returns garbled text instead of a greeting. The launch commands, the request, and the response follow.
# Prefill server
MODEL="Qwen/Qwen2.5-1.5B-Instruct"
python -m sglang.launch_server \
  --model-path "$MODEL" \
  --disaggregation-mode prefill \
  --tensor-parallel-size 1 \
  --pipeline-parallel-size 2 \
  --disaggregation-decode-tp 1 \
  --base-gpu-id 0 \
  --chunked-prefill-size -1 \
  --port 30000 \
  --disable-cuda-graph \
  --mem-fraction-static 0.7 \
  --max-total-tokens 10000 \
  --disaggregation-transfer-backend nixl
# Decode server
python -m sglang.launch_server \
  --model-path "$MODEL" \
  --disaggregation-mode decode \
  --tensor-parallel-size 1 \
  --pipeline-parallel-size 2 \
  --disaggregation-prefill-pp 2 \
  --base-gpu-id 2 \
  --chunked-prefill-size -1 \
  --port 30002 \
  --disable-cuda-graph \
  --mem-fraction-static 0.7 \
  --max-total-tokens 10000 \
  --disaggregation-transfer-backend nixl
# Router
python -m sglang_router.launch_router \
  --pd-disaggregation \
  --prefill http://127.0.0.1:30000 \
  --decode http://127.0.0.1:30002 \
  --host 0.0.0.0 \
  --port 8000
(.venv) root@f27259663acf:~/sglang# curl http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "default",
"messages": [{"role":"user","content":"Say hello in one short sentence."}],
"max_tokens": 32,
"temperature": 0.2,
"repetition_penalty": 1.2
}'
{"id":"395e8aee641841ca847d6bff5b37211f","object":"chat.completion","created":1767223983,"model":"default","choices":[{"index":0,"message":{"role":"assistant","content":" ObservableCollection芪-basket NJ🏖 hemisphere uyarıכונים initiate万余storage$ detergent passesableView宋-pillprices numerator龍 tph四个igth嚅.notification analytics.Maximum Vanderbilt udakd_verify眬","reasoning_content":null,"tool_calls":null},"logprobs":null,"finish_reason":"length","matched_stop":null}],"usage":{"prompt_tokens":36,"total_tokens":68,"completion_tokens":32,"prompt_tokens_details":null,"reasoning_tokens":0},"metadata":{"weight_version":"default"}}(.venv) root@f27259663acf:~/sglang#
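For anyone who wants to script the reproduction, here is a rough Python equivalent of the curl request above. The router URL and request body are copied from this report. The `looks_garbled` helper is purely hypothetical and not part of sglang: it is a crude heuristic that flags text mixing letters from more than one Unicode script, which happens to match the output shown here but is not a general correctness check.

```python
import json
import unicodedata
import urllib.request

# Router address taken from the launch command in this report.
ROUTER_URL = "http://127.0.0.1:8000/v1/chat/completions"


def build_payload():
    """Same request body as the curl command in this report."""
    return {
        "model": "default",
        "messages": [
            {"role": "user", "content": "Say hello in one short sentence."}
        ],
        "max_tokens": 32,
        "temperature": 0.2,
        "repetition_penalty": 1.2,
    }


def send(url=ROUTER_URL, payload=None, timeout=60):
    """POST the payload to the router and return the parsed JSON response."""
    body = json.dumps(payload or build_payload()).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scripts_used(text):
    """Collect the first word of each letter's Unicode name (LATIN, CJK, HEBREW, ...)."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def looks_garbled(text, max_scripts=1):
    """Hypothetical heuristic: a short greeting should not mix several scripts."""
    return len(scripts_used(text)) > max_scripts


def main():
    # Requires the prefill server, decode server, and router to be running.
    reply = send()
    content = reply["choices"][0]["message"]["content"]
    print(repr(content))
    print("garbled:", looks_garbled(content))
```

Calling `main()` with the three processes running should print the reply and whether the heuristic flags it. The script check is only a convenience for quickly comparing configurations (for example, with and without pipeline parallelism).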
Related: #15571
Reproduction
As titled; see the commands under "Describe the bug" above.
Environment
8x A6000 GPUs
8518455