
Commit 76c1acf

GLM-5.1: drop --detokenizer-worker-num 4 (multi-detokenizer broken in dev-cu12)
The dev-cu12 sglang build crashes immediately on the first chat request when --detokenizer-worker-num >1 is set:

    File 'multi_tokenizer_mixin.py', line 494, in event_loop
      ipcs is not None
    AssertionError: Batch req recv_obj.rids=['...'] has invalid http_worker_ipcs
    SIGQUIT received.

Reverts that one flag to default (1).

Keeps --chunked-prefill-size 8192 and --watchdog-timeout 600 since those are plain config tweaks and the watchdog increase still helps the detokenizer-falling-behind pattern.
1 parent 528ff4c commit 76c1acf

1 file changed

Lines changed: 0 additions & 1 deletion


GLM-5.1.yaml

@@ -126,7 +126,6 @@ services:
 --model-loader-extra-config '{"enable_multithread_load": "true", "num_threads": 64}'
 --enable-mixed-chunk
 --chunked-prefill-size 8192
---detokenizer-worker-num 4
 --watchdog-timeout 600
 --port 8000
 --host 0.0.0.0
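
To keep the flag from creeping back in while the multi-detokenizer path is broken, a small check along these lines could run against the launch command. This is only a sketch: the helper names `detokenizer_worker_num` and `uses_multi_detokenizer` are hypothetical and not part of sglang, and it assumes the command is available as a single shell-style string. The default of 1 comes from the commit message above.

```python
import shlex

FLAG = "--detokenizer-worker-num"


def detokenizer_worker_num(command: str, default: int = 1) -> int:
    """Return the value given to --detokenizer-worker-num, or `default` if the flag is absent."""
    tokens = shlex.split(command)  # handles quoted arguments such as the JSON extra-config
    for i, tok in enumerate(tokens):
        # Space-separated form: --detokenizer-worker-num 4
        if tok == FLAG and i + 1 < len(tokens):
            return int(tokens[i + 1])
        # Equals form: --detokenizer-worker-num=4
        if tok.startswith(FLAG + "="):
            return int(tok.split("=", 1)[1])
    return default


def uses_multi_detokenizer(command: str) -> bool:
    """True when the command asks for more than one detokenizer worker."""
    return detokenizer_worker_num(command) > 1


if __name__ == "__main__":
    before = "--chunked-prefill-size 8192 --detokenizer-worker-num 4 --watchdog-timeout 600"
    after = "--chunked-prefill-size 8192 --watchdog-timeout 600"
    print(uses_multi_detokenizer(before), uses_multi_detokenizer(after))
```

Run against the command before and after this commit, the check should flag only the former.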
