Enable MTP speculative decoding:
`speculative-config: '{"method":"qwen3_next_mtp","num_speculative_tokens":2}'`
When enabling MTP (Multi-Token Prediction) speculative decoding for `Qwen3_5MoeForConditionalGeneration`, the engine fails while loading the drafter model weights with:
      File ".../vllm/model_executor/models/qwen3_5_mtp.py", line 439, in load_weights
        return loader.load_weights(remap_weight_names(weights))
      File ".../vllm/model_executor/models/utils.py", line 328, in _load_module
        raise ValueError(msg)
    ValueError: There is no module or parameter named 'language_model' in Qwen3_5MoeMTP.
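
The error message suggests that at least one weight name still begins with a `language_model` component when it reaches the drafter, which has no submodule of that name. As a minimal sketch of that failure mode, and not vLLM's actual code or fix, the snippet below strips such a prefix from `(name, tensor)` pairs before loading. The function `strip_prefix`, the constant `ASSUMED_PREFIX`, and the example weight names are all hypothetical and chosen only for illustration; the real checkpoint layout may differ.

```python
from typing import Any, Iterable, Iterator, Tuple

# Assumption: the offending prefix is the component named in the ValueError.
ASSUMED_PREFIX = "language_model."


def strip_prefix(
    weights: Iterable[Tuple[str, Any]],
    prefix: str = ASSUMED_PREFIX,
) -> Iterator[Tuple[str, Any]]:
    """Yield (name, tensor) pairs with a leading `prefix` removed from the name.

    Names that do not start with `prefix` pass through unchanged.
    """
    for name, tensor in weights:
        if name.startswith(prefix):
            name = name[len(prefix):]
        yield name, tensor


# Hypothetical weight names; integers stand in for tensors.
example = [
    ("language_model.mtp.fc.weight", 0),
    ("mtp.norm.weight", 1),
]
remapped = list(strip_prefix(example))
```

Whether a remap like this belongs in `remap_weight_names` or elsewhere in the loader is a question for the maintainers; the sketch only shows the kind of name mismatch the traceback points to.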