[Model] Add Qwen3 and allow switching between thinking and non-thinking mode #75
@Neet-Nestor Hi Nestor! Could you review this PR when you get a chance? Currently, changing settings on the settings page does not seem to be very robust. Specifically, I added an
Edit: should be fixed now
Never mind, the settings toggling should work now. There was a bug in
@CharlieFRuan I made several changes:
I have tested and verified these changes with Qwen3-0.6B. In the first screenshot the thinking toggle is on, while in the second it is off.
@Neet-Nestor Thank you so much! These are great changes!
This PR adds Qwen3 models from WebLLM 0.2.79.
We also add a config option, `enable_thinking`, that can be toggled in the settings page to decide whether the Qwen3 model should reason. By default, it is turned off. Turning it on makes the model reason and also changes its `temperature` and `top_p`, following https://huggingface.co/Qwen/Qwen3-0.6B#best-practices

In addition, when we want to summarize a conversation or create a topic, we should never reason, as it would take far too long.
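
The toggle behavior described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `SamplingOptions`, `RequestKind`, and `samplingFor` are hypothetical names, and the numeric values are the ones the linked Qwen3 best-practices section suggests (0.6 / 0.95 for thinking mode, 0.7 / 0.8 for non-thinking mode). How these options are actually passed to WebLLM is left out.

```typescript
// Hypothetical helper for illustration; not the actual implementation in this PR.
interface SamplingOptions {
  temperature: number;
  top_p: number;
  enable_thinking: boolean;
}

// Assumed request kinds: a normal chat turn, a summary, or a topic title.
type RequestKind = "chat" | "summarize" | "topic";

function samplingFor(kind: RequestKind, enableThinking: boolean): SamplingOptions {
  // Summaries and topic titles never reason, regardless of the user's toggle.
  const thinking = kind === "chat" && enableThinking;
  return thinking
    ? { temperature: 0.6, top_p: 0.95, enable_thinking: true }
    : { temperature: 0.7, top_p: 0.8, enable_thinking: false };
}
```

Keeping the decision in one function means the summarize and topic paths cannot accidentally inherit the user's thinking setting.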
In addition, for multi-round chat, we remove the `<think>thinking tokens here</think>\n\n` prefix from each assistant message in the history.
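
A minimal sketch of that history cleanup, assuming messages are plain `{ role, content }` objects and that the thinking block always appears at the start of the assistant's reply. `ChatMessage`, `THINK_BLOCK`, and `stripThinking` are hypothetical names, not identifiers from this PR.

```typescript
// Hypothetical message shape for illustration.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Matches a leading <think>...</think> block and the whitespace that follows it.
const THINK_BLOCK = /^\s*<think>[\s\S]*?<\/think>\s*/;

function stripThinking(history: ChatMessage[]): ChatMessage[] {
  // Only assistant messages are rewritten; other roles pass through unchanged.
  return history.map((m) =>
    m.role === "assistant"
      ? { ...m, content: m.content.replace(THINK_BLOCK, "") }
      : m,
  );
}
```

The function returns new message objects rather than mutating the stored history, so the UI could still display the original reasoning while only the cleaned copy is sent back to the model.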