Change default InferenceClient model to Qwen/Qwen3-Next-80B-A3B-Thinking #1813
Conversation
cc: @albertvillanova / @aymeric-roucher please review when you are free
@suryabdev Qwen3-Coder isn't that good in my experience.
@aymeric-roucher I've only run some basic tests myself to check functionality. I haven't run any benchmarks like GAIA.
Just something to note: the price of the default model will increase, but all providers support tool calling (Doc).
To choose a new default model, I would recommend benchmarking the candidates and selecting based on their performance.
Another possibility would be to keep the current default model and just add the associated provider that was working until now.
@albertvillanova That is fair; I haven't run the benchmarks before. Let me try to run them now.
I don't think we should change the default provider (smolagents/src/smolagents/models.py, line 1416 at 8f4dc91). The InferenceClient has good auto-picking behavior to choose the cheapest provider. Adding conditional logic that sets the provider only if the model is Qwen/Qwen2.5-Coder-32B-Instruct or None could work, but doesn't feel very clean to me (see the sketch below).
I would be strongly in favor of updating the default model in InferenceClientModel.
Ran the benchmark only for the CodeAgent. Found some bugs while running the benchmark script, so I raised PR #1822.
@aymeric-roucher Sorry, could you elaborate? I didn't fully understand. Do you mean the InferenceClient randomly picks a provider?
@suryabdev I meant that changing the model should not break working pipelines for users (except if their pipeline has an …
Thank you @suryabdev! 😃 Only need to fix conflicts before going ahead!
@aymeric-roucher Thanks for the review! I resolved the merge conflicts.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks! We should update the PR title and description accordingly.
  def __init__(
      self,
-     model_id: str = "Qwen/Qwen3-Next-80B-A3B-Instruct",
+     model_id: str = "Qwen/Qwen3-Next-80B-A3B-Thinking",
Oh, it seems the default model was already changed before this PR! 😲
It was changed in PR #1801 yesterday, but since there has been no release yet, no users will be impacted.
@albertvillanova Thanks for the review, I've updated the PR title and description.
Follow-up of #1801 and a fix for #1808. The default model for InferenceClientModel is Qwen/Qwen2.5-Coder-32B-Instruct. It does not work because the current default provider doesn't support tool calling (more details in the issue). This PR changes the default model to Qwen/Qwen3-Next-80B-A3B-Thinking.