Conversation

suryabdev
Contributor

@suryabdev suryabdev commented Oct 13, 2025

Follow-up of #1801 and a fix for #1808.

The default model for InferenceClientModel is Qwen/Qwen2.5-Coder-32B-Instruct. It does not work because the current default provider doesn't support tool calling (more details in the issue).
This PR changes the default model to Qwen/Qwen3-Next-80B-A3B-Thinking

@suryabdev
Contributor Author

cc: @albertvillanova / @aymeric-roucher please review when you are free

@aymeric-roucher
Collaborator

@suryabdev Qwen3-Coder isn't that good in my experience: Qwen/Qwen3-Next-80B-A3B-Thinking worked better (but I only tried a few runs).

@suryabdev
Contributor Author

@aymeric-roucher I've only run some basic tests myself to check functionality; I haven't run any benchmarks like GAIA.
I've changed the model to Qwen/Qwen3-Next-80B-A3B-Thinking

@suryabdev
Contributor Author

suryabdev commented Oct 13, 2025

Just something to note: the price of the default model will increase, but all providers support tool calling (Doc). From

[image: provider pricing for the old default model]

to

[image: provider pricing for the new default model]

Member

@albertvillanova albertvillanova left a comment


To choose a new default model, I would recommend benchmarking the candidates and selecting based on their performance.

Another possibility would be to keep the current default model and just add the associated provider that was working until now.

@suryabdev
Contributor Author

suryabdev commented Oct 14, 2025

I would recommend benchmarking the candidates and selecting based on their performance

@albertvillanova That is fair, I haven't run the benchmarks before. Let me try to run them now
https://github.com/huggingface/smolagents/blob/main/examples/smolagents_benchmark/run.py

Another possibility would be to keep the current default model and just add the associated provider that was working until now.

I don't think we should change the default provider for the InferenceClientModel; that might impact situations where a user tries a different model and doesn't set the provider.

provider: str | None = None,

The InferenceClient has good auto-picking behavior to choose the cheapest provider.
Adding conditional logic that sets the provider only if the model is Qwen/Qwen2.5-Coder-32B-Instruct or None could work, but doesn't feel very clean to me.

@aymeric-roucher
Collaborator

aymeric-roucher commented Oct 14, 2025

I would be strongly in favor of updating the default model in InferenceClient, as everywhere else: the seed is random anyway, so there shouldn't be any conditional logic based on using Qwen/Qwen2.5-Coder-32B-Instruct rather than another model.
Plus, the Qwen3 series of models is just a net improvement over 2.5 in all aspects: latency, performance, price. So let's not lock ourselves into another model. And anyway, providers will probably end up discontinuing 2.5 before 2026, as new models keep rolling in.

@suryabdev
Contributor Author

suryabdev commented Oct 15, 2025

Ran the benchmark only for the CodeAgent. Qwen/Qwen3-Next-80B-A3B-Thinking is an improvement over Qwen/Qwen2.5-Coder-32B-Instruct.

[image: benchmark results]

Found some bugs while running the benchmark script so I raised a PR #1822.
I had some questions on running the benchmark for the ToolCallingAgent, which I mentioned in that PR.

@suryabdev
Contributor Author

suryabdev commented Oct 15, 2025

the seed is random anyway so there shouldn't be any conditional logic based on using Qwen/Qwen2.5-Coder-32B-Instruct rather than another model.

@aymeric-roucher sorry, could you elaborate? I didn't fully understand. Do you mean the InferenceClient randomly picks a provider?

@aymeric-roucher
Collaborator

@suryabdev I meant that changing the model should not break working pipelines for users (except if their pipeline has an assert check on the model_id), because there's no expectation of reproducibility anyway when using generation.
So tests asserting that my pipeline outputs exactly "Hi, I'm an assistant and the answer is A" won't exist (they would be broken by randomness), so we don't have to fear that updating the model could break working pipelines.

Collaborator

@aymeric-roucher aymeric-roucher left a comment


Thank you @suryabdev ! 😃 Only need to fix conflicts before going ahead!

@suryabdev
Contributor Author

@aymeric-roucher Thanks for the review! I resolved the merge conflicts.
You can trigger the PR checks when you are free.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@albertvillanova albertvillanova left a comment


Thanks! We should update the PR title and description accordingly.

  def __init__(
      self,
-     model_id: str = "Qwen/Qwen3-Next-80B-A3B-Instruct",
+     model_id: str = "Qwen/Qwen3-Next-80B-A3B-Thinking",
Member


Oh, it seems the default model was already changed before this PR! 😲

Contributor Author


It was changed in PR #1801 yesterday, but since there has been no release yet, no users will be impacted.

@suryabdev suryabdev changed the title Change default InferenceClient model to Qwen3-Coder-30B-A3B-Instruct Change default InferenceClient model to Qwen/Qwen3-Next-80B-A3B-Thinking Oct 16, 2025
@suryabdev
Contributor Author

We should update the PR title and description accordingly.

@albertvillanova thanks for the review, I've updated the PR title and description
Please merge when you are free

@albertvillanova albertvillanova merged commit 2de6550 into huggingface:main Oct 16, 2025
4 checks passed