@ikawrakow
Owner

ikawrakow commented Dec 4, 2025

According to PR 17744 in mainline, this change should be enough to make Mistral3-large work with context > 16k tokens.

I cannot test this myself, so hopefully someone will try it.

@sayap
Contributor

sayap commented Dec 7, 2025

Not sure if I did something wrong, but with a 1.713 bpw quant, I can get past 20k context length with or without the PR commit.

@sayap
Contributor

sayap commented Dec 12, 2025

PR 17945 is definitely needed for Devstral Small 2. It noticeably improves the response quality.

@sayap
Contributor

sayap commented Dec 12, 2025

Also it looks like mistral 3 large Devstral 2 123B is now broken in mainline after PR 17945

@ikawrakow
Owner Author

> Also it looks like mistral 3 large Devstral 2 123B is now broken in mainline after PR 17945

Yes, looking at 17945, what are the odds that it works correctly for all models?

If you think that the effect of 17945 on devstral2 small is positive, then I could look into adding the change for that model.

@sayap
Contributor

sayap commented Dec 14, 2025

Looks like further fixes are coming in PR 18006. Anyway, looking at the gguf metadata, I don't think convert_hf_to_gguf.py works correctly for Devstral 2 123B yet, so more changes will be needed.

While Devstral Small 2 seems to work well now in my eval, there is a report (17980) that current mainline still diverges from the reference implementation. So it is probably better to wait 1 to 2 weeks until things settle down.

@ineersa

ineersa commented Dec 20, 2025

> If you think that the effect of 17945 on devstral2 small is positive, then I could look into adding the change for that model.

I was running the new GGUFs with an updated llama.cpp, and it's way better now: no loops, no tool-call hallucinations, and no gibberish output as there was before. It works pretty well.
The new GGUFs don't work properly with ik_llama.cpp, which I guess is expected, so I'm waiting for these changes.
