Hi there, I just noticed that the current base llama.cpp image is about 62 builds behind the current release. The image reports version: 9159 (5c0e94683), while the latest llama.cpp is version: 9221.
About 2 days ago, MTP support was merged, which improves token generation (TG) speed for models with MTP layers, e.g. Qwen 3.6.
Problem?
In docker/build-container.sh, fetch_llama_tag() goes through GitHub Package versions for ggml-org/llama.cpp and then does a plain string sort on tags before taking head -n1.
This may be where the issue comes from: a plain string sort combined with the API's version pagination order does not guarantee picking the numerically newest build (tags are published in the format bNNNN), so an older build can be selected even when newer ones exist.
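To illustrate the string sort half of this, here is a small sketch with a made-up tag list (not output from the real API). It assumes the script sorts in descending order before taking head -n1, which I have not confirmed. A lexical sort compares character by character, so tags with fewer digits can outrank tags with more:

```shell
# Hypothetical tag list; the real one comes from the GitHub Packages API.
tags='b999
b1000
b9159
b9221
b10000'

# Plain descending string sort (assumed behaviour), then take the first line.
lexical_newest=$(printf '%s\n' "$tags" | LC_ALL=C sort -r | head -n1)

# Numeric sort on the digits after the leading "b", then take the largest.
numeric_newest="b$(printf '%s\n' "$tags" | sed 's/^b//' | sort -n | tail -n1)"

echo "lexical: $lexical_newest"   # prints "lexical: b999"
echo "numeric: $numeric_newest"   # prints "numeric: b10000"
```

With tags that all have four digits the two sorts agree, so for the current 9159 vs 9221 gap the pagination order may be the more likely cause.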
Proposed fix
Make fetch_llama_tag() deterministically choose the newest build by:
extracting the numeric build suffix from matching tags (bNNNN), and selecting the max numerically, or
selecting based on package version created_at / updated_at for the matching tag, rather than lexicographic tag sort.
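A minimal sketch of the first option, assuming the candidate tags are already available one per line after all pages have been fetched. The helper name pick_newest_build_tag is hypothetical and is not taken from docker/build-container.sh:

```shell
# Hypothetical helper: reads candidate tags on stdin, one per line, and
# prints the bNNNN tag with the highest numeric build number.
# Prints nothing if no tag matches.
pick_newest_build_tag() {
    grep -E '^b[0-9]+$' |   # keep only tags of the form b<digits>
        sed 's/^b//' |      # strip the prefix so only the number remains
        sort -n |           # numeric sort, ascending
        tail -n1 |          # largest build number
        sed 's/^/b/'        # restore the "b" prefix
}

# Usage with a made-up list of tags (non-matching tags are ignored):
newest=$(printf '%s\n' b9159 latest b9221 b10000 server-b9000 | pick_newest_build_tag)
echo "$newest"   # prints "b10000"
```

This only helps if every page of versions is collected before picking; if the script stops at the first page, the newest tag may never be in the input.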