Substring model search silently returns the wrong size (ask 7B, get 1.7B) #107

@samarthpatel24

Description

When I ask plan, snippet, or run for a model by size, I sometimes get back a model of a completely different size.

For example, whichllm snippet "qwen 7b gguf" gave me a snippet for Qwen3-1.7B-GGUF — a 1.7B model — even though I clearly asked for a 7B. Same thing with plan "gemma 2b", which planned hardware for the 12B gemma-3-12b-it.

Digging in, the matcher in _search_model (src/whichllm/cli.py) checks each query word as a plain substring of the model ID:

matches = [m for m in models if all(t in m.id.lower() for t in terms)]

The problem is the size token "7b" is a substring of "1.7b", "27b", "17b", etc. So a search for qwen 7b also matches Qwen3-1.7B, Qwen3.6-27B, and so on. The tool then just sorts those matches by download count and picks the top one — so whether you get the right size comes down to which repo happens to be most popular, not what you typed.
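A minimal sketch of that behavior, for anyone who wants to see it without hitting HuggingFace. The `Model` class, the `search_substring` helper, the `example/` repo IDs, and the download counts are all made up for illustration; only the list comprehension mirrors the line quoted from _search_model.

```python
from dataclasses import dataclass


@dataclass
class Model:
    id: str
    downloads: int


# Hypothetical catalogue: IDs and download counts are invented for this demo.
models = [
    Model("example/Qwen-7B-GGUF", 100),
    Model("example/Qwen3-1.7B-GGUF", 900),
    Model("example/Qwen3.6-27B-GGUF", 500),
]


def search_substring(query, models):
    """Plain substring matching, then sort by popularity (the reported behavior)."""
    terms = query.lower().split()
    matches = [m for m in models if all(t in m.id.lower() for t in terms)]
    return sorted(matches, key=lambda m: m.downloads, reverse=True)


result = search_substring("qwen 7b gguf", models)
for m in result:
    print(m.id, m.downloads)
```

All three repos match the query, and the one listed first is simply the one with the most downloads, which here is the 1.7B model.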

It bites three commands since they all go through _search_model: plan, snippet, and run (with an explicit model name). The main ranking command is fine because it doesn't take a name query.

Expected: a size like 7b should only match actual ~7B models, not 1.7B or 27B.
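One possible way to get that behavior, as a rough sketch rather than a proposed patch: treat query terms that look like a size (digits, optional decimal part, then "b") as whole tokens, and keep plain substring matching for everything else. The names `SIZE_TOKEN`, `term_matches`, and `matches_query` are hypothetical and do not exist in whichllm, and the `example/gemma-2b-it` ID is invented.

```python
import re

# A term counts as a "size token" if it looks like 7b, 1.7b, 27b, ...
SIZE_TOKEN = re.compile(r"\d+(?:\.\d+)?b")


def term_matches(term, model_id):
    """Match size tokens on token boundaries; other terms as plain substrings."""
    term = term.lower()
    model_id = model_id.lower()
    if SIZE_TOKEN.fullmatch(term):
        # Reject matches preceded by a digit, dot, or letter (1.7b, 27b, a3b)
        # or followed by another letter or digit.
        pattern = r"(?<![a-z0-9.])" + re.escape(term) + r"(?![a-z0-9])"
        return re.search(pattern, model_id) is not None
    return term in model_id


def matches_query(query, model_id):
    return all(term_matches(t, model_id) for t in query.split())


print(matches_query("qwen 7b gguf", "MaziyarPanahi/Qwen3-1.7B-GGUF"))
print(matches_query("gemma 2b", "google/gemma-3-12b-it"))
print(matches_query("qwen 3b gguf", "Qwen3-30B-A3B-GGUF"))
print(matches_query("gemma 2b", "example/gemma-2b-it"))
```

With these rules the three mismatches from this report no longer match, while an ID that really contains the requested size still does. The trade-off is that an ID with no separator before the size (something like "qwen7b") would also be rejected, so the exact boundary rule is a judgment call for the maintainers.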

Steps to Reproduce

  1. Run uvx whichllm@latest snippet "qwen 7b gguf"
    → resolves to MaziyarPanahi/Qwen3-1.7B-GGUF (1.7B), not a 7B model.
  2. Run uvx whichllm@latest plan "gemma 2b"
    → resolves to google/gemma-3-12b-it (12B), because "2b" is inside "12b".
  3. Run uvx whichllm@latest snippet "qwen 3b gguf"
    → resolves to Qwen3-30B-A3B-GGUF (30B), because "3b" is inside "A3B".

Minimal proof of the root cause (no models needed):

python3 -c "print('7b' in 'qwen3-1.7b', '2b' in 'gemma-3-12b')"
# True True   <- both are spurious matches

Note: exact results track live HuggingFace data, but the substring mismatch (ask small, get large / ask large, get small) is consistently reproducible.

Hardware Info

GPU 0: NVIDIA GeForce RTX 4070 Laptop GPU — 8.0 GB (CC 8.9, CUDA 13.3) — BW: 256 GB/s
GPU 1: Raptor Lake-S UHD Graphics — shared memory — BW: N/A
CPU: Intel(R) Core(TM) i7-14650HX — 16 cores (AVX2)
RAM: 15.3 GB
Disk free: 353.6 GB
OS: linux

Python Version

3.12.13

Operating System

Arch Linux

whichllm Version

0.5.8

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests