Substring model search silently returns the wrong size (ask 7B, get 1.7B) #107

### Description

When I pass a model name that includes a size to `plan`, `snippet`, or `run`, the command sometimes resolves to a model of a completely different size.

For example, `whichllm snippet "qwen 7b gguf"` gave me a snippet for `Qwen3-1.7B-GGUF`, a 1.7B model, even though I asked for 7B. Likewise, `plan "gemma 2b"` planned hardware for the 12B `gemma-3-12b-it`.

Digging in, the matcher in `_search_model` (`src/whichllm/cli.py`) checks each query word as a plain substring of the model ID:

```python
matches = [m for m in models if all(t in m.id.lower() for t in terms)]
```

The problem is that the size token `"7b"` is a substring of `"1.7b"`, `"27b"`, `"17b"`, etc. So a search for `qwen 7b` also matches `Qwen3-1.7B`, `Qwen3.6-27B`, and so on. The tool then sorts those matches by download count and picks the top one, so whether you get the right size depends on which repo happens to be most popular, not on what you typed.
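A self-contained sketch of that behaviour, for anyone who wants to see it without hitting HuggingFace. The `FakeModel` class, the model IDs, and the download counts are all made up for illustration, and splitting the query on whitespace is my assumption about how `terms` is built; only the list comprehension is taken from `_search_model`.

```python
from dataclasses import dataclass


@dataclass
class FakeModel:
    # Hypothetical stand-in for whatever model object whichllm uses.
    id: str
    downloads: int


# Invented IDs and download counts, chosen so the wrong size is most popular.
models = [
    FakeModel("example/Qwen3-1.7B-GGUF", 900),
    FakeModel("example/Qwen2.5-7B-Instruct-GGUF", 500),
    FakeModel("example/Qwen3-27B-GGUF", 300),
]


def substring_search(query, models):
    # Assumption: the query is lowercased and split on whitespace.
    terms = query.lower().split()
    # Same plain-substring check as in _search_model.
    matches = [m for m in models if all(t in m.id.lower() for t in terms)]
    # Most-downloaded match first, as described above.
    return sorted(matches, key=lambda m: m.downloads, reverse=True)


results = substring_search("qwen 7b gguf", models)
print([m.id for m in results])
```

All three fake models match the query, and the 1.7B one comes first purely because it has the highest download count.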

It bites three commands since they all go through `_search_model`: `plan`, `snippet`, and `run` (with an explicit model name). The main ranking command is fine because it doesn't take a name query.

**Expected:** a size like `7b` should only match actual ~7B models, not 1.7B or 27B.
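One possible way to get that behaviour, as a rough sketch rather than a proposed patch: treat terms that look like a size (`7b`, `1.7b`) specially and require that they are not glued to a preceding digit, dot, or letter, while leaving other terms as plain substrings. The names `SIZE_TOKEN`, `term_matches`, and `matches_query` are hypothetical and do not exist in whichllm.

```python
import re

# Assumption: a "size token" is digits, optional decimal part, then "b".
SIZE_TOKEN = re.compile(r"\d+(?:\.\d+)?b")


def term_matches(term: str, model_id: str) -> bool:
    term = term.lower()
    model_id = model_id.lower()
    if SIZE_TOKEN.fullmatch(term):
        # Reject matches preceded by a digit, letter, or dot ("1.7b", "12b",
        # "a3b") or followed by another letter or digit.
        pattern = r"(?<![0-9a-z.])" + re.escape(term) + r"(?![0-9a-z])"
        return re.search(pattern, model_id) is not None
    # Non-size terms keep the existing substring behaviour.
    return term in model_id


def matches_query(query: str, model_id: str) -> bool:
    return all(term_matches(t, model_id) for t in query.split())


print(matches_query("qwen 7b gguf", "MaziyarPanahi/Qwen3-1.7B-GGUF"))
print(matches_query("gemma 2b", "google/gemma-3-12b-it"))
print(matches_query("qwen 7b gguf", "example/Qwen2.5-7B-Instruct-GGUF"))
```

With this check the first two calls no longer match and the third (a made-up ID for a real 7B-style name) still does. It also rejects `3b` inside `A3B`, which may or may not be what maintainers want for MoE names.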

### Steps to Reproduce

1. Run `uvx whichllm@latest snippet "qwen 7b gguf"`
   → resolves to `MaziyarPanahi/Qwen3-1.7B-GGUF` (1.7B), not a 7B model.
2. Run `uvx whichllm@latest plan "gemma 2b"`
   → resolves to `google/gemma-3-12b-it` (12B), because `"2b"` is inside `"12b"`.
3. Run `uvx whichllm@latest snippet "qwen 3b gguf"`
   → resolves to `Qwen3-30B-A3B-GGUF` (30B), because `"3b"` is inside `"A3B"`.

Minimal proof of the root cause (no models needed):

```bash
python3 -c "print('7b' in 'qwen3-1.7b', '2b' in 'gemma-3-12b')"
# True True   <- both are spurious matches
```

> Note: exact results track live HuggingFace data, but the substring mismatch (ask small, get large / ask large, get small) is consistently reproducible.

### Hardware Info

```shell
GPU 0: NVIDIA GeForce RTX 4070 Laptop GPU — 8.0 GB (CC 8.9, CUDA 13.3) — BW: 256 GB/s
GPU 1: Raptor Lake-S UHD Graphics — shared memory — BW: N/A
CPU: Intel(R) Core(TM) i7-14650HX — 16 cores (AVX2)
RAM: 15.3 GB
Disk free: 353.6 GB
OS: linux
```

### Python Version

3.12.13

### Operating System

Arch Linux

### whichllm Version

0.5.8
