Testing Open Source LLMs small enough to run on an old laptop

What I did

Using a 2020 MacBook Pro with an M1 chip and 8 GB of RAM, I downloaded 6 popular open-source LLMs from the major labs via Ollama. I also used one larger model via OpenRouter as a control.

First testing: I prompted each model by hand, judging tone (verbosity vs. conciseness), consistency, refusal patterns, and speed. Similar prompts were used for each model, but because the models answered differently, the follow-up questions usually differed.

Second testing: I wrote a Python script that loops 4 prompts testing factual knowledge, logic, consistency, creativity, and tone (via token output) through Ollama and records the responses. Later I adjusted the script to call llama-3.1-70b via OpenRouter as a control.
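The scripted loop can be sketched roughly as follows, using Ollama's local HTTP API (`/api/generate` on port 11434). The prompt texts here are illustrative placeholders, not the ones in main.py:

```python
import json
import urllib.request

# Illustrative placeholder prompts -- the real ones live in main.py.
PROMPTS = {
    "factual": "What is the capital of Australia?",
    "logic": "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?",
    "consistency": "Answer all three: 1) Is 7 prime? 2) Is 7 odd? 3) Is 7 divisible by 2?",
    "creativity": "Write a two-line poem about rain.",
}

# The six local models tested in this repo.
MODELS = ["llama3.2:3b", "phi4-mini", "gemma3:4b", "qwen3:4b", "smollm2:1.7b", "mistral:7b"]

def build_request(model: str, prompt: str) -> bytes:
    """Encode one non-streaming request body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """Send a prompt to a locally running Ollama server and return the response text."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def run_all() -> dict:
    """Loop every prompt through every model and collect the responses."""
    results = {}
    for model in MODELS:
        for category, prompt in PROMPTS.items():
            results[(model, category)] = ask(model, prompt)
    return results

# With Ollama running and the models pulled: results = run_all()
```

This only requires the standard library; swapping in the official `ollama` Python package would also work.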

Key Findings

Models that held firm on consistency in conversation, where one prompt followed another in a social-pressure style (a human pushing back), then contradicted themselves when tested with the scripted prompts (three questions delivered at once).

All six small models failed the factual test. On the logic test, 2 passed, 2 were wrong (rejected the premise), and 2 were partial (hedged answers). 4/6 failed the written consistency test (three questions at once). The smallest model was the least verbose; the middle models (3b to 4b) and the 7b model were all similarly verbose; the reasoning model was by far the most verbose.

How to run it

Download Ollama, then run `ollama pull llama3.2:3b`; repeat for the other 5 local models.

Design some prompts for what you want to test. Run them, get a feel for the responses, and record your judgments.

Download uv and Python. Run `uv init name-of-directory-you-want`. Review main.py and edit the commented-out sections so that either only the llama models run or only OpenRouter runs. Open .env and add your OPENROUTER_API_KEY.
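For the OpenRouter control run, the call is a standard OpenAI-style chat completion against OpenRouter's endpoint, authorized with the key from .env. A minimal sketch, where the model slug and function names are assumed examples rather than what main.py necessarily uses:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_payload(prompt: str, model: str = "meta-llama/llama-3.1-70b-instruct") -> dict:
    """Build an OpenAI-style chat-completion payload (model slug is an assumed example)."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def ask_openrouter(prompt: str) -> str:
    """POST one prompt to OpenRouter, reading OPENROUTER_API_KEY from the environment."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, the same request shape works for any model slug OpenRouter hosts.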

Design prompts to run programmatically. Insert them in main.py or use the existing ones there. Run `uv run main.py`.

Models tested

- llama3.2:3b - Meta
- phi4-mini - Microsoft
- gemma3:4b - Google
- qwen3:4b - Alibaba
- smollm2:1.7b - Hugging Face
- mistral:7b - Mistral

Also tested a larger model, llama-3.1-70b, via OpenRouter as a control point.
