Agent command outputs are one of the biggest sources of token waste.
Logs, test results, stack traces… thousands of tokens sent to an LLM just to answer a simple question.
🔥 distill compresses command outputs into only what the LLM actually needs.
Save up to 99% of tokens without losing the signal.
Install:
npm i -g @samuelfaj/distillRun onboarding:
distillAnd that's it!
Distill uses it's own Expert Language Model
https://huggingface.co/samuelfaj/distill-1.7B-MLX
Only: 1.7B - 4bit
Safe recommendation: machine should have 8 GB+ RAM; 16 GB+ is comfortable.
rg -n "terminal|PERMISSION|permission|Permissions|Plan|full access|default" desktop --glob '!**/node_modules/**' | distill "find where terminal and permission UI are implemented in chat screen"- Before: 7648 tokens 30592 characters 10218 words
- After: 99 tokens 396 characters 57 words
- 🔥 Saved ~98.7% tokens
We also teach your LLM to talk and think a more efficient way.
