Commit 52c8e64

Refactored local-ai-use skill to be modular (#48)
Co-authored-by: Daniel Holanda <holand.daniel@gmail.com>
1 parent 45ecc14 commit 52c8e64

4 files changed

Lines changed: 321 additions & 207 deletions

.github/skillspector-allow.yml

Lines changed: 12 additions & 0 deletions
@@ -101,6 +101,18 @@ suppressions:
101101
argparse defaults / explicit --image-model/--tts-model/--stt-model flags,
102102
not from LLM or model output. Nothing here consumes unvalidated model
103103
output, so there is no injection sink to sanitize.
104+
- skill: local-ai-use
105+
rule: TM2
106+
file: SKILL.md
107+
match: Chaining Abuse
108+
reason: >-
109+
False positive. Line 103 is the documented Ubuntu/Debian install
110+
one-liner `sudo add-apt-repository -y ppa:lemonade-team/stable &&
111+
sudo apt-get update && sudo apt-get install -y lemonade-server
112+
lemonade-desktop`. The `&&` chaining is the standard apt install
113+
sequence (add PPA, refresh index, install package), not tool/command
114+
chaining of untrusted or model-derived steps. No LLM output feeds the
115+
chain and each command is a fixed, reviewable install step.
104116
- skill: local-ai-use
105117
rule: P2
106118
file: templates/local-ai-rule.md

eval/behavioral/tests/test_local_ai_use.py

Lines changed: 2 additions & 5 deletions
@@ -29,13 +29,10 @@ def test_generate_image_of_a_cat():
2929
run.workspace_contains("out.png")
3030

3131
# Positive behavioral expectations
32+
run.should("Install Lemonade Server if it is not already installed")
3233
run.should("Download the SD-Turbo model if the model is not already downloaded")
3334
run.should("Add a 'Local AI Use' block to AGENTS.md")
3435

3536
# Negative behavioral expectations
36-
run.should_not("Use the GenerateImage tool")
37-
run.should_not("Use a cloud image API")
37+
run.should_not("Pull unrelated modalities for this image generation task")
3838
run.should_not("Reach for a cloud image path instead of local Lemonade")
39-
40-
# Skipped behavioral expectations
41-
#run.should_not("Pull unrelated modalities for an image-only task")

skills/local-ai-use/SKILL.md

Lines changed: 69 additions & 98 deletions
@@ -21,18 +21,28 @@ needs image generation, text-to-speech, or speech-to-text uses the local
2121
agent's own LLM keeps handling text; only the expensive multimodal calls move
2222
on-device.
2323

24-
The skill does two things:
25-
26-
1. **Verifies that local Lemonade is reachable and has the right models.**
27-
2. **Drops a `Local AI Use` block into the workspace `AGENTS.md`** so the agent
24+
The skill does three things:
25+
26+
1. **Makes sure local Lemonade is installed and running.** If the `lemonade`
27+
CLI is missing, the setup script installs the **full version** of Lemonade
28+
(server + desktop app) on the user's behalf; if the server is installed but
29+
not running, it launches it.
30+
2. **Verifies that local Lemonade is reachable.**
31+
3. **Drops a `Local AI Use` block into the workspace `AGENTS.md`** so the agent
2832
reads the routing rule on every later turn, in Cursor, Claude Code, Codex,
2933
Gemini CLI, and any other agent that respects `AGENTS.md`.
3034

35+
Models are **not** downloaded during setup. Each default model is pulled
36+
lazily, on first use, by the routing rule (e.g. the first image request pulls
37+
the image model). This keeps setup fast and avoids gigabytes of downloads the
38+
user may never need.
39+
3140
## When to use this skill
3241

3342
Use this skill when **all** of the following are true:
3443

35-
- The user has, or is willing to install, the system-wide Lemonade Server.
44+
- The user wants local Lemonade. If it is not yet installed, the setup script
45+
installs the **full version** (server + desktop app) for them automatically.
3646
- The user accepts the default Lemonade endpoint `http://localhost:13305`.
3747
- The user wants the change to be **persistent** across future turns and
3848
agent restarts (the rule is written to disk).
@@ -44,102 +54,104 @@ instead.
4454
## Prerequisites
4555

4656
- **OS:** Windows 11 x64, Ubuntu/Debian x64, or macOS (beta).
47-
- **Lemonade Server CLI on `PATH`:** verify with `lemonade --version`. If
48-
missing, install from <https://lemonade-server.ai/install_options.html>
49-
before continuing. Do not silently install on the user's machine; that is a
50-
system-wide change and must be the user's call.
57+
- **Lemonade Server:** the setup script installs it if missing. It downloads
58+
and silently installs the **full version** (Windows `lemonade.msi`, the
59+
Ubuntu/Debian `ppa:lemonade-team/stable` PPA plus `lemonade-desktop`, or the
60+
macOS `.pkg`), then launches the server. On Linux/macOS this needs `sudo`.
61+
Pass `--no-install` if the user wants to install it themselves instead.
5162
- **Disk:** ~8 GB free for the three default models (SD-Turbo + Whisper-Tiny
52-
+ kokoro-v1).
53-
- **Network:** required for the first `lemonade pull` of each model. After
54-
that, every modality runs offline.
63+
+ kokoro-v1), plus ~0.1 GB for the installer itself.
64+
- **Network:** required for the install download and the first `lemonade pull`
65+
of each model. After that, every modality runs offline.
5566

5667
## The opinionated path
5768

5869
Run this checklist top to bottom. Track progress against it; do not move on
5970
until each step verifies.
6071

6172
```
62-
[ ] 1. Confirm Lemonade Server is installed and reachable
63-
[ ] 2. Pull the three default modality models
64-
[ ] 3. Install the routing rule into the workspace AGENTS.md
65-
[ ] 4. Smoke-test image, TTS, and STT against the local endpoint
73+
[ ] 1. Ensure Lemonade Server is installed and running (auto-install if missing)
74+
[ ] 2. Install the routing rule into the workspace AGENTS.md
6675
```
6776

68-
The single command that does steps 1, 2, and 3 in one shot is:
77+
The single command that does both steps in one shot is:
6978

7079
```bash
7180
python scripts/setup_local_ai.py
7281
```
7382

74-
The script is idempotent: re-running it on a
75-
fully configured workspace is a no-op apart from a healthcheck. Read the
76-
sections below for what to do when each step fails.
83+
It auto-installs the full version of Lemonade if the `lemonade` CLI is
84+
missing, launches the server if it is not running, then writes the rule. The
85+
script is idempotent: re-running it on a fully configured workspace is a no-op
86+
apart from a healthcheck. Read the sections below for what to do when each
87+
step fails.
7788

7889
---
7990

80-
## Step 1: confirm Lemonade Server is reachable
91+
## Step 1: ensure Lemonade Server is installed and running
8192

82-
Run:
93+
`scripts/setup_local_ai.py` handles this end to end, but here is what it does
94+
so you can do it by hand or debug it:
8395

84-
```bash
85-
lemonade status --json
86-
```
96+
**1a. Is the CLI installed?** Check whether `lemonade` is on `PATH`
97+
(`lemonade --version`). If it is not, install the **full version** on the
98+
user's behalf:
8799

88-
Two acceptable outcomes:
100+
| OS | Install the full version |
101+
|---|---|
102+
| Windows | Download `lemonade.msi` from the [latest release](https://github.com/lemonade-sdk/lemonade/releases/latest/download/lemonade.msi) and run `msiexec /i lemonade.msi /qn` (silent, per-user, no elevation). |
103+
| Ubuntu/Debian | `sudo add-apt-repository -y ppa:lemonade-team/stable && sudo apt-get update && sudo apt-get install -y lemonade-server lemonade-desktop` |
104+
| macOS (beta) | Download the `Lemonade-<ver>-Darwin.pkg` from the latest release and run `sudo installer -pkg Lemonade-<ver>-Darwin.pkg -target /`. |
105+
106+
The full installer bundles the server **and** the desktop app; the
107+
server-only minimal MSI and the legacy `lemonade-server` CLI are deprecated
108+
upstream. After a Windows install the CLI lands in
109+
`%LOCALAPPDATA%\lemonade_server` and is added to the *user* PATH (new shells
110+
only); the setup script probes that directory so it works in the same run.
111+
112+
**1b. Is the server running?** Check `lemonade status --json`.
89113

90114
| `lemonade status` says | Action |
91115
|---|---|
92116
| `Server is running on port 13305` | Continue to Step 2. |
93-
| `Server is not running` | Start it. On Windows, launch the **Lemonade** Start Menu shortcut. On Linux, run `sudo systemctl start lemonade-server`. Re-check `lemonade status`. |
117+
| `Server is not running` | Launch it with `lemonade serve` (the script does this in the background and polls `/api/v1/health` until it answers). |
94118

95-
If `lemonade` is not on `PATH` at all, the server is not installed. Stop and
96-
point the user at <https://lemonade-server.ai/install_options.html>. Do not
97-
attempt a silent install.
119+
Only if the automatic install genuinely fails (no `apt-get`, no `sudo`,
120+
download blocked) should you stop and point the user at
121+
<https://lemonade-server.ai/install_options.html>.
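
Reviewer sketch, not part of the commit: Step 1 boils down to "find the CLI, then poll the health endpoint". The following is a minimal Python illustration of that flow under stated assumptions. The function names, the `lemonade.exe` file name, and the retry counts are hypothetical; only the `%LOCALAPPDATA%\lemonade_server` directory, port 13305, and the `/api/v1/health` path come from the text above. The real `scripts/setup_local_ai.py` may be structured differently.

```python
import os
import shutil
import time
import urllib.error
import urllib.request
from pathlib import Path

# Health endpoint named in Step 1b of the skill.
HEALTH_URL = "http://localhost:13305/api/v1/health"


def find_lemonade_cli(env=None, which=shutil.which):
    """Return a path to the `lemonade` CLI, or None if it cannot be found."""
    env = os.environ if env is None else env
    found = which("lemonade")
    if found:
        return found
    # Windows fallback: a fresh install only updates PATH for new shells,
    # so look in the install directory directly.
    local_app_data = env.get("LOCALAPPDATA")
    if local_app_data:
        # Assumption: the executable is called lemonade.exe in this folder.
        candidate = Path(local_app_data) / "lemonade_server" / "lemonade.exe"
        if candidate.is_file():
            return str(candidate)
    return None


def http_probe(url=HEALTH_URL, timeout=2.0):
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_for_server(probe=http_probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Poll until the probe succeeds; give up after `attempts` tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The `which`, `probe`, and `sleep` parameters are injectable so the logic can be exercised without a real install or a running server.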
98122

99123
The rest of this skill assumes the endpoint is `http://localhost:13305/api/v1`
100124
and no API key is required (the system-wide server defaults to no auth on
101125
loopback). If the user has set `LEMONADE_API_KEY`, the routing rule template
102126
in `templates/local-ai-rule.md` shows where to add the `Authorization` header.
103127

104-
## Step 2: pull the three default modality models
128+
### Default modality models (pulled on first use, not during setup)
105129

106-
Pull these three. They are the **Lite Collection** defaults from Lemonade
107-
OmniRouter, sized to keep token-and-cost savings real on commodity hardware:
130+
Setup does **not** download these. The installed rule pulls each one the first
131+
time that modality is requested. They are the **Lite Collection** defaults from
132+
Lemonade OmniRouter, sized to keep token-and-cost savings real on commodity
133+
hardware:
108134

109135
| Modality | Model | Size | Why this default |
110136
|---|---|---|---|
111137
| Image generation | `SD-Turbo` | ~5 GB | Single-step generation, runs on CPU and AMD iGPU/dGPU |
112138
| Text-to-speech | `kokoro-v1` | ~0.3 GB | Only TTS model Lemonade currently supports; CPU-only, low latency |
113139
| Speech-to-text | `Whisper-Tiny` | ~0.1 GB | Smallest Whisper; fast on CPU. Upgrade to `Whisper-Large-v3-Turbo` if accuracy matters more than latency. |
114140

115-
```bash
116-
lemonade pull SD-Turbo
117-
lemonade pull kokoro-v1
118-
lemonade pull Whisper-Tiny
119-
```
120-
121-
To choose a different model while installing the rule, pass it to the setup
122-
script. For example, to make future image requests use SDXL:
141+
To write a different model ID into the rule, pass it to the setup script. For
142+
example, to make future image requests use SDXL-Turbo:
123143

124144
```bash
125145
python scripts/setup_local_ai.py --image-model SDXL-Turbo
126146
```
127147

128-
The script will pull the selected model and write that model ID into the
129-
installed `AGENTS.md` rule. The same pattern works for `--tts-model` and
130-
`--stt-model`.
131-
132-
Each `pull` is idempotent. To verify what is already downloaded:
133-
134-
```bash
135-
lemonade list --downloaded
136-
```
137-
138-
For coverage of larger / higher-quality alternatives (`SDXL-Turbo`,
139-
`Flux-2-Klein-4B`, `Whisper-Large-v3-Turbo`), see the
148+
That model ID is written into the installed `AGENTS.md` rule and pulled on its
149+
first use. The same pattern works for `--tts-model` and `--stt-model`. For
150+
larger / higher-quality alternatives (`SDXL-Turbo`, `Flux-2-Klein-4B`,
151+
`Whisper-Large-v3-Turbo`), see the
140152
[model picker in reference.md](reference.md#model-picker).
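
Reviewer sketch, not part of the commit: the "pulled on first use" behavior can be pictured as a pull guard in front of the first request for a modality. The Python below is an illustration only. The helper names and the per-process cache are hypothetical; the `lemonade pull` command, the endpoint, and the request payload mirror the Step 4 curl smoke test that this commit removes. The installed rule is read by the agent and need not be implemented this way.

```python
import base64
import json
import subprocess
import urllib.request

BASE_URL = "http://localhost:13305/api/v1"  # default endpoint from this skill
_pulled_this_session = set()  # hypothetical in-process cache


def ensure_model(model_id, run=subprocess.run):
    """Pull model_id once per process; return True if a pull was issued."""
    if model_id in _pulled_this_session:
        return False
    # `lemonade pull <model>` is the CLI command named in the skill.
    run(["lemonade", "pull", model_id], check=True)
    _pulled_this_session.add(model_id)
    return True


def build_image_request(prompt, model_id="SD-Turbo"):
    """Build the POST for /images/generations without sending it."""
    payload = {
        "model": model_id,
        "prompt": prompt,
        "size": "512x512",
        "steps": 4,
        "response_format": "b64_json",
    }
    return urllib.request.Request(
        f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(prompt, out_path="out.png", model_id="SD-Turbo"):
    """Pull on first use, then request an image and write it to out_path."""
    ensure_model(model_id)
    with urllib.request.urlopen(build_image_request(prompt, model_id)) as resp:
        body = json.load(resp)
    with open(out_path, "wb") as fh:
        fh.write(base64.b64decode(body["data"][0]["b64_json"]))
    return out_path
```

With an override such as `--image-model SDXL-Turbo`, the same guard would simply receive that model ID instead of the default.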
141153

142-
## Step 3: install the routing rule into AGENTS.md
154+
## Step 2: install the routing rule into AGENTS.md
143155

144156
The rule is a Markdown block stored in [`templates/local-ai-rule.md`](templates/local-ai-rule.md).
145157
Append it to the workspace's `AGENTS.md` (create the file if missing). Both
@@ -169,44 +181,6 @@ block to:
169181

170182
The rule's content is identical; only the file location changes.
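
Reviewer sketch, not part of the commit: Step 2 describes appending the rule block to `AGENTS.md` (creating the file if missing), and the setup script is described as idempotent. One way to get both properties is to key on the block identifier `amd-skills:local-ai-use` from the completion checklist. The function name and the exact way the identifier appears inside the block are assumptions here; the real script may detect the block differently.

```python
from pathlib import Path

# Block identifier named in the skill's completion checklist.
MARKER = "amd-skills:local-ai-use"


def install_rule(agents_md: Path, rule_text: str) -> bool:
    """Append rule_text to AGENTS.md unless the block is already present.

    Returns True if the file was written, False if this was a no-op.
    """
    if MARKER not in rule_text:
        raise ValueError("rule text does not carry the expected marker")
    existing = agents_md.read_text(encoding="utf-8") if agents_md.exists() else ""
    if MARKER in existing:
        return False  # already installed; re-running changes nothing
    if existing and not existing.endswith("\n"):
        existing += "\n"
    if existing:
        existing += "\n"  # blank line between prior content and the block
    agents_md.parent.mkdir(parents=True, exist_ok=True)
    agents_md.write_text(existing + rule_text.rstrip("\n") + "\n", encoding="utf-8")
    return True
```

Note that this simple check never rewrites an existing block, so changing a model ID with `--image-model` after the first install would need replace-in-place logic that the sketch omits.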
171183

172-
## Step 4: smoke-test the three modalities
173-
174-
Verify each modality against the live server before declaring success. These
175-
mirror the inline patterns in the installed rule, so a green pass here means
176-
the rule will work. If you installed with a model override such as
177-
`--image-model SDXL-Turbo`, use that model ID in the smoke test and confirm
178-
the installed `AGENTS.md` rule contains it.
179-
180-
**Image generation** (writes `out.png`):
181-
182-
```bash
183-
curl -sX POST http://localhost:13305/api/v1/images/generations \
184-
-H "Content-Type: application/json" \
185-
-d '{"model":"SD-Turbo","prompt":"a single red apple on a white table","size":"512x512","steps":4,"response_format":"b64_json"}' \
186-
| python -c "import sys,json,base64; open('out.png','wb').write(base64.b64decode(json.load(sys.stdin)['data'][0]['b64_json']))"
187-
```
188-
189-
**Text-to-speech** (writes `out.mp3`):
190-
191-
```bash
192-
curl -sX POST http://localhost:13305/api/v1/audio/speech \
193-
-H "Content-Type: application/json" \
194-
-d '{"model":"kokoro-v1","input":"Local AI is now active.","response_format":"mp3"}' \
195-
-o out.mp3
196-
```
197-
198-
**Speech-to-text** (round-trips `out.mp3` → text via a wav re-encode):
199-
200-
```bash
201-
ffmpeg -y -i out.mp3 -ar 16000 -ac 1 out.wav
202-
curl -sX POST http://localhost:13305/api/v1/audio/transcriptions \
203-
-F "file=@out.wav" -F "model=Whisper-Tiny"
204-
```
205-
206-
If any of the three returns a non-2xx status, fix it now. The rule we just
207-
installed sends future requests to these same endpoints, so a broken endpoint
208-
becomes a broken user experience.
209-
210184
---
211185

212186
## What changes after this skill runs
@@ -236,8 +210,8 @@ machine.
236210

237211
| Symptom | Cause | Recovery |
238212
|---|---|---|
239-
| `lemonade: command not found` | Server CLI not installed | Install from <https://lemonade-server.ai/install_options.html>; restart shell. |
240-
| `Server is not running` | Service stopped after install | Windows: launch the **Lemonade** Start Menu shortcut. Linux: `sudo systemctl start lemonade-server`. |
213+
| `lemonade: command not found` | CLI not installed | Re-run `python scripts/setup_local_ai.py` (auto-installs the full version). After a fresh Windows install, open a new shell so the user PATH refreshes; the setup script itself also probes `%LOCALAPPDATA%\lemonade_server`, so it works in the same run. |
214+
| `Server is not running` | Service stopped after install | Run `lemonade serve` (the setup script launches it for you). |
241215
| `POST /v1/images/generations` returns 404 model not found | Image model not downloaded | `lemonade pull SD-Turbo` and retry. |
242216
| Image generation is slow on CPU (~4–5 min) | sd-cpp on CPU backend | Install the GPU backend on supported AMD hardware: `lemonade backends install sd-cpp:rocm`. |
243217
| `POST /v1/audio/transcriptions` returns 400 unsupported format | Input is not 16 kHz mono WAV | Re-encode with `ffmpeg -i in.* -ar 16000 -ac 1 out.wav`. |
@@ -249,14 +223,11 @@ machine.
249223
Mark this skill complete only when **all** of the following are true:
250224

251225
- [ ] `lemonade status --json` reports the server running on port 13305.
252-
- [ ] `lemonade list --downloaded` shows `SD-Turbo`, `kokoro-v1`, and
253-
`Whisper-Tiny`.
254226
- [ ] The workspace `AGENTS.md` contains the
255227
`amd-skills:local-ai-use` block.
256-
- [ ] All three smoke tests in Step 4 succeed.
257228
- [ ] On a follow-up turn, asking the agent to "generate an image of X"
258229
causes it to POST to `http://localhost:13305/api/v1/images/generations`
259-
rather than calling a cloud tool.
230+
(pulling the model on first use) rather than calling a cloud tool.
260231

261232
If any box is unchecked, the user is still paying cloud cost for at least
262233
one modality.
