The goal of this skill is to teach your AI agent to add a local AI mode to an existing app that today only talks to cloud AI APIs.
For this walkthrough we use danielholanda/dictate,
a Windows dictation app that currently sends every recording to cloud
speech-to-text providers (Groq, Deepgram, Cartesia, Gemini, Mistral, etc.).
The sample app used here requires the Rust toolchain (install from https://rustup.rs/) and Node.js for the npm commands used later.
Because this walkthrough runs transcription on the NPU, you need a Ryzen AI PC with an XDNA2 NPU (Strix, Strix Halo, Kraken, or Gorgon Point) running Windows.
- Clone the cloud-only app you want to upgrade:
git clone https://github.com/danielholanda/dictate.git
cd dictate
- Check which skills Claude can currently see:
claude "Which skills can you see?" --model opus
The list should not include anything related to local AI app integration.
In the future, the skill will be available directly through Claude's marketplace. For now, it has to be added manually:
- Clone https://github.com/amd/skills
- Move the local-ai-app-integration skill from that repo to .claude/skills/
- Check the visible skills again:
claude "Which skills can you see?" --model opus
The list should now include local-ai-app-integration.
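The clone-and-move steps above can also be scripted. The following is a minimal sketch, not part of the skill itself. It assumes the skill lives in a directory named exactly local-ai-app-integration somewhere inside the cloned skills repo (the repo layout is not described here), and it copies rather than moves so the clone stays intact. The function name install_skill is hypothetical.

```python
import shutil
from pathlib import Path


def install_skill(skills_repo: Path, app_repo: Path, skill_name: str) -> Path:
    """Copy one skill folder from a cloned skills repo into app_repo/.claude/skills/."""
    # Assumption: the skill is a directory named exactly `skill_name`
    # somewhere inside the clone; the real repo layout may differ.
    matches = [p for p in skills_repo.rglob(skill_name) if p.is_dir()]
    if not matches:
        raise FileNotFoundError(f"no directory named {skill_name!r} under {skills_repo}")

    # If several directories match, prefer the one closest to the repo root.
    source = min(matches, key=lambda p: len(p.parts))

    target = app_repo / ".claude" / "skills" / skill_name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target, dirs_exist_ok=True)  # requires Python 3.8+
    return target
```

For example, with both repos cloned side by side, install_skill(Path("skills"), Path("dictate"), "local-ai-app-integration") would place the skill under dictate/.claude/skills/.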
Run claude --model opus inside the dictate repo and enter the prompt:
This app sends my dictation audio to cloud speech-to-text providers.
Add a local AI mode that runs transcription on my machine instead by default.
I want it to run using the NPU. Keep the cloud providers as an option and minimize code changes.
Claude should:
- Survey where the app calls its cloud transcription APIs.
- Pick a local speech-to-text model + backend (e.g. whisper-v3-turbo-FLM using the FLM NPU backend).
- Vendor the Embeddable Lemonade (lemond) binary into the app tree.
- Add a launcher that spawns lemond on a free port.
- Re-point the app's existing client at the local endpoint and wait for /v1/health.
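The launcher and health-check steps above follow a common pattern: ask the OS for a free port, start the server process, then poll the health endpoint before sending real requests. The sketch below illustrates that pattern in Python for brevity. It is not the code Claude generates, and the app itself is written in Rust. The function names are hypothetical, and the server command is passed in by the caller because the actual lemond arguments (including how the port is supplied) are not covered here. It also assumes the health endpoint answers with HTTP 200 once the server is ready.

```python
import socket
import subprocess
import time
import urllib.error
import urllib.request


def find_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def wait_for_health(base_url: str, timeout_s: float = 60.0, interval_s: float = 0.5) -> bool:
    """Poll GET {base_url}/v1/health until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/health", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet, or not ready
        time.sleep(interval_s)
    return False


def launch_local_server(command: list[str], port: int, timeout_s: float = 60.0) -> subprocess.Popen:
    """Start the server process and block until it reports healthy.

    `command` is a placeholder supplied by the caller; this sketch does not
    assume any particular lemond command-line flags.
    """
    proc = subprocess.Popen(command)
    if not wait_for_health(f"http://127.0.0.1:{port}", timeout_s):
        proc.terminate()  # avoid leaving an orphaned server behind
        proc.wait()
        raise RuntimeError("local server did not become healthy in time")
    return proc
```

There is a small window between find_free_port returning and the server binding the port in which another process could claim it, so a production launcher may want to retry on failure. The caller is also responsible for terminating the returned process when the app exits.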
Please note this may take several minutes as this app has a fairly large codebase.
Dictate is a Tauri (Rust + Node) app. From the repo root:
npm install
npm run tauri dev
Once the window opens, press the microphone button to speak, and confirm that transcription is now running through your local device instead of a cloud provider. The transcribed text should appear where your cursor was last located.
local-ai-app-integration works for any modality, not just speech-to-text. The
same pattern adds local chat, embeddings, image generation, or text-to-speech to
any app that already calls into the cloud. You can try using this skill to turn other cloud apps into local apps.
To see what the skill contributes, remove it from .claude/skills/ and rerun the experiment above. You should see high variance in execution time and token usage from run to run. Common issues without the skill include:
- The model produces a local implementation that does not use NPU acceleration as instructed.
- The model invents a brittle local server setup that does not handle health checks, API keys, or shutdown.
- The model touches many files instead of flipping a single base-URL/key config object.
- The model writes a knowledge article instead of actually integrating local AI into the app.
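To make "flipping a single base-URL/key config object" concrete, here is a minimal sketch of the idea in Python. It is not code from the Dictate repo. TranscriptionEndpoint, select_endpoint, the example cloud URL, and the placeholder key are all hypothetical, and the sketch assumes the local server accepts any API key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionEndpoint:
    """The only two values the existing client needs in order to switch backends."""
    base_url: str
    api_key: str


def select_endpoint(mode: str, local_port: int, cloud: TranscriptionEndpoint) -> TranscriptionEndpoint:
    """Return the endpoint the app's existing client should talk to."""
    if mode == "local":
        # Assumption: the local server does not validate the key,
        # so a placeholder value is enough.
        return TranscriptionEndpoint(f"http://127.0.0.1:{local_port}", "local")
    if mode == "cloud":
        return cloud  # cloud providers stay available, unchanged
    raise ValueError(f"unknown mode: {mode!r}")
```

Because the rest of the client only reads base_url and api_key, switching modes does not require changes to the request code, which is what keeps the diff small.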