Add Local AI Integration Skill #70

Merged
iswaryaalex merged 10 commits into main from iswarya/bringup-localai-skills
Jun 25, 2026

@iswaryaalex iswaryaalex commented Jun 19, 2026

Collaborator

Summary

This branch brings the local-ai-app-integration skill from an NPU-only prototype to a production-ready skill that works on any Windows x64
machine. The work spans the skill itself, its reference docs, the walkthrough, and a new behavioural test suite.

What changed and why

SKILL.md — major rewrite of the opinionated path
The original skill had two silent failure modes discovered during live execution:

  1. Download URL 404: The skill constructed the Lemonade download URL by inserting the git tag verbatim (e.g. v10.8.0), but asset filenames strip
    the leading v. Fixed by querying the GitHub API for the asset by name pattern instead of hand-building the URL.
  2. Blank transcription on first run: The skill told agents to POST /api/v1/load after the health check but skipped the model pull step. When the
    weights aren't on disk, lemond returns HTTP 200 with an empty result, which, without logging, looks identical to a broken integration. Fixed by making
    POST /api/v1/pull an explicit required step before first inference and by documenting that /load should not be called at startup (its
    request shape has changed across lemond versions).
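
For reviewers, the asset-lookup fix in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in SKILL.md: the function names, the glob pattern, and the asset filename in the usage note are hypothetical. Only the GitHub REST release endpoints and the `assets[].name` / `assets[].browser_download_url` fields are taken from the public API.

```python
import fnmatch
import json
import urllib.request


def fetch_release(repo: str, tag: str = "latest") -> dict:
    """Fetch release metadata from the GitHub REST API.

    `repo` is "owner/name"; which repository to query is left to the caller.
    """
    suffix = "latest" if tag == "latest" else f"tags/{tag}"
    url = f"https://api.github.com/repos/{repo}/releases/{suffix}"
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def pick_asset_url(release: dict, pattern: str) -> str:
    """Return the download URL of the first asset whose name matches `pattern`.

    Matching on the asset name avoids rebuilding the URL from the tag, so a
    tag such as "v10.8.0" and a filename without the leading "v" cannot drift.
    """
    for asset in release.get("assets", []):
        if fnmatch.fnmatchcase(asset["name"], pattern):
            return asset["browser_download_url"]
    raise LookupError(f"no release asset matches {pattern!r}")
```

A caller would pass the result of `fetch_release(...)` to `pick_asset_url(release, "lemonade-*-windows-x64.zip")`, where that pattern is only a placeholder for whatever naming the real assets use. Raising `LookupError` keeps a missing asset loud instead of producing a 404 later.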

Additional improvements:

  • Per-stage logging promoted from optional to required — the skill now mandates one log line at each lifecycle stage (spawn → health → backend
    install → model pull → first result) so silent failures are diagnosable
  • HTTP timeout raised to 120s (previously unspecified; most clients default to 30s, which is shorter than a first-run model load)
  • Dev-mode file watcher caveat added — Tauri/Electron/Vite watchers that pick up lemond's runtime writes restart the app and kill the subprocess
    mid-transcription
  • Port retry logic added to both reference launchers (Python and Node.js) — freePort() releases the socket before lemond binds, so another process
    can grab it in that window
  • API-key gate bypass expanded from one sentence to an explicit three-bullet contract, making it unambiguous for agents implementing the pattern
  • Verification checklist expanded from 6 to 11 items covering the new requirements
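
To make the port-retry bullet concrete, here is a rough Python sketch of the pattern. It is not the reference launcher itself: `start_with_port_retry`, the `start` callable, and the assumption that a lost port race surfaces as `OSError` are all hypothetical. `free_port` mirrors the `freePort()` helper named above only in spirit.

```python
import logging
import socket
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")
log = logging.getLogger("lemond")


def free_port() -> int:
    """Ask the OS for an unused TCP port, then release it.

    The socket is closed before the caller's subprocess binds, so another
    process can take the port in between. That race is why a retry is needed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def start_with_port_retry(
    start: Callable[[int], T], attempts: int = 5
) -> Tuple[int, T]:
    """Call `start(port)` with a fresh port until it succeeds.

    `start` is a hypothetical callable that spawns the server and raises
    OSError if the port was taken in the meantime.
    """
    last_error: Optional[OSError] = None
    for attempt in range(1, attempts + 1):
        port = free_port()
        try:
            handle = start(port)
        except OSError as exc:
            # One log line per failed attempt keeps the race diagnosable.
            log.warning("port %d unavailable (attempt %d/%d): %s",
                        port, attempt, attempts, exc)
            last_error = exc
            continue
        log.info("spawned on port %d", port)
        return port, handle
    raise RuntimeError(f"could not start after {attempts} attempts") from last_error
```

Picking a new port on every attempt, rather than retrying the same one, avoids waiting on whichever process won the race.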

reference.md — corrected STT backend matrix and pull vs. load clarification

  • Fixed the Windows STT table: whispercpp auto-probes NPU → iGPU → CPU with one model (Whisper-Large-v3-Turbo); flm is the Linux NPU path only.
    The old table had these inverted and implied agents needed to manually pick between them.
  • Added "catalogued ≠ downloaded" callout: a model listed by GET /api/v1/models is available to use but not necessarily present on disk. A successful pull
    is the only reliable signal.
  • Added whispercpp NPU-first decision rules (mirroring the existing llamacpp ones) so agents have a concrete probe → install → use sequence for
    speech-to-text.
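
The pull-before-inference rule can be illustrated with a short sketch. Treat it as an assumption-laden example rather than the skill's code: the `model_name` request field, the helper names, and the injectable `post` callable are hypothetical, and the real request shape should be checked against the lemond version in use. The `/api/v1/pull` path and the `[lemond] Model ... ready` log line come from this PR.

```python
import json
import urllib.request
from typing import Callable, Tuple


def http_post_json(url: str, payload: dict, timeout: float) -> Tuple[int, dict]:
    """POST a JSON body and return (status, parsed JSON or {})."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    # urllib raises HTTPError for non-2xx responses, so reaching the return
    # statement already implies a successful status with this default client.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
        return resp.status, (json.loads(body) if body else {})


def ensure_model_pulled(
    base_url: str,
    model: str,
    post: Callable[[str, dict, float], Tuple[int, dict]] = http_post_json,
    timeout: float = 120.0,  # mirrors the 120s timeout above; large downloads may need more
) -> str:
    """Pull `model` and log readiness; never infer presence from the catalogue."""
    # Assumption: the pull endpoint accepts {"model_name": ...}.
    status, _ = post(f"{base_url}/api/v1/pull", {"model_name": model}, timeout)
    if status != 200:
        raise RuntimeError(f"pull of {model} failed with HTTP {status}")
    line = f"[lemond] Model {model} ready"
    print(line)
    return line
```

Calling this once before the first inference request, and skipping any catalogue check, matches the rule that only a successful pull proves the weights are on disk.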

walkthroughs/local-ai-app-integration.md — broadened hardware requirement

Removed the NPU-only prerequisite. The walkthrough now works on any Windows x64 PC, with a hardware priority table showing what each tier gets
(NPU → iGPU → CPU). All steps are identical regardless of hardware. Also replaced the vague "clone and move files" install instructions with
copy-pasteable bash and PowerShell commands.

tests/test_local_ai_app_integration.py — new behavioural test suite

Added 4 behavioural tests that run the skill end-to-end against the dictate app and assert:

  • The vendor/lemonade/ full package is present after execution
  • The [lemond] Healthy log line appears in terminal output
  • The [lemond] Model ... ready log line appears (confirming the pull step ran)
  • The app starts without prompting for a cloud API key in local mode

Plus sanity checks on skill structure (SKILL.md present, frontmatter valid, checklist items present).
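
As an illustration of the log-line assertions, a check along these lines would do. It is a sketch only: the helper name and the regular expressions are hypothetical and are not copied from tests/test_local_ai_app_integration.py. The two log prefixes are the ones listed above.

```python
import re
from typing import List

# Patterns for the two lifecycle lines the behavioural tests look for.
# They are unanchored so a timestamp or log-level prefix does not break them.
REQUIRED_LINES = {
    "health": re.compile(r"\[lemond\] Healthy"),
    "model pull": re.compile(r"\[lemond\] Model \S.* ready"),
}


def missing_stages(terminal_output: str) -> List[str]:
    """Return the names of lifecycle stages with no matching log line."""
    return [
        stage
        for stage, pattern in REQUIRED_LINES.items()
        if not pattern.search(terminal_output)
    ]
```

A test could then assert `missing_stages(captured_output) == []`, which on failure reports exactly which stage went silent.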

@iswaryaalex iswaryaalex added the run_behavioral Run behavioral tests on PR label Jun 19, 2026
@iswaryaalex iswaryaalex marked this pull request as draft June 20, 2026 17:37
@iswaryaalex iswaryaalex marked this pull request as ready for review June 21, 2026 18:08
Comment thread skills/local-ai-app-integration/SKILL.md Outdated
Comment thread skills/local-ai-app-integration/SKILL.md Outdated
Comment thread skills/local-ai-app-integration/SKILL.md
Comment thread walkthroughs/local-ai-app-integration.md Outdated
Comment thread walkthroughs/local-ai-app-integration.md
Comment thread walkthroughs/local-ai-app-integration.md Outdated
Co-authored-by: Daniel Holanda <holand.daniel@gmail.com>
@iswaryaalex iswaryaalex merged commit 0718ef0 into main Jun 25, 2026
15 checks passed

Labels

run_behavioral Run behavioral tests on PR

2 participants