# Foundry Local — AI Coding Assistant Context

Foundry Local is an on-device AI inference runtime. It provides:

- **Chat completions** (text generation) via native SDK or OpenAI-compatible REST API
- **Audio transcription** (speech-to-text via Whisper) via native SDK
- **Automatic hardware acceleration** — NPU > GPU > CPU, zero detection code needed

## SDK Quick Reference

### JavaScript (`foundry-local-sdk` on npm)

```js
import { FoundryLocalManager } from 'foundry-local-sdk';
const manager = FoundryLocalManager.create({ appName: 'foundry_local_samples' });

// Chat
const chatModel = await manager.catalog.getModel('qwen2.5-0.5b');
await chatModel.download();
await chatModel.load();
const chatClient = chatModel.createChatClient();
const response = await chatClient.completeChat([
  { role: 'user', content: 'Hello' }
]);

// Audio transcription
const whisperModel = await manager.catalog.getModel('whisper-tiny');
await whisperModel.download();
await whisperModel.load();
const audioClient = whisperModel.createAudioClient();
const result = await audioClient.transcribe('recording.wav');
```

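The get → download → load sequence is the same for every model, so it can be wrapped in a small helper. This is a sketch using only the calls shown in this document; `getReadyModel` is a name invented here, not part of the SDK.

```js
// Hypothetical helper (not an SDK API): resolve an alias and make the model
// ready for inference. Assumes download() is safe to call when the model is
// already cached locally.
async function getReadyModel(manager, alias) {
  const model = await manager.catalog.getModel(alias); // alias picks the best variant
  await model.download(); // fetch model weights if not present (assumption)
  await model.load();     // load into memory for inference
  return model;
}
```

With this helper, the chat setup above collapses to `const chatModel = await getReadyModel(manager, 'qwen2.5-0.5b');`.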
### C# (.NET — `Microsoft.AI.Foundry.Local` NuGet)

```csharp
using Microsoft.AI.Foundry.Local;

var config = new Configuration { AppName = "foundry_local_samples" };
await FoundryLocalManager.CreateAsync(config);
var mgr = FoundryLocalManager.Instance;
var catalog = await mgr.GetCatalogAsync();

// Chat
var chatModel = await catalog.GetModelAsync("qwen2.5-0.5b")
    ?? throw new Exception("Model not found");
await chatModel.DownloadAsync();
await chatModel.LoadAsync();
var chatClient = await chatModel.GetChatClientAsync();

// Audio transcription
var whisperModel = await catalog.GetModelAsync("whisper-tiny")
    ?? throw new Exception("Model not found");
await whisperModel.DownloadAsync();
await whisperModel.LoadAsync();
var audioClient = await whisperModel.GetAudioClientAsync();
```

## Key Rules

- **Never hardcode ports.** The native SDK runs inference in-process — no port needed for chat or audio. If you use the optional REST web server, read the port from the manager after starting it (JS: `manager.urls` after `startWebService()`, C#: `FoundryLocalManager.Instance.Urls` after `StartWebServiceAsync()`).
- **Use model aliases**, not full model IDs. Aliases like `qwen2.5-0.5b` and `whisper-tiny` auto-select the best variant for the user's hardware.
- **One manager handles everything.** Don't create separate runtimes for chat and audio.
- **Do NOT use `whisper.cpp`, `llama.cpp`, `@huggingface/transformers`, or `ollama`** alongside Foundry Local — it handles all of these use cases.

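The "never hardcode ports" rule can be sketched as a tiny helper that derives the REST endpoint from the manager's reported URLs. The `/v1/chat/completions` path follows the OpenAI-compatible convention; the exact shape of `manager.urls` (assumed here to be an array of base URL strings) and the `chatCompletionsUrl` helper itself are assumptions for illustration, not SDK API.

```js
// Hypothetical helper: build the chat endpoint from the URLs the manager
// reports after the web service starts, instead of a hardcoded port.
function chatCompletionsUrl(urls) {
  const base = String(urls[0]).replace(/\/+$/, ''); // strip any trailing slash
  return `${base}/v1/chat/completions`;             // OpenAI-compatible route
}

// Usage (sketch, assuming the method names cited in the rule above):
// await manager.startWebService();
// const res = await fetch(chatCompletionsUrl(manager.urls), {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({
//     model: 'qwen2.5-0.5b',
//     messages: [{ role: 'user', content: 'Hello' }],
//   }),
// });
```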
## Model Aliases

| Task | Aliases |
|------|---------|
| Chat | `phi-3.5-mini`, `phi-4-mini`, `qwen2.5-0.5b`, `qwen2.5-coder-0.5b` |
| Audio Transcription | `whisper-tiny`, `whisper-base`, `whisper-small` |