This sample demonstrates how to use Foundry Local for speech-to-text (audio transcription) using the Whisper model — entirely on-device, with no cloud services required.
## What this sample shows

- Loading the `whisper-tiny` model via the Foundry Local SDK
- Transcribing an audio file (`.wav`, `.mp3`, etc.) to text
- Both standard and streaming transcription modes
- Automatic hardware acceleration (NPU > GPU > CPU)
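The request itself can be sketched with nothing but Node 18 built-ins, since Foundry Local exposes an OpenAI-compatible REST API. This is a sketch, not the sample's actual source: the base URL, port, and model id below are placeholders (in the real sample they come from the SDK at runtime), and `pickAudioFile` is a small helper invented here to choose an input file.

```javascript
import { readFileSync, readdirSync } from "node:fs";

// Pick the first supported audio file from a list of names.
// (Pure helper, invented for this sketch.)
function pickAudioFile(names) {
  const supported = [".wav", ".mp3", ".m4a", ".flac"];
  const match = names.find((n) =>
    supported.some((ext) => n.toLowerCase().endsWith(ext))
  );
  return match ?? null;
}

// POST the audio to an OpenAI-compatible /audio/transcriptions route.
// baseUrl and modelId are placeholders: in the real sample both come from
// the Foundry Local SDK after it has discovered and loaded the model.
async function transcribe(baseUrl, modelId, audioPath) {
  const form = new FormData(); // FormData, Blob, fetch are global in Node 18+
  form.append("model", modelId);
  form.append("file", new Blob([readFileSync(audioPath)]), audioPath);
  const res = await fetch(`${baseUrl}/audio/transcriptions`, {
    method: "POST",
    body: form,
  });
  if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);
  const { text } = await res.json();
  return text;
}

// Guarded so the file can be loaded without a running Foundry Local service.
if (process.argv.includes("--run")) {
  const file = pickAudioFile(readdirSync("."));
  if (!file) throw new Error("No audio file found in the project directory.");
  // The URL here is a placeholder; Foundry Local picks its port at runtime.
  transcribe("http://localhost:5273/v1", "whisper-tiny", file)
    .then((text) => console.log(text))
    .catch((err) => console.error(err));
}
```

In the sample itself the SDK supplies the endpoint and model id, so none of the placeholders need to be hard-coded.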
## Prerequisites

- Foundry Local installed on your machine
- Node.js 18+
## Get started

Install the Foundry Local SDK:

```bash
npm install foundry-local-sdk
```

Place an audio file (e.g., `recording.wav` or `recording.mp3`) in the project directory, then run:

```bash
node src/app.js
```

## How it works

The Foundry Local SDK handles everything:
- Model discovery — finds the best `whisper-tiny` variant for your hardware
- Model download — downloads the model if not already cached
- Model loading — loads the model into memory with optimized hardware acceleration
- Transcription — runs Whisper inference entirely on-device
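Condensed, the four steps above map onto a short script. This is a hedged sketch, not the sample's actual `src/app.js`: `FoundryLocalManager` with `init()`, `endpoint`, and `apiKey` follows the published `foundry-local-sdk` surface, but the `whisper-tiny` alias, the audio transcription route, and the `bestVariant` helper are assumptions made here. The npm packages are imported lazily so the file parses even without them installed.

```javascript
import { createReadStream } from "node:fs";

// Execution-provider preference illustrating "best variant" selection
// (NPU > GPU > CPU). Invented for this sketch; the SDK does this
// matching internally during model discovery.
function bestVariant(available) {
  for (const ep of ["NPU", "GPU", "CPU"]) {
    if (available.includes(ep)) return ep;
  }
  return null;
}

async function main() {
  // Lazy imports so this sketch parses without the packages installed.
  const { FoundryLocalManager } = await import("foundry-local-sdk");
  const { default: OpenAI } = await import("openai");

  // Discovery + download + load: init() resolves the alias to the variant
  // best suited to this machine, fetches it if uncached, and loads it.
  const manager = new FoundryLocalManager();
  const modelInfo = await manager.init("whisper-tiny"); // alias is an assumption

  // The local endpoint is OpenAI-compatible, so the stock OpenAI client works.
  const openai = new OpenAI({
    baseURL: manager.endpoint,
    apiKey: manager.apiKey,
  });

  // Transcription: assumes the Whisper model is served on the standard
  // audio transcription route.
  const result = await openai.audio.transcriptions.create({
    model: modelInfo.id,
    file: createReadStream("recording.wav"),
  });
  console.log(result.text);
}

// Guarded so the sketch can be loaded without a running service.
if (process.argv.includes("--run")) {
  main().catch((err) => console.error(err));
}
```

Because `init()` covers discovery, download, and loading in one call, the application code only has to name the model alias and hand the audio to the client.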
No need for `whisper.cpp`, `@huggingface/transformers`, or any other separate STT tool.