Feature Request: Intel NPU acceleration via OpenVINO #103

@Coffeebeans6932

Description

What

Enable Intel NPU hardware acceleration for local Whisper inference by building whisper.cpp with -DGGML_OPENVINO=ON.

Why

Modern Intel CPUs (Core Ultra series) ship dedicated NPUs rated at up to 48 TOPS that currently sit idle during inference. Offloading the Whisper encoder to the NPU would mean faster inference, lower CPU usage, and better battery life, all of which matter for a local-first dictation app.

How

whisper.cpp already supports OpenVINO as a backend ([docs](https://github.com/ggml-org/whisper.cpp#openvino)). The main work would be:

  1. Build the whisper.cpp addon with GGML_OPENVINO=ON for the Windows x64 target
  2. Auto-detect NPU availability at runtime and fall back to CPU if unavailable
  3. Generate/cache the OpenVINO encoder model on first launch (or bundle pre-converted models)
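For step 2, the detection-and-fallback logic could look roughly like the sketch below. This assumes the OpenVINO runtime's Python API (`openvino.Core().available_devices`, which reports strings such as `"CPU"`, `"GPU.0"`, `"NPU"`); the `select_device` helper is hypothetical, not part of whisper.cpp:

```python
def select_device(available_devices):
    """Pick the OpenVINO device for the Whisper encoder.

    `available_devices` is the list reported by the runtime,
    e.g. openvino.Core().available_devices -> ["CPU", "GPU.0", "NPU"].
    Prefer the NPU when present; otherwise fall back to CPU.
    """
    for device in available_devices:
        if device.startswith("NPU"):
            return device
    return "CPU"


if __name__ == "__main__":
    try:
        # Requires the OpenVINO runtime package to be installed.
        from openvino import Core
        devices = Core().available_devices
    except ImportError:
        # Runtime not installed: behave as if only the CPU exists.
        devices = ["CPU"]
    print(f"Whisper encoder device: {select_device(devices)}")
```

In the actual addon the chosen device string would be handed to whisper.cpp's OpenVINO encoder initialization; the key point is that a missing NPU (or missing runtime) degrades silently to the current CPU path rather than failing.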

Environment

  • Intel Core Ultra (Lunar Lake) with NPU, 32 GB RAM, Windows 11
  • Amical v1.0.0

Metadata

Labels: enhancement (New feature or request)