feat: add on-device LLM inference via LiteRT-LM #165

Open
utzcoz wants to merge 1 commit into aaif-goose:main from utzcoz:feat/on-device-model-support

utzcoz (Contributor) commented Apr 30, 2026

Adds an on-device model provider that runs Gemma 4 E2B/E4B locally using Google's LiteRT-LM (NPU/GPU accelerated). Users can download models from a built-in registry, manage them in settings, and select on-device inference from onboarding or the provider dropdown — no API key required.

Changes:

  • New ON_DEVICE_LITERT provider with LiteRTProviderHandler wired into Agent.callLlm and tool resolution
  • Model registry loaded from assets/models_litert.json; downloads go through Android DownloadManager into app-private storage
  • ModelManagementScreen for browsing, downloading, and deleting on-device models; integrated into onboarding LLM configuration
  • OnDeviceModelManager scans for downloaded models on app startup (via GoslingApplication) so saved model identifiers resolve correctly across restarts/reinstalls
  • On-device system prompt preserves installed apps, screen resolution, and user memories so models can use tools instead of guessing URLs
  • Settings screen gracefully handles empty on-device model lists with a "No models downloaded" placeholder
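
The startup scan and the empty-list placeholder described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `ModelEntry`, `scanDownloadedModels`, `resolveSavedModel`, and `modelDropdownLabel` are hypothetical names, the registry is assumed to be already parsed from assets/models_litert.json (its real schema is not shown here), and treating a size mismatch as an incomplete download is an assumption.

```kotlin
import java.io.File

// Hypothetical registry entry; the real fields in assets/models_litert.json may differ.
data class ModelEntry(val id: String, val fileName: String, val sizeBytes: Long)

// Returns the registry entries whose file is present in modelsDir.
// Assumption: a file whose size differs from the registry value is an incomplete download.
fun scanDownloadedModels(registry: List<ModelEntry>, modelsDir: File): List<ModelEntry> =
    registry.filter { entry ->
        val file = File(modelsDir, entry.fileName)
        file.isFile && file.length() == entry.sizeBytes
    }

// Resolves a saved model identifier against the models found on disk.
// Returns null when the saved model is missing, for example after the files were deleted.
fun resolveSavedModel(savedId: String?, downloaded: List<ModelEntry>): ModelEntry? =
    downloaded.firstOrNull { it.id == savedId }

// Label for the settings model dropdown, using the placeholder text from this PR description.
fun modelDropdownLabel(selected: ModelEntry?, downloaded: List<ModelEntry>): String = when {
    downloaded.isEmpty() -> "No models downloaded"
    selected != null -> selected.id
    else -> downloaded.first().id
}
```

In this sketch the scan would run once at startup with the app-private models directory, and the result would feed both `resolveSavedModel` and the settings dropdown, so a saved identifier that no longer has a file on disk resolves to null instead of a broken selection.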

Inspired by the recent Google AI Edge Gallery and Gemma 4 E2B/E4B releases.

utzcoz force-pushed the feat/on-device-model-support branch from 06343e4 to 2d1097d on April 30, 2026 15:16

utzcoz (Contributor, Author) commented Apr 30, 2026

Hi @michaelneale, PTAL.
