Commit b8ac0ec: Update README.md
1 parent 43dcb56

1 file changed: README.md
Lines changed: 41 additions & 1 deletion
@@ -4,7 +4,7 @@
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
 [![Upstream](https://img.shields.io/badge/upstream-google--ai--edge%2Fgallery-brightgreen)](https://github.com/google-ai-edge/gallery)

-**A security-hardened fork of [Google AI Edge Gallery](https://github.com/google-ai-edge/gallery) — with biometric lock, encrypted chat history, llama.cpp support, and GGUF model import.**
+**A security-hardened fork of [Google AI Edge Gallery](https://github.com/google-ai-edge/gallery) — with unique hybrid features, biometric lock, encrypted chat history, llama.cpp support, and GGUF model import.**

 ## Disclaimer

@@ -33,6 +33,46 @@ Box is an Android app for running large language models entirely on-device. It i

On the inference side, Box integrates llama.cpp alongside the upstream LiteRT runtime. This lets you sideload any GGUF model file and choose between CPU, GPU, or NPU acceleration per model — so you are not limited to the curated model list.

# 🔒 Box: The Only Android App That Fuses Google's LiteRT with llama.cpp + Biometric Security

**What makes Box unique?** While other on-device LLM apps force you to choose between Google's optimized LiteRT ecosystem (limited model selection) or the open GGUF ecosystem (limited hardware acceleration), **Box runs both side-by-side** — letting you import any GGUF model while keeping LiteRT's NPU acceleration for compatible models.
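The side-by-side routing described above boils down to sending each imported model file to the right engine. A minimal sketch, assuming routing by file extension; the `RuntimeRouter` name and the exact extension rules are illustrative, not Box's actual code:

```java
import java.util.Locale;

// Hypothetical sketch: route each imported model file to an inference
// runtime by its extension. GGUF files go to llama.cpp; Google's
// .litertlm files go to LiteRT (which can use the NPU where supported).
public class RuntimeRouter {
    public enum Runtime { LLAMA_CPP, LITERT }

    public static Runtime pickRuntime(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".gguf")) {
            return Runtime.LLAMA_CPP;  // open GGUF ecosystem, CPU/GPU
        }
        if (name.endsWith(".litertlm")) {
            return Runtime.LITERT;     // curated models, NPU-capable
        }
        throw new IllegalArgumentException("Unsupported model format: " + fileName);
    }

    public static void main(String[] args) {
        System.out.println(pickRuntime("qwen-1.5b-q4_k_m.gguf")); // LLAMA_CPP
        System.out.println(pickRuntime("gemma-2b-it.litertlm"));  // LITERT
    }
}
```

Both branches then feed the same chat UI and the same encrypted history store, which is what makes the hybrid approach transparent to the user.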
## Why This Matters (And Why No One Else Does It)

| Feature | Google AI Edge Gallery | llama.cpp-only apps (OfflineLLM) | **Box (This Project)** |
|---------|------------------------|----------------------------------|------------------------|
| LiteRT + NPU acceleration | ✅ | ❌ | ✅ |
| Import any GGUF model | ❌ | ✅ | ✅ |
| Encrypted chat history | ❌ | ❌ | ✅ |
| Biometric app lock | ❌ | ❌ | ✅ |
| Per-model accelerator choice | ❌ | ❌ | ✅ |
| Hard offline mode (airgap) | ❌ | ❌ | ✅ |
**The technical breakthrough:** Box doesn't just bundle both runtimes — it lets you choose **per-model** whether to run on CPU, GPU (via OpenCL/Vulkan), or NPU (via QNN delegate). No other Android app gives you this granular control.
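Per-model accelerator choice can be sketched as a preference that is resolved against what the chosen runtime actually supports on the device. All names here (`AcceleratorConfig`, `resolve`) are hypothetical; Box's real settings API may differ:

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of per-model accelerator selection with a CPU fallback.
public class AcceleratorConfig {
    public enum Accelerator { CPU, GPU, NPU }

    private final Map<String, Accelerator> preferred = new HashMap<>();

    public void setPreference(String modelId, Accelerator acc) {
        preferred.put(modelId, acc);
    }

    // Resolve the user's per-model choice against what the runtime reports
    // as supported on this device; CPU is always available as a fallback.
    public Accelerator resolve(String modelId, EnumSet<Accelerator> supported) {
        Accelerator want = preferred.getOrDefault(modelId, Accelerator.CPU);
        return supported.contains(want) ? want : Accelerator.CPU;
    }

    public static void main(String[] args) {
        AcceleratorConfig cfg = new AcceleratorConfig();
        cfg.setPreference("gemma-2b", Accelerator.NPU);
        // A GGUF model served by llama.cpp might only expose CPU/GPU:
        System.out.println(cfg.resolve("gemma-2b",
                EnumSet.of(Accelerator.CPU, Accelerator.GPU))); // CPU (fallback)
        System.out.println(cfg.resolve("gemma-2b",
                EnumSet.allOf(Accelerator.class)));             // NPU
    }
}
```

The fallback rule matters because the same preference UI covers both engines, while only LiteRT-backed models can actually honor an NPU request.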
## The Hybrid Architecture Most Developers Think Is Impossible

```
User's GGUF file    → llama.cpp → CPU/GPU
Google's .litertlm  → LiteRT    → NPU (Qualcomm/MediaTek)
          ↓
Same chat interface, same encrypted history
```

Most developers assume you have to pick one inference engine. Box proves otherwise — and adds enterprise-grade security on top.
## Built By Someone Who Already Did The Hard Part

This isn't a theoretical project. I built **OfflineLLM** (a pure llama.cpp app) first, then forked Google AI Edge Gallery to add llama.cpp support. The result: an app that inherits Google's polished UI and multimodal features (Ask Image, Audio Scribe, Agent Skills) while adding the open-model flexibility that Google's curated allowlist prevents.
## For Security-Conscious Users Running Sensitive Conversations

- SQLCipher AES-256 encrypted Room database
- Biometric re-authentication on every foreground
- Hard offline switch — blocks all network traffic
- Input sanitization before inference AND persistence
- On-device security audit log
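To illustrate the "sanitize before inference AND persistence" point: one common approach is to clean the prompt once, then hand that same cleaned string to both the model and the encrypted store. A minimal sketch; the `PromptSanitizer` name, the length cap, and the stripping rules are assumptions, not Box's actual policy:

```java
// Hypothetical sketch: strip control characters (keeping newline and tab)
// and cap prompt length, so the string sent to the model is exactly the
// string persisted to the encrypted chat history.
public class PromptSanitizer {
    private static final int MAX_LEN = 8192; // illustrative limit

    public static String sanitize(String raw) {
        StringBuilder sb = new StringBuilder(Math.min(raw.length(), MAX_LEN));
        for (int i = 0; i < raw.length() && sb.length() < MAX_LEN; i++) {
            char c = raw.charAt(i);
            boolean control = Character.isISOControl(c) && c != '\n' && c != '\t';
            if (!control) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // NUL and ESC bytes are removed before the text goes anywhere.
        System.out.println(sanitize("hello\u0000 world\u001B[31m"));
    }
}
```

Sanitizing once, before both consumers, avoids the class of bugs where the stored transcript and the model input silently diverge.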
**Bottom line:** If you want to run Qwen3.6, Llama 3, Mistral, or any GGUF model *with* the option of NPU acceleration when available — while keeping your conversations encrypted and offline — Box is currently the only Android app that delivers all of that in one package.

---

## What's different from upstream
