Nataris — built a P2P inference network where Android phones serve OpenAI-compatible API requests #9528
Sharrmavishal
started this conversation in
Show and tell
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Sharing something we built over the past three months — figured the LocalAI community is a good place to post since you're all running inference locally and might find this interesting (or useful as a hosted alternative/complement).
What Nataris is
A P2P inference marketplace. Android phones run open-weight models on-device (Qwen 2.5 0.5B, Llama 3.2 1B) and serve API requests through a standard OpenAI-compatible endpoint. Phone owners get paid per token. Developers get inference without managing any servers.
No prompt logging. No content filtering. No model training on your queries.
How it works
Point any OpenAI-compatible client at
https://api.nataris.ai/v1with your API key. The request gets routed to a provider device, runs on-device via llama.cpp, and streams back the response.Model aliases:
nataris-fast— Qwen 2.5 0.5B (~5s)nataris-balanced— Llama 3.2 1B (~15-20s)Keep streaming enabled — cold starts on mobile devices mean non-streaming requests can timeout.
Where we are
21 provider devices live on the network, 2,775 inference jobs completed, 350K+ tokens processed. Just came out of closed beta. Android provider app is live on Google Play.
Works well for anything where 5–20s latency is acceptable. Not a replacement for LocalAI on your own hardware — more of a hosted P2P alternative when you don't want to manage infrastructure.
$5 free credits on signup, no card needed.
API: https://api.nataris.ai/v1
Docs: https://api.nataris.ai/docs
Provider app (earn by running models on your Android): https://play.google.com/store/apps/details?id=ai.nataris.app
Beta Was this translation helpful? Give feedback.
All reactions