🚀 Loci - The Programmable Local AI Engine

License: MIT Rust Platform

Loci is a privacy-first, programmable local AI inference engine. Built in Rust with ggml/llama.cpp at its core, Loci delivers industrial-grade performance, deep control, and full platform support.

Loci's philosophy: Make local AI truly controllable, programmable, and commercializable.

Chinese Docs (中文文档) | Documentation | API Reference | Plugin Market


✨ Key Features

🎯 Programmable Neural Backbone

  • Logit-level intervention: Zero-copy direct modification of token probabilities
  • Full callback chain: pre_process → transform_logits → post_process → on_token_generated
  • Inference control flow: Suspend/Resume for native Agent tool calls
  • Constrained sampling: Enforced JSON / Regex / Grammar structured output
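
The callback chain above can be sketched as a Rust trait. This is an illustrative sketch only; the trait and method names mirror the stages listed here but are assumptions, not Loci's actual API. The example shows logit-level intervention by banning one token:

```rust
/// Hypothetical plugin interface mirroring Loci's callback chain
/// (pre_process → transform_logits → post_process → on_token_generated).
trait InferenceCallbacks {
    /// Called once before generation starts (e.g. to rewrite the prompt).
    fn pre_process(&mut self, _prompt: &mut String) {}
    /// Called every step with mutable access to the raw logits,
    /// enabling zero-copy, logit-level intervention.
    fn transform_logits(&mut self, _logits: &mut [f32]) {}
    /// Called after sampling, before the token is committed.
    fn post_process(&mut self, _token_id: u32) {}
    /// Called for each generated token (e.g. for streaming).
    fn on_token_generated(&mut self, _token_id: u32) {}
}

/// Example plugin: ban a token by forcing its logit to -inf,
/// giving it zero probability after softmax.
struct BanToken {
    banned: usize,
}

impl InferenceCallbacks for BanToken {
    fn transform_logits(&mut self, logits: &mut [f32]) {
        logits[self.banned] = f32::NEG_INFINITY;
    }
}

fn main() {
    let mut plugin = BanToken { banned: 2 };
    let mut logits = vec![0.1_f32, 0.7, 1.5, 0.3];
    plugin.transform_logits(&mut logits);
    assert_eq!(logits[2], f32::NEG_INFINITY);
    println!("logits after intervention: {:?}", logits);
}
```

Constrained sampling (JSON/Regex/Grammar) fits the same hook: a grammar plugin would mask every logit that is not a legal next token.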

⚡ Extreme Performance

  • Paged Attention + Swap: Stable 128k+ context
  • Radix Tree prefix caching: 5–10× speedup for shared system prompts
  • Kernel fusion: 30% latency reduction
  • Cutting-edge quantization: IQ2_XXS (≈16× compression), BitNet b1.58 (≈20× compression)
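
The prefix-caching idea behind the radix-tree speedup can be shown with a minimal sketch (data-structure names are assumptions, not Loci internals): requests that share a system prompt reuse the cached prefix, so only the differing suffix needs a fresh forward pass.

```rust
use std::collections::HashMap;

/// Minimal trie over token ids, illustrating radix-tree prefix caching.
#[derive(Default)]
struct PrefixNode {
    children: HashMap<u32, PrefixNode>,
}

#[derive(Default)]
struct PrefixCache {
    root: PrefixNode,
}

impl PrefixCache {
    /// Record a token sequence whose KV-cache entries are now resident.
    fn insert(&mut self, tokens: &[u32]) {
        let mut node = &mut self.root;
        for &t in tokens {
            node = node.children.entry(t).or_default();
        }
    }

    /// Length of the longest cached prefix of `tokens`; only the
    /// remaining suffix must be recomputed.
    fn cached_prefix_len(&self, tokens: &[u32]) -> usize {
        let mut node = &self.root;
        let mut len = 0;
        for &t in tokens {
            match node.children.get(&t) {
                Some(next) => {
                    node = next;
                    len += 1;
                }
                None => break,
            }
        }
        len
    }
}

fn main() {
    let mut cache = PrefixCache::default();
    // First request: shared system prompt [1,2,3,4] plus user turn [9].
    cache.insert(&[1, 2, 3, 4, 9]);
    // Second request shares the system prompt: 4 of 6 tokens hit cache.
    assert_eq!(cache.cached_prefix_len(&[1, 2, 3, 4, 7, 8]), 4);
    println!("cached prefix length: 4 of 6 tokens");
}
```

The longer the shared system prompt relative to the user turn, the closer the speedup approaches the quoted 5–10× range.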

🔌 Dual-Track Plugin System

  • Native plugins: Maximum performance dynamic libraries
  • WASM plugins: Secure sandbox for third-party extensions
  • Unified registry with digital signature verification

🏢 Enterprise Ready

  • Model encryption: AES-256-GCM with zeroized keys
  • Multi-tenancy: Full resource isolation and quotas
  • Cloud-native: Official Docker + Helm Chart
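
The multi-tenancy quota idea can be sketched as follows; the `QuotaManager` type and its methods are hypothetical, purely to illustrate per-tenant resource isolation, not Loci's actual enterprise API:

```rust
use std::collections::HashMap;

/// Hypothetical per-tenant token-budget enforcer: each tenant has a
/// remaining quota, and requests are rejected once it is exhausted.
struct QuotaManager {
    remaining: HashMap<String, u64>,
}

impl QuotaManager {
    fn new() -> Self {
        Self { remaining: HashMap::new() }
    }

    fn set_quota(&mut self, tenant: &str, tokens: u64) {
        self.remaining.insert(tenant.to_string(), tokens);
    }

    /// Try to reserve `tokens` for `tenant`; returns false when the
    /// tenant is unknown or over quota, isolating tenants from each other.
    fn try_consume(&mut self, tenant: &str, tokens: u64) -> bool {
        match self.remaining.get_mut(tenant) {
            Some(left) if *left >= tokens => {
                *left -= tokens;
                true
            }
            _ => false,
        }
    }
}

fn main() {
    let mut quotas = QuotaManager::new();
    quotas.set_quota("acme", 100);
    assert!(quotas.try_consume("acme", 60)); // within budget
    assert!(!quotas.try_consume("acme", 60)); // only 40 left: rejected
    assert!(!quotas.try_consume("unknown", 1)); // no quota configured
    println!("quota enforcement ok");
}
```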

📱 Full Platform Support

  • Desktop: Windows / macOS / Linux
  • Mobile: Android (NDK) / iOS (Metal)
  • Embedded ready: ARM / RISC-V path

🎨 Multimodal (Phase 4)

  • Vision encoder (CLIP ViT-L/14)
  • Image → embedding zero-copy injection
  • Audio support planned

📊 Performance Benchmarks

Model Loading (Cold Start)

Model        Size   Quant    Load Time
Phi-3-mini   3.8B   Q4_K_M   92 ms
Llama-3-8B   8B     Q4_K_M   185 ms
Gemma-2-9B   9B     Q5_K_M   328 ms

Generation Throughput (Llama-3-8B Q4_K_M)

Hardware           Tokens/s
Apple M3 Pro       58.3
NVIDIA RTX 4090    112.7
AMD RX 7900 XTX    89.4

Full report: PERFORMANCE_WHITEPAPER.md


🚀 Quick Start

cargo install loci

loci serve --model models/llama-3-8b-q4_k_m.gguf --port 8080

OpenAI-compatible API ready at http://localhost:8080/v1
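
Once the server is up, any OpenAI-style client can talk to it. A minimal request sketch, assuming the standard OpenAI chat-completions schema (the model name shown is illustrative; use whatever matches your loaded GGUF file):

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b-q4_k_m",
    "messages": [{"role": "user", "content": "Hello, Loci!"}]
  }'
```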


📚 Documentation


🔌 Plugin Market

Visit: https://plugins.loci.dev
Supports Native + WASM plugins with one-click installation and signature verification.


🐳 Docker & Kubernetes

docker run -p 8080:8080 ghcr.io/decade-afk/loci:latest

Helm chart available for Kubernetes deployment.


🤝 Contributing

We welcome contributions!


📄 License

MIT License - see LICENSE


Built with ❤️ by decade-afk and the Loci community | 2026
