Skip to content

Latest commit

 

History

History
80 lines (57 loc) · 4.49 KB

File metadata and controls

80 lines (57 loc) · 4.49 KB

ZIA: The Sovereign Cognitive OS

Gemini API Developer Competition Submission


1. Project Summary

ZIA (Holon World) is a browser-based, serverless AI Operating System designed to give users complete sovereignty over their intelligence. It combines the reasoning power of Gemini 2.0 Flash with a novel Client-Side RAG architecture and a Distributed Compute Swarm (using Google Drive & Colab). ZIA turns a standard web browser into a personal supercomputer that remembers everything, executes code, and evolves its own interface.


2. The Problem & Inspiration

"Poverty is structural dependency."

Most AI tools today are "Rentals." Your memory, your logic, and your tools live on a centralized server. If the internet goes down or the service policy changes, you lose your extended mind. We wanted to build an AI that is:

  1. Local-First: It lives in your browser and your personal cloud storage.
  2. Embodied: It's not just a chatbox; it has a "body" (Colab) to execute code and perform actions.
  3. Persistent: It remembers you via a topological memory graph, not just a sliding context window.

3. How We Used Gemini

ZIA is fundamentally powered by the Google GenAI SDK.

🧠 Gemini 2.0 Flash Exp (The Core Processor)

We utilized the speed and multimodal capabilities of gemini-2.0-flash-exp for the main Agent Orchestrator.

  • Reasoning Router: Gemini analyzes user intent to decide between FAST (Chat), DEBATE (Critical Thinking), or RESEARCH (Web Search) modes.
  • Code Generation: Gemini writes the Python code that ZIA sends to the Swarm for execution.
  • UI Autopoiesis: Gemini generates HTML/JS artifacts on-the-fly to create custom mini-apps inside the chat.

🔍 Grounding with Google Search

We integrated the Google Search Tool natively. When ZIA detects a query about current events or unknown facts, it autonomously triggers the search tool to verify information, implementing our "Galileo Protocol" (Truth > Consensus).

🧬 Embeddings (text-embedding-004)

We use Gemini Embeddings to vectorise user conversations. These vectors are processed by our custom FDE (Fixed Dimensional Encoding) algorithm to allow for ultra-fast, client-side similarity search without needing a heavy server-side vector database.


4. Key Technical Innovations

A. The "Drive Bridge" Protocol (Serverless Compute)

Browsers cannot directly connect to Python kernels (Colab) due to security policies (CORS/CSP). We invented a "Mailbox Pattern":

  1. ZIA writes a JSON command (req_task.json) to a specific Google Drive folder.
  2. A worker script running in Google Colab polls this folder.
  3. The worker executes the Python code/Math and writes the result (res_task.json) back to Drive.
  4. ZIA reads the result and updates the UI. Result: Full remote code execution without a backend server.

B. FDE (Fractal Dimensional Encoding) Memory

Inspired by the "MuVERA" paper, we implemented a lightweight, zero-allocation version of SimHash/FDE in TypeScript. This allows ZIA to:

  • Compress complex paragraphs into bit-signatures.
  • Perform O(1) similarity checks using bitwise Hamming distance directly in the browser.
  • Visualize the "Topology" of memory (how concepts connect) in real-time.

C. Fractal Sandbox

ZIA includes a secure iframe sandbox. Gemini can write code that uses a special window.ZIA SDK to communicate back to the main OS. This allows ZIA to build its own tools (e.g., a calculator, a chart visualizer) and use them immediately.


5. Challenges & Solutions

  • Challenge: Browser memory limits.
    • Solution: We implemented a "Dreaming" cycle where ZIA offloads older memories to Google Drive/ChromaDB via the Swarm, keeping the browser lightweight.
  • Challenge: API Rate Limits (429).
    • Solution: We built an "Anti-Fragile" LLM Gateway with exponential backoff and automatic model fallback (switching from Pro to Flash).
  • Challenge: Hallucinations.
    • Solution: The "Debate" mode forces Gemini to generate a critic persona that challenges its own output before showing it to the user.

6. What's Next?

  • P2P Swarm: Allowing multiple ZIA users to share compute power via WebRTC.
  • Voice Interface: Full integration with Gemini Live API for real-time voice interaction.
  • Local LLM Support: Expanding the "Cortex" to support WebGPU-based local models (Gemma 2 2b) for offline-only mode.

Link to Demo: [Insert Video URL Here] Link to Code: [Insert GitHub URL Here]