AGENTS.md

Essential information for AI agents working on the NobodyWho codebase.

Project Overview

NobodyWho is a Rust library for running LLMs locally, with fully offline inference. Core features include streaming responses, tool calling, and context management. It is built on the llama-cpp-2 crate.

Architecture

Core Rust Library

The main implementation is in nobodywho/core/src/:

Language Bindings

Key Types & Patterns

Core Types

  • ChatHandle / ChatHandleAsync - Main chat interface (sync and async)
  • ChatBuilder - Builder pattern for chat configuration
  • Message enum - User/Assistant/System/Tool messages
  • Model - Shared model instance (Arc<LlamaModel>)
  • Worker - Background task for model inference
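To make the relationship between messages and a background worker concrete, here is a minimal sketch using only the standard library. It is not the crate's implementation: `DemoMessage`, `spawn_demo_worker`, and `collect_response` are hypothetical names, the real Message enum may carry different data, and the "inference" here simply echoes the last user message token by token over a channel.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the Message enum; the real variants may differ.
#[derive(Debug, Clone, PartialEq)]
enum DemoMessage {
    System(String),
    User(String),
    Assistant(String),
    Tool { name: String, content: String },
}

/// Hypothetical worker: runs on a background thread and streams tokens back
/// through a channel. Assumption: inference is faked by echoing the most
/// recent user message word by word.
fn spawn_demo_worker(history: Vec<DemoMessage>) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Find the latest user message in the history, if any.
        let prompt = history
            .iter()
            .rev()
            .find_map(|m| match m {
                DemoMessage::User(text) => Some(text.clone()),
                _ => None,
            })
            .unwrap_or_default();
        for token in prompt.split_whitespace() {
            // Stop early if the receiving side has gone away.
            if tx.send(token.to_string()).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the stream.
    });
    rx
}

/// Drain the token stream and wrap the result as an assistant message.
fn collect_response(rx: mpsc::Receiver<String>) -> DemoMessage {
    let tokens: Vec<String> = rx.iter().collect();
    DemoMessage::Assistant(tokens.join(" "))
}
```

A caller that wants streaming would iterate over the receiver directly instead of calling `collect_response`, handling each token as it arrives.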

Error Handling

Error types are defined with the thiserror crate. All errors live in errors.rs and implement std::error::Error. Common error types include LoadModelError, InitWorkerError, and ChatWorkerError.
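The sketch below illustrates the general shape of this error pattern with the standard library only, writing out by hand the Display, Error, and From implementations that thiserror would normally derive. The type names `SketchLoadError` and `SketchInitError`, their variants, and the `.gguf` suffix check are all assumptions made for illustration; the actual definitions in errors.rs may look quite different.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, loosely modelled on a model-loading failure.
#[derive(Debug)]
enum SketchLoadError {
    NotFound(String),
    InvalidFormat(String),
}

impl fmt::Display for SketchLoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchLoadError::NotFound(p) => write!(f, "model not found: {p}"),
            SketchLoadError::InvalidFormat(p) => write!(f, "not a GGUF file: {p}"),
        }
    }
}

impl Error for SketchLoadError {}

/// Hypothetical higher-level error that wraps the lower-level one.
#[derive(Debug)]
enum SketchInitError {
    LoadModel(SketchLoadError),
}

impl fmt::Display for SketchInitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to initialise worker")
    }
}

impl Error for SketchInitError {
    // Expose the wrapped error so callers can walk the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            SketchInitError::LoadModel(e) => Some(e),
        }
    }
}

// This conversion is what lets `?` turn one error type into the other.
impl From<SketchLoadError> for SketchInitError {
    fn from(e: SketchLoadError) -> Self {
        SketchInitError::LoadModel(e)
    }
}

/// Pretend loader: only validates the path string (assumption for the demo).
fn load_model(path: &str) -> Result<(), SketchLoadError> {
    if path.is_empty() {
        return Err(SketchLoadError::NotFound(path.to_string()));
    }
    if !path.ends_with(".gguf") {
        return Err(SketchLoadError::InvalidFormat(path.to_string()));
    }
    Ok(())
}

fn init_worker(path: &str) -> Result<(), SketchInitError> {
    load_model(path)?; // `?` converts SketchLoadError via the From impl
    Ok(())
}
```

With thiserror, the Display and From implementations would typically come from `#[error("...")]` and `#[from]` attributes rather than being written by hand.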

Key Dependencies

  • llama-cpp-2 - Underlying LLM inference engine
  • tokio - Async runtime
  • serde / serde_json - Serialization
  • minijinja - Template rendering for chat templates
  • gbnf - Grammar-based tool calling
  • tracing - Logging framework

Build & Test

Building

Core library:

cd nobodywho
cargo build

Python bindings:

cd nobodywho/python
maturin develop --uv
cargo run --bin make_stubs  # Generate type stubs

Testing

Core tests:

cd nobodywho
export TEST_MODEL=/path/to/model.gguf
cargo test -- --nocapture --test-threads=1
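As a rough sketch of how a test helper might pick up the TEST_MODEL variable exported above, the following uses only the standard library. The helper names `model_path_from` and `test_model_path` are hypothetical and not taken from the codebase; treating an unset or blank variable as "no model available" is also an assumption.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: turn an optional raw value into a model path.
/// Blank values are treated the same as a missing variable (assumption).
fn model_path_from(value: Option<String>) -> Option<PathBuf> {
    match value {
        Some(v) if !v.trim().is_empty() => Some(PathBuf::from(v)),
        _ => None,
    }
}

/// Hypothetical helper: read TEST_MODEL from the environment.
fn test_model_path() -> Option<PathBuf> {
    model_path_from(env::var("TEST_MODEL").ok())
}
```

Keeping the parsing in `model_path_from` separate from the environment lookup means the logic can be checked without mutating process-wide state.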

Python tests:

cd nobodywho/python
pytest  # Also tests markdown documentation code blocks

Development Environment

  • Linux/WSL: Use Nix flakes (nix develop)
  • Windows: Install rustup, cmake, llvm, msvc, and Vulkan SDK

See CONTRIBUTING.md for detailed setup instructions.

Development Notes

Platform Support

  • Desktop (all bindings): Windows, Linux, macOS
  • Android: Godot and Flutter bindings
  • iOS: Flutter binding
  • GPU acceleration: Vulkan (x86/x86_64), Metal (macOS/iOS)

Integration Patterns

Python:

Godot:

Flutter:

Code Patterns

  • Use Arc<LlamaModel> for shared model instances
  • Builder pattern for configuration (ChatBuilder)
  • Async support via tokio (ChatHandleAsync)
  • Error propagation with ? operator
  • Tracing for logging (tracing::info!, tracing::debug!, etc.)
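The first two patterns in the list above can be sketched together: a builder that configures a chat while sharing one model instance through an Arc. This is an illustration under stated assumptions, not the crate's API. `DemoModel`, `DemoChat`, `DemoChatBuilder`, the method names, and the default context size of 2048 are all hypothetical, and `DemoModel` stands in for the real LlamaModel.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a loaded model.
struct DemoModel {
    name: String,
}

/// Hypothetical chat session holding a shared reference to the model.
struct DemoChat {
    model: Arc<DemoModel>,
    system_prompt: String,
    context_size: u32,
}

/// Hypothetical builder; each setter consumes and returns the builder so
/// calls can be chained.
struct DemoChatBuilder {
    model: Arc<DemoModel>,
    system_prompt: String,
    context_size: u32,
}

impl DemoChatBuilder {
    fn new(model: Arc<DemoModel>) -> Self {
        Self {
            model,
            system_prompt: String::new(),
            context_size: 2048, // assumed default, for illustration only
        }
    }

    fn with_system_prompt(mut self, prompt: &str) -> Self {
        self.system_prompt = prompt.to_string();
        self
    }

    fn with_context_size(mut self, size: u32) -> Self {
        self.context_size = size;
        self
    }

    fn build(self) -> DemoChat {
        DemoChat {
            model: self.model,
            system_prompt: self.system_prompt,
            context_size: self.context_size,
        }
    }
}
```

Cloning the Arc only bumps a reference count, so several chats can be built from the same model without duplicating the underlying data.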

Important Files

Documentation

Documentation is built with Docusaurus and lives in the docs/ folder. It is deployed to docs.nobodywho.ooo via Cloudflare Pages (see .github/workflows/docs.yml).