
Kestrel Agent

A fast, streaming-first AI agent framework built in Rust — connect any platform to any LLM with built-in memory, skills, and self-evolution.


Features

  • Multi-platform channels — Telegram, Discord, WebSocket, OpenAI-compatible HTTP API
  • Streaming responses — SSE streaming for real-time token delivery
  • Tool system — shell, web, filesystem, cron, search, message, spawn
  • Agent loop — context management, memory, hooks, and context compaction
  • Sub-agent spawning — parallel agent tasks via tokio JoinSet
  • Cron scheduling — tick-based scheduler with JSON state persistence
  • Health checks — registry-based checks with auto-restart and exponential backoff
  • Skill system — TOML manifests, hot-reload, SkillCompiler, runtime skill injection
  • Tiered memory — MemoryStore trait with HotStore (L1 in-memory) and WarmStore (L2 LanceDB vectors)
  • Learning & evolution — LearningEvent bus, event processors, prompt assembly from observations
  • Unified TraceID — cross-channel trace IDs (kst_{channel}_{id}) for end-to-end request tracking
  • Provider resilience — automatic retry with exponential backoff on 429s
  • SSRF protection — network allowlist/denylist, URL validation, sandboxed exec
  • Native daemon mode — double-fork daemonization, PID file with flock, signal handling (SIGTERM/SIGINT/SIGHUP), graceful shutdown with log flushing, log rotation (daily)

Architecture

                          ┌──────────────────────────────┐
                          │         CLI (clap)           │
                          │  agent · gateway · serve ·   │
                          │ daemon · heartbeat · setup · │
                          │  status                      │
                          └──────────────┬───────────────┘
                                         │
                 ┌───────────────────────┼───────────────────────┐
                 │                       │                       │
         ┌───────▼──────┐    ┌──────────▼──────────┐   ┌───────▼──────┐
         │   Telegram   │    │      Gateway        │   │  API Server  │
         │  (polling)   │    │  (ChannelManager)   │   │   (Axum)     │
         └───────┬──────┘    └──────────┬──────────┘   └───────┬──────┘
         ┌───────┴──────┐              │                       │
         │   Discord    │              │                       │
         │ (WebSocket)  │              │                       │
         └───────┬──────┘              │                       │
         ┌───────┴──────┐              │                       │
         │   WebSocket  │              │                       │
         │   (server)   │              │                       │
         └───────┬──────┘              │                       │
                 │                     │                       │
                  └─────────┬───────────┴───────────────────────┘
                           │
                  InboundMessage │ Bus (tokio broadcast)
                           │
                  ┌────────▼────────┐
                  │    Agent Loop    │
                  │  ┌────────────┐ │
                  │  │  Context   │ │
                  │  │  Memory    │ │
                  │  │  Skills    │ │
                  │  │  Hooks     │ │
                  │  └─────┬──────┘ │
                  └────────┼────────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
      ┌───────▼──────┐ ┌──▼───────┐ ┌──▼────────────┐
      │  Providers   │ │  Tools   │ │  Sub-agents   │
      │              │ │          │ │               │
      │  · OpenAI    │ │  · shell │ │  · parallel   │
      │  · Anthropic │ │  · web   │ │    spawning   │
      │  · DeepSeek  │ │  · fs    │ │  · isolated   │
      │  · Groq      │ │  · cron  │ │    contexts   │
      │  · Ollama    │ │  · search│ │               │
      └──────────────┘ │  · spawn │ └───────────────┘
                       └──────────┘
                           │
                  OutboundMessage │ Bus
                           │
                  ┌────────▼────────┐
                  │   Channel →     │
                  │   User Response │
                  └─────────────────┘

  ── Evolution Layer ────────────────────────────────────
  LearningEvent → EventBus → Processors → (SkillCreate / MemoryUpdate / PromptAdjust)

  ── Foundation Layer ───────────────────────────────────
  kestrel-core · kestrel-config · kestrel-bus
  kestrel-session · kestrel-security · kestrel-providers
  kestrel-cron · kestrel-heartbeat · kestrel-daemon
  kestrel-memory · kestrel-skill · kestrel-learning

Quick Start

Prerequisites

Rust 1.75+ and protobuf-compiler (required by LanceDB).

# Fedora / RHEL
sudo dnf install protobuf-compiler gcc

# Ubuntu / Debian
sudo apt install protobuf-compiler build-essential

Optional — mold linker for faster release linking:

# Fedora
sudo dnf install mold
# Then add to .cargo/config.toml:
# [target.x86_64-unknown-linux-gnu]
# linker = "clang"
# rustflags = ["-C", "link-arg=-fuse-ld=mold"]

The project ships a .cargo/config.toml with profile optimizations (thin LTO, dependency pre-optimization in dev mode, symbol stripping in release).
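
A sketch of what such a profile config can look like — illustrative values, not the shipped file verbatim; see the repository's .cargo/config.toml for the actual settings:

```toml
# Illustrative profile tuning (thin LTO, stripped release symbols,
# pre-optimized dependencies in dev builds).
[profile.release]
lto = "thin"
strip = "symbols"

[profile.dev.package."*"]
opt-level = 2
```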

Build

cargo build --release

Configure

kestrel setup
# Edit ~/.kestrel/config.toml with your API keys

Run

# Interactive agent (one-shot)
kestrel agent "Summarize the latest commits"

# Start gateway (Telegram + Discord + WebSocket)
kestrel gateway

# Start API server
kestrel serve --port 8080

# Periodic health checking
kestrel heartbeat

# Show system status
kestrel status

# Start as daemon (background, double-fork, PID file + flock)
kestrel daemon start

# Check status (auto-cleans stale PID files from crashed instances)
kestrel daemon status

# Stop gracefully (SIGTERM, configurable grace period)
kestrel daemon stop

# Restart (stop + re-exec)
kestrel daemon restart

Environment variable KESTREL_HOME overrides the default config directory (~/.kestrel).

Configuration

# ~/.kestrel/config.toml

[providers.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4o"
# base_url = "https://api.openai.com/v1"   # optional: point to any OpenAI-compatible API

[providers.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-6"

[providers.openrouter]
api_key = "${OPENROUTER_API_KEY}"
model = "anthropic/claude-sonnet-4-6"

[providers.ollama]
base_url = "http://localhost:11434/v1"
model = "llama3"

[providers.deepseek]
api_key = "${DEEPSEEK_API_KEY}"
model = "deepseek-chat"

[providers.groq]
api_key = "${GROQ_API_KEY}"
model = "llama-3.3-70b-versatile"

[providers.gemini]
api_key = "${GEMINI_API_KEY}"
model = "gemini-2.0-flash"
# no_proxy = true              # skip proxy for domestic APIs (e.g. ZAI, Qwen)

[channels.telegram]
token = "${TELEGRAM_BOT_TOKEN}"
allowed_users = ["123456789"]         # optional: restrict to user IDs
admin_users = ["123456789"]           # optional: admin user IDs
enabled = true
streaming = false                     # telegram doesn't support token-by-token
# proxy = ""                          # optional: http/socks5 proxy URL

[channels.discord]
token = "${DISCORD_BOT_TOKEN}"
allowed_guilds = ["111222333"]        # optional: restrict to guild IDs
enabled = true

[channels.websocket]
enabled = true
listen_addr = "127.0.0.1:8090"
max_clients = 100
max_message_size = 1048576

[channels.websocket.auth]
required = true
token = "my-secret"

[agent]
model = "gpt-4o"
temperature = 0.7
max_tokens = 4096
max_iterations = 50                    # tool loop limit
streaming = true
tool_timeout = 120                     # seconds per tool execution
# system_prompt = "You are a helpful AI assistant."  # optional override
# workspace = "/tmp/workspace"                       # optional: default working directory

[dream]                                 # memory consolidation
enabled = true
interval_secs = 7200                    # consolidate every 2h
# model = "gpt-4o-mini"                # optional: use cheaper model

[heartbeat]
enabled = false
interval_secs = 1800

[cron]
enabled = false
# state_file = "~/.kestrel/cron_state.json"
tick_secs = 60

[security]
block_private_ips = true                # block RFC1918 by default
ssrf_whitelist = []                     # allowed IP ranges for outbound
blocked_networks = []                   # additional blocked ranges

[api]
host = "0.0.0.0"
port = 8080
allowed_origins = ["*"]                 # CORS origins
max_body_size = 10485760                # 10 MB

[daemon]
# pid_file = "~/.kestrel/kestrel.pid"
# log_dir = "~/.kestrel/logs"
# working_directory = "/"
grace_period_secs = 30

# optional: custom system prompt additions
# custom_instructions = "Always respond in English."

# optional: agent identity
# name = "Kestrel"

# optional: MCP tool servers
# [mcp_servers.filesystem]
# transport = "stdio"
# command = "mcp-filesystem"
# args = ["--root", "/data"]

# optional: custom provider endpoints
# [[custom_providers]]
# name = "my_provider"
# base_url = "https://my-api.com/v1"
# api_key = "key123"
# model_patterns = ["my-model"]

[notifications]
online_notify = true
# notify_chat_id = "-1001234567890"     # which chat receives the ping
online_message = "Kestrel v{version} online — {channel} connected"

Environment variables in values (${VAR}) are expanded at load time.

CLI Commands

Command                            Description
agent                              Interactive agent — send a message and get a response
gateway                            Start the gateway — connect to Telegram, Discord, etc.
serve                              OpenAI-compatible HTTP API server (Axum)
heartbeat                          Periodic health checking with auto-restart
health                             Show health check status
cron list                          List all cron jobs
cron status                        Show status of a specific cron job
config validate                    Validate the config.toml schema
config migrate                     Migrate Python kestrel config to kestrel format
setup                              Interactive configuration wizard
status                             Show current configuration and system status
daemon start/stop/restart/status   Native Unix daemon: double-fork, PID file (flock), SIGTERM/SIGINT/SIGHUP, log rotation

Crates

Crate               Description
kestrel-core        Error types, constants, core types (MessageType, Platform)
kestrel-config      TOML config loading, schema validation, path resolution
kestrel-bus         Tokio broadcast-based async message bus
kestrel-session     SQLite-backed session and conversation store
kestrel-security    Network allowlist/denylist, command approval, SSRF protection
kestrel-providers   LLM provider trait — OpenAI-compatible and Anthropic SSE streaming
kestrel-tools       Tool registry + builtins (shell, web, fs, search, cron, spawn, message)
kestrel-agent       Agent loop, context builder, memory, skills, hooks, sub-agents
kestrel-cron        Tick-based cron scheduler with JSON state persistence
kestrel-heartbeat   Health check registry, periodic task monitoring, auto-restart
kestrel-channels    Platform adapters — Telegram, Discord, WebSocket — via ChannelManager
kestrel-api         OpenAI-compatible HTTP API server (Axum)
kestrel-daemon      Unix daemon: double-fork, PID file (flock), signal handling, file logging
kestrel-memory      MemoryStore trait, HotStore (L1 in-memory), WarmStore/LanceDB (L2 vectors)
kestrel-skill       Skill trait, TOML manifests, SkillRegistry, SkillCompiler
kestrel-learning    LearningEvent bus, event processors, prompt assembly

Stats

Metric                 Value
Rust source files      151
Lines of Rust code     ~105,200
Crates                 16
Minimum Rust version   1.75

Build Performance

Tested on v0.1.1 (AMD Ryzen 7 6800U, 8GB RAM, Fedora 45, rustc 1.94.1):

Metric                   Value
Clean build (release)    7m 52s
Crates compiled          487
Binary size (stripped)   17M
Linker                   mold 2.41.0 via clang
LTO                      thin
Parallel jobs            4 (RAM-limited)

API

kestrel exposes an OpenAI-compatible HTTP API. Start with kestrel serve:

kestrel serve --port 8080

List models

curl http://localhost:8080/v1/models

Chat completion (non-streaming)

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'

Chat completion (SSE streaming)

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Health and readiness

curl http://localhost:8080/health    # detailed health snapshot
curl http://localhost:8080/ready     # 200=ready, 503=not ready (Kubernetes probe)
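
If you run kestrel serve in Kubernetes, the /ready endpoint can back a readiness probe. A hypothetical manifest fragment (port and timings are example values):

```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
```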

Endpoints

Endpoint               Method   Auth                           Description
/v1/chat/completions   POST     Bearer token (if configured)   OpenAI-compatible chat (streaming + non-streaming)
/v1/models             GET      No                             List available models
/health                GET      No                             Health check with component details
/ready                 GET      No                             Readiness probe (200/503)

Troubleshooting

Build error: protoc not found

LanceDB requires the protobuf compiler.

# Fedora / RHEL
sudo dnf install protobuf-compiler

# Ubuntu / Debian
sudo apt install protobuf-compiler

Build error: linker 'mold' not found

Remove or update the linker override in .cargo/config.toml. The shipped config does not require mold — this only happens if you added the optional mold config without installing it.

kestrel serve fails with "address already in use"

Another process is using port 8080. Either stop it or override the port:

kestrel serve --port 9090
# or set in config.toml:
# [api]
# port = 9090

Clippy warnings on cargo clippy --workspace

Run cargo clippy --workspace --fix to auto-fix trivial issues. For persistent warnings, ensure you're on Rust 1.75+ and all dependencies are up to date (cargo update).

Development

# Build everything
cargo build --workspace

# Run all tests
cargo test --workspace

# Lint (must pass with 0 warnings)
cargo clippy --workspace -- -D warnings

# Format check
cargo fmt --all --check

# Quick compile check
cargo check

Design Principles

  • Thin harness, fat skills — Harness handles the loop, files, context, and safety. Complexity lives in skill files.
  • Latent vs deterministic — Judgment goes to the model; parsing and validation stay in code. Never mix the two.
  • Context engineering — JIT loading, compaction, and structured notes to stay within the context window.
  • Fewer, better tools — Consolidated operations with token-efficient returns and poka-yoke defaults.
  • LanceDB over SQLite FTS5 — Semantic vector search for memory and session recall.
  • TOML over YAML — Rust-native parsing for skill manifests and configuration.

Contributing

  1. Fork the repository and create a feature branch.
  2. Ensure cargo test --workspace and cargo clippy --workspace pass with zero warnings.
  3. Add tests for any new functionality — assertions must be deterministic (no LLM output in test expectations).
  4. Add /// doc comments on all pub functions.
  5. Open a pull request against main.

CI runs format checks, clippy, build, tests, and a security audit on every push.

License

This project is licensed under the MIT License.
