CloudLLM

CloudLLM is a batteries-included Rust toolkit for building intelligent agents with LLM integration, multi-protocol tool support, and multi-agent orchestration. It provides:

Agents with Tools: Create agents that connect to LLMs and execute actions through a flexible, multi-protocol tool system (local, remote MCP, Memory, custom protocols) with runtime hot-swapping,
Multi-Agent Orchestration: An orchestration engine supporting Parallel, RoundRobin, Moderated, Hierarchical, Debate, AnthropicAgentTeams, and Ralph collaboration patterns,
MentisDB: An effectively unbounded semantic memory primitive for agents, with SHA-256 hash-chained persistence, graph-based context resolution, tamper-evident integrity verification, and a git-like skills registry,
Context Strategies: Pluggable strategies for handling context window exhaustion — Trim, SelfCompression (LLM writes its own save file), and NoveltyAware (entropy-based trigger),
Image Generation: Unified image generation across OpenAI (DALL-E), Grok, and Google Gemini with the simplified register_image_generation_tool() helper,
Server Deployment: Easy standalone MCP server creation via MCPServerBuilder with HTTP, authentication, and IP filtering,
Flexible Tool Creation: From simple Rust closures to advanced custom protocol implementations,
Event System: Real-time observability via EventHandler callbacks for LLM round-trips, tool calls, task completions, and orchestration lifecycle,
Stateful Sessions: An LLMSession for managing conversation history with context trimming and token accounting,
Provider Flexibility: Unified ClientWrapper trait for OpenAI, Claude, Gemini, Grok, and custom OpenAI-compatible endpoints.

The entire public API is documented with compilable examples—run cargo doc --open to browse the crate-level manual.

Installation
Quick Start
Multi-Agent Orchestration
Provider Wrappers
LLMSession: Stateful Conversations
Agents: Building Intelligent Workers with Tools
MentisDB: Persistent Agent Memory
Context Strategies: Managing Context Window Exhaustion
Agent::fork() — Lightweight Copies for Parallel Execution
Runtime Tool Hot-Swapping
Event System: Real-Time Agent & Orchestration Observability
Tool Registry: Multi-Protocol Tool Access
Native Tool Calling (v0.11.1)
Deploying Tool Servers with MCPServerBuilder
Creating Tools: Simple to Advanced
Image Generation
Examples
Support & Contributing

Installation

Add CloudLLM to your project:

[dependencies]
cloudllm = "0.13.0"

The crate targets tokio 1.x and Rust 1.70+.

Quick Start

CloudLLM has two core abstractions for talking to LLMs:

Abstraction	What it is	When to use it
LLMSession	Stateful conversation wrapper around any `ClientWrapper`. Maintains rolling history with automatic context trimming and token accounting.	Simple chat bots, Q&A, any 1-on-1 conversation with an LLM.
Agent	Wraps LLMSession with an identity (name, expertise, personality), optional tools, persistent MentisDB memory, and pluggable context strategies. Can execute actions, not just converse.	Tool-using assistants, orchestrated multi-agent teams, autonomous workflows.

Think of it this way: LLMSession is the foundation; Agent builds on top of it.

1. LLMSession — stateful conversation (OpenAI)

use std::sync::Arc;
use cloudllm::{LLMSession, Role};
use cloudllm::clients::openai::{Model, OpenAIClient};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Arc::new(OpenAIClient::new_with_model_enum(
        &std::env::var("OPEN_AI_SECRET")?, Model::GPT41Mini,
    ));

    let mut session = LLMSession::new(client, "You are a concise tutor.".into(), 8_192);

    let reply = session
        .send_message(Role::User, "What is ownership in Rust?".into(), None)
        .await?;

    println!("{}", reply.content);
    println!("Tokens used: {:?}", session.token_usage());
    Ok(())
}

2. Agent — identity + tools (Claude)

An Agent wraps a client just like LLMSession, but adds a name, expertise, personality, and (optionally) tools. Here the agent uses Anthropic Claude and can answer questions using its personality and expertise context:

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::clients::claude::{ClaudeClient, Model};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let client = Arc::new(ClaudeClient::new_with_model_enum(
        &std::env::var("ANTHROPIC_KEY")?, Model::ClaudeSonnet46,
    ));

    let agent = Agent::new("tutor", "Rust Tutor", client)
        .with_expertise("Rust programming, ownership, lifetimes")
        .with_personality("Patient teacher who uses short analogies");

    // generate() sends a one-shot prompt through the agent's identity context
    let answer = agent
        .generate(
            "You are a helpful programming tutor.",
            "Explain borrowing vs cloning in two sentences.",
            &[],  // no prior conversation history
        )
        .await?;

    println!("{}", answer);
    Ok(())
}

3. Streaming tokens in real time (Grok)

Any ClientWrapper supports streaming. Here we use xAI Grok:

use cloudllm::{LLMSession, Role};
use cloudllm::clients::grok::{GrokClient, Model};
use futures_util::StreamExt;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Arc::new(GrokClient::new_with_model_enum(
        &std::env::var("XAI_KEY")?, Model::Grok3Mini,
    ));
    let mut session = LLMSession::new(client, "You think out loud.".into(), 16_000);

    if let Some(mut stream) = session
        .send_message_stream(Role::User, "Explain type erasure.".into(), None)
        .await? {
        while let Some(chunk) = stream.next().await {
            let chunk = chunk?;
            print!("{}", chunk.content);
            if let Some(reason) = chunk.finish_reason {
                println!("\n<terminated: {reason}>");
            }
        }
    }

    Ok(())
}

Multi-Agent Orchestration

The orchestration module coordinates conversations between multiple LLM agents. Each agent can have its own provider, expertise, personality, and tool access. Choose from seven collaboration patterns depending on your use case.

Orchestration Modes

Mode	Description	Best For
Parallel	All agents respond simultaneously; results are aggregated	Fast fan-out queries, getting diverse perspectives
RoundRobin	Agents take sequential turns, each building on previous responses	Iterative refinement, structured review
Moderated	Agents propose ideas, a moderator synthesizes the final answer	Consensus building, curated outputs
Hierarchical	Lead agent coordinates; specialists handle specific aspects	Complex tasks with delegation
Debate	Agents discuss and challenge until convergence is reached	Critical analysis, stress-testing ideas
AnthropicAgentTeams	Decentralized task pool coordination; agents autonomously claim and complete work	Large task pools (>8 tasks), parallel work distribution, agent autonomy
Ralph	Autonomous iterative loop working through a PRD task list	Multi-step builds, code generation, structured project work

Basic Example: RoundRobin

use std::sync::Arc;

use cloudllm::orchestration::{Agent, Orchestration, OrchestrationMode};
use cloudllm::clients::openai::{Model, OpenAIClient};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let key = std::env::var("OPEN_AI_SECRET")?;

    let architect = Agent::new(
        "architect",
        "System Architect",
        Arc::new(OpenAIClient::new_with_model_enum(&key, Model::GPT4o)),
    )
    .with_expertise("Distributed systems")
    .with_personality("Pragmatic, direct");

    let tester = Agent::new(
        "qa",
        "QA Lead",
        Arc::new(OpenAIClient::new_with_model_enum(&key, Model::GPT41Mini)),
    )
    .with_expertise("Test automation")
    .with_personality("Sceptical, detail-oriented");

    let mut orchestration = Orchestration::new("design-review", "Deployment Review")
        .with_mode(OrchestrationMode::RoundRobin)
        .with_system_context("Collaboratively review the proposed architecture.");

    orchestration.add_agent(architect)?;
    orchestration.add_agent(tester)?;

    let outcome = orchestration
        .run("Evaluate whether the blue/green rollout plan is sufficient.", 2)
        .await?;

    for msg in outcome.messages {
        if let Some(name) = msg.agent_name {
            println!("{name}: {}", msg.content);
        }
    }

    Ok(())
}

AnthropicAgentTeams: Decentralized Task Coordination

AnthropicAgentTeams is a decentralized orchestration mode where agents autonomously discover, claim, and complete tasks from a shared Memory pool. Unlike other modes where the orchestrator assigns work, agents coordinate directly via Memory keys—the orchestrator only manages the iteration loop while agents self-select tasks.

Key features:

Decentralized coordination: Agents autonomously select work from a shared task pool via Memory
Task-based work items: Structured WorkItem objects with id, description, and acceptance criteria
Memory-based state: Tasks stored as teams:<pool_id>:unclaimed/claimed/completed:<task_id> keys
Autonomous claiming: Agents discover available tasks, claim them, complete work, and report results
Progress tracking: convergence_score reports task completion fraction (0.0 to 1.0)
Scalability: Better suited for large task pools (>8 tasks) than Ralph's checklist approach
Mixed providers: Works seamlessly with agents using different LLM providers (OpenAI, Claude, etc.)

use std::sync::Arc;

use cloudllm::orchestration::{Orchestration, OrchestrationMode, WorkItem};
use cloudllm::clients::claude::{ClaudeClient, Model};
use cloudllm::Agent;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let key = std::env::var("ANTHROPIC_KEY")?;
    let make_client = || Arc::new(ClaudeClient::new_with_model_enum(&key, Model::ClaudeSonnet46));

    let researcher = Agent::new("researcher", "Research Agent", make_client())
        .with_expertise("Scientific literature, data gathering");
    let analyst = Agent::new("analyst", "Analysis Agent", make_client())
        .with_expertise("Data synthesis, theme extraction");
    let writer = Agent::new("writer", "Writing Agent", make_client())
        .with_expertise("Technical writing, documentation");
    let reviewer = Agent::new("reviewer", "Review Agent", make_client())
        .with_expertise("Quality assurance, peer review");

    let tasks = vec![
        WorkItem::new(
            "research_nmn",
            "Research phase — NMN+ mechanisms and pathways",
            "Gather and summarize current scientific literature on NAD+ boosting",
        ),
        WorkItem::new(
            "analyze_longevity",
            "Analysis phase — longevity effects",
            "Synthesize findings on aging reversal and lifespan extension",
        ),
        WorkItem::new(
            "write_summary",
            "Writing phase — draft summary report",
            "Draft 2-3 page synthesis of all findings",
        ),
        WorkItem::new(
            "final_review",
            "Quality review — accuracy and completeness",
            "Peer review for scientific accuracy and identify gaps",
        ),
    ];

    let mut orch = Orchestration::new("research-team", "Research Team")
        .with_mode(OrchestrationMode::AnthropicAgentTeams {
            pool_id: "research-pool".to_string(),
            tasks,
            max_iterations: 4,
        })
        .with_system_context("You are a specialized agent in a coordinated team. \
                             Claim tasks from the shared pool and complete them autonomously.")
        .with_max_tokens(128_000);

    orch.add_agent(researcher)?;
    orch.add_agent(analyst)?;
    orch.add_agent(writer)?;
    orch.add_agent(reviewer)?;

    let result = orch.run("Research NMN+ for longevity and synthesize findings", 1).await?;

    println!("Iterations: {}",  result.round);
    println!("Complete: {}",    result.is_complete);
    println!("Progress: {:.0}%", result.convergence_score.unwrap_or(0.0) * 100.0);
    println!("Tokens: {}",       result.total_tokens_used);

    Ok(())
}

How It Works: Agents use the Memory tool to coordinate. Each iteration, agents:

LIST unclaimed tasks from Memory (teams:<pool_id>:unclaimed:*)
GET task descriptions and acceptance criteria
PUT claim marker (teams:<pool_id>:claimed:<task_id> → <agent_id>:<timestamp>)
Work on the task using their expertise and tools
PUT completion result (teams:<pool_id>:completed:<task_id> → <result>)

The orchestration terminates when all tasks are completed or max_iterations is reached.
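The claim cycle above can be pictured with a plain in-memory map. This is a minimal std-only sketch of the key layout and state transitions, not CloudLLM's Memory tool: the `Store` alias and the `key`, `list_unclaimed`, `claim`, `complete`, and `progress` helpers are hypothetical names, and removing the unclaimed entry on claim is an assumption about how double-claiming could be prevented.

```rust
use std::collections::BTreeMap;

/// Stand-in for the shared Memory store: a plain ordered key-value map.
type Store = BTreeMap<String, String>;

/// Build a key such as `teams:research-pool:unclaimed:write_summary`.
fn key(pool_id: &str, state: &str, task_id: &str) -> String {
    format!("teams:{pool_id}:{state}:{task_id}")
}

/// LIST: ids of every task still stored under the `unclaimed` prefix.
fn list_unclaimed(store: &Store, pool_id: &str) -> Vec<String> {
    let prefix = format!("teams:{pool_id}:unclaimed:");
    store
        .keys()
        .filter_map(|k| k.strip_prefix(prefix.as_str()).map(str::to_string))
        .collect()
}

/// Claim a task: drop the unclaimed entry (assumed behavior) and PUT a
/// claim marker of the form `<agent_id>:<timestamp>`.
/// Returns false when the task is no longer unclaimed.
fn claim(store: &mut Store, pool_id: &str, task_id: &str, agent_id: &str, timestamp: u64) -> bool {
    if store.remove(&key(pool_id, "unclaimed", task_id)).is_none() {
        return false;
    }
    store.insert(key(pool_id, "claimed", task_id), format!("{agent_id}:{timestamp}"));
    true
}

/// PUT the completion result for a task.
fn complete(store: &mut Store, pool_id: &str, task_id: &str, result: &str) {
    store.insert(key(pool_id, "completed", task_id), result.to_string());
}

/// Fraction of tasks that have a `completed` key (0.0 to 1.0).
fn progress(store: &Store, pool_id: &str, total_tasks: usize) -> f32 {
    let prefix = format!("teams:{pool_id}:completed:");
    let done = store.keys().filter(|k| k.starts_with(&prefix)).count();
    if total_tasks == 0 { 1.0 } else { done as f32 / total_tasks as f32 }
}

fn main() {
    let mut store = Store::new();
    store.insert(key("research-pool", "unclaimed", "write_summary"), "Draft summary report".into());
    store.insert(key("research-pool", "unclaimed", "final_review"), "Peer review".into());

    println!("unclaimed: {:?}", list_unclaimed(&store, "research-pool"));
    if claim(&mut store, "research-pool", "write_summary", "writer", 1) {
        complete(&mut store, "research-pool", "write_summary", "summary drafted");
    }
    println!("progress: {:.0}%", progress(&store, "research-pool", 2) * 100.0);
}
```

In the real mode the agents issue these operations through tool calls, so the interleaving is decided by the LLM responses rather than by a single thread as it is here.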

See examples/anthropic_teams.rs for a full working example with 4 agents and 8 tasks using mixed LLM providers (OpenAI + Claude). Also see examples/breakout_game_agent_teams.rs for a complete Atari Breakout game built with decentralized coordination.

Ralph: Autonomous PRD-Driven Loop

Ralph (named after Ralph Wiggum) is an autonomous iterative orchestration mode where agents work through a structured PRD (Product Requirements Document) task list. Each iteration presents agents with the current task checklist. Agents signal completion by including [TASK_COMPLETE:task_id] markers in their responses. The loop ends when all tasks are done or max_iterations is reached.

Key features:

PRD-driven: Structured RalphTask items with id, title, and description
Completion detection: Agents include [TASK_COMPLETE:task_id] markers
Progress tracking: convergence_score reports task completion fraction (0.0 to 1.0)
History trimming: Conversation history is automatically trimmed to fit within max_tokens, keeping the most recent messages
Live progress: Event handler shows real-time iteration progress, LLM round-trips, tool calls, and task completions (see Event System)

use std::sync::Arc;

use cloudllm::orchestration::{Orchestration, OrchestrationMode, RalphTask};
use cloudllm::clients::claude::{ClaudeClient, Model};
use cloudllm::Agent;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let key = std::env::var("ANTHROPIC_KEY")?;
    let make_client = || Arc::new(ClaudeClient::new_with_model_enum(&key, Model::ClaudeSonnet46));

    let frontend = Agent::new("frontend", "Frontend Dev", make_client())
        .with_expertise("HTML, CSS, Canvas");
    let backend = Agent::new("backend", "Backend Dev", make_client())
        .with_expertise("JavaScript, game logic");

    let tasks = vec![
        RalphTask::new("html",  "HTML Structure", "Create the HTML boilerplate and canvas"),
        RalphTask::new("loop",  "Game Loop",      "Implement requestAnimationFrame game loop"),
        RalphTask::new("input", "Controls",       "Add keyboard input for the paddle"),
    ];

    let mut orch = Orchestration::new("game-builder", "Game Builder")
        .with_mode(OrchestrationMode::Ralph {
            tasks,
            max_iterations: 5,
        })
        .with_system_context("Build a game. Output full HTML. Mark done with [TASK_COMPLETE:id].")
        .with_max_tokens(180_000);

    orch.add_agent(frontend)?;
    orch.add_agent(backend)?;

    let result = orch.run("Build a Pong game in a single index.html", 1).await?;

    println!("Iterations: {}",  result.round);
    println!("Complete: {}",    result.is_complete);
    println!("Progress: {:.0}%", result.convergence_score.unwrap_or(0.0) * 100.0);
    println!("Tokens: {}",     result.total_tokens_used);

    Ok(())
}
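To make the completion markers concrete, here is a small std-only sketch of how `[TASK_COMPLETE:task_id]` markers could be pulled out of a response and turned into a completion fraction. It is not CloudLLM's parser: `completed_task_ids` and `completion_fraction` are hypothetical helpers, and trimming whitespace inside the marker is an assumption.

```rust
use std::collections::BTreeSet;

/// Collect every id that appears as `[TASK_COMPLETE:<id>]` in a response.
fn completed_task_ids(response: &str) -> BTreeSet<String> {
    const MARKER: &str = "[TASK_COMPLETE:";
    let mut ids = BTreeSet::new();
    let mut rest = response;
    while let Some(start) = rest.find(MARKER) {
        let after = &rest[start + MARKER.len()..];
        match after.find(']') {
            Some(end) => {
                let id = after[..end].trim();
                if !id.is_empty() {
                    ids.insert(id.to_string());
                }
                rest = &after[end + 1..];
            }
            // Unterminated marker: nothing more to parse.
            None => break,
        }
    }
    ids
}

/// Fraction of known tasks marked complete, in the range 0.0 to 1.0.
fn completion_fraction(all_tasks: &[&str], done: &BTreeSet<String>) -> f32 {
    if all_tasks.is_empty() {
        return 1.0;
    }
    let finished = all_tasks.iter().filter(|id| done.contains(**id)).count();
    finished as f32 / all_tasks.len() as f32
}

fn main() {
    let response = "Canvas is in place. [TASK_COMPLETE:html] Loop runs. [TASK_COMPLETE:loop]";
    let done = completed_task_ids(response);
    let fraction = completion_fraction(&["html", "loop", "input"], &done);
    println!("completed: {:?}, progress: {:.0}%", done, fraction * 100.0);
}
```

Ids that do not belong to the task list are ignored by `completion_fraction`, so a stray or misspelled marker cannot push the score above 1.0.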

Starter HTML + Read-Modify-Write Pattern: Both breakout examples (examples/breakout_game_ralph.rs and examples/breakout_game_agent_teams.rs) seed a working starter HTML skeleton to disk and Memory (current_game_html key) before orchestration begins. Agents follow a read-modify-write loop: READ the current HTML from Memory, MODIFY it to implement their assigned feature, then WRITE the updated HTML back via a custom write_game_file tool (which persists to both disk and Memory). This ensures every agent builds on the latest cumulative output and there is always a playable game on disk.

See examples/breakout_game_ralph.rs for a full working example that orchestrates 4 agents through 18 PRD tasks to produce a complete Atari Breakout game with multi-hit bricks, powerups, chiptune music, particle effects, and mobile controls. Also see examples/breakout_game_agent_teams.rs for the same game built with decentralized AnthropicAgentTeams coordination.

For a deep dive into all collaboration modes, read ORCHESTRATION_TUTORIAL.md.

Provider Wrappers

CloudLLM ships wrappers for popular OpenAI-compatible services:

Provider	Module	Notable constructors
OpenAI	`cloudllm::clients::openai`	`OpenAIClient::new_with_model_enum`, `OpenAIClient::new_with_base_url`
Anthropic Claude	`cloudllm::clients::claude`	`ClaudeClient::new_with_model_enum`
Google Gemini	`cloudllm::clients::gemini`	`GeminiClient::new_with_model_enum`
xAI Grok	`cloudllm::clients::grok`	`GrokClient::new_with_model_enum`

Providers share the ClientWrapper contract, so you can swap them without changing downstream code. As of v0.11.1, all four providers (OpenAI, Claude, Grok, Gemini) support native tool calling via the tools: Option<Vec<ToolDefinition>> parameter on send_message.

use cloudllm::ClientWrapper;
use cloudllm::clients::claude::{ClaudeClient, Model};
use cloudllm::client_wrapper::{Message, Role};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let key = std::env::var("ANTHROPIC_KEY")?;
    let claude = ClaudeClient::new_with_model_enum(&key, Model::ClaudeSonnet4);

    let response = claude
        .send_message(
            &[Message { role: Role::User, content: "Summarise rice fermentation.".into(), tool_calls: vec![] }],
            None,
            None,
        )
        .await?;

    println!("{}", response.content);
    Ok(())
}

Every wrapper exposes token accounting via ClientWrapper::get_last_usage.

LLMSession: Stateful Conversations (The Foundation)

LLMSession is the core building block—it maintains conversation history with automatic context trimming and token accounting. Use it for simple stateful conversations with any LLM provider:

use std::sync::Arc;
use cloudllm::{LLMSession, Role};
use cloudllm::clients::openai::{OpenAIClient, Model};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Arc::new(OpenAIClient::new_with_model_enum(
        &std::env::var("OPEN_AI_SECRET")?,
        Model::GPT41Mini
    ));

    let mut session = LLMSession::new(client, "You are helpful.".into(), 8_192);

    let reply = session
        .send_message(Role::User, "Tell me about Rust.".into(), None)
        .await?;

    println!("Assistant: {}", reply.content);
    println!("Tokens used: {:?}", session.token_usage());
    Ok(())
}
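The idea behind context trimming can be sketched without any provider at all. The following is a simplified std-only illustration, not LLMSession's implementation: the `History` type is hypothetical, and the four-characters-per-token estimate is an assumption standing in for real token accounting.

```rust
use std::collections::VecDeque;

/// Rough token estimate: about four characters per token.
/// This is an assumption for the sketch, not a real tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Rolling history that keeps the most recent messages within a budget.
struct History {
    max_tokens: usize,
    messages: VecDeque<String>,
    total_tokens: usize,
}

impl History {
    fn new(max_tokens: usize) -> Self {
        Self { max_tokens, messages: VecDeque::new(), total_tokens: 0 }
    }

    /// Append a message, then drop the oldest ones until the budget fits.
    /// The newest message is always kept, even if it alone exceeds the budget.
    fn push(&mut self, message: &str) {
        self.total_tokens += estimate_tokens(message);
        self.messages.push_back(message.to_string());
        while self.total_tokens > self.max_tokens && self.messages.len() > 1 {
            if let Some(oldest) = self.messages.pop_front() {
                self.total_tokens -= estimate_tokens(&oldest);
            }
        }
    }
}

fn main() {
    let mut history = History::new(12);
    for msg in ["What is ownership?", "Ownership ties each value to one owner.", "And borrowing?"] {
        history.push(msg);
        println!("{} messages, ~{} tokens", history.messages.len(), history.total_tokens);
    }
}
```

Keeping a running total avoids re-counting the whole history on every message, which matters once a conversation holds hundreds of turns.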

Agents: Building Intelligent Workers with Tools

Agents extend LLMSession by adding identity, expertise, and optional tools. They're the primary way to build sophisticated LLM interactions where you need the agent to take actions beyond conversation.

The example below creates a single agent with four tools attached: the built-in Calculator, image generation via OpenAI, and a custom Fibonacci tool, all registered on one CustomToolProtocol, plus a shared Memory store exposed through its own MemoryProtocol:

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::clients::openai::{OpenAIClient, Model};
use cloudllm::tool_protocol::{ToolMetadata, ToolParameter, ToolParameterType, ToolResult, ToolRegistry};
use cloudllm::tool_protocols::{CustomToolProtocol, MemoryProtocol};
use cloudllm::tools::{Calculator, Memory};
use cloudllm::cloudllm::image_generation::register_image_generation_tool;
use cloudllm::cloudllm::{ImageGenerationProvider, new_image_generation_client};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("OPEN_AI_SECRET")?;

    let client = Arc::new(OpenAIClient::new_with_model_enum(&api_key, Model::GPT41Mini));

    // -- Tool protocol (all tools register here) ----------------------------
    let protocol = Arc::new(CustomToolProtocol::new());

    // 1. Calculator — wraps the built-in evaluator
    let calc = Calculator::new();
    protocol.register_async_tool(
        ToolMetadata::new("calculator", "Evaluate a math expression")
            .with_parameter(
                ToolParameter::new("expr", ToolParameterType::String)
                    .with_description("Math expression, e.g. sqrt(2) + mean([1,2,3])")
                    .required(),
            ),
        Arc::new(move |params| {
            let calc = calc.clone();
            Box::pin(async move {
                let expr = params["expr"].as_str().unwrap_or("0");
                match calc.evaluate(expr).await {
                    Ok(val) => Ok(ToolResult::success(serde_json::json!({ "result": val }))),
                    Err(e)  => Ok(ToolResult::failure(e.to_string())),
                }
            })
        }),
    ).await;

    // 2. Image generation — one-liner helper registers the tool
    let image_client = new_image_generation_client(ImageGenerationProvider::OpenAI, &api_key)?;
    register_image_generation_tool(&protocol, image_client).await?;

    // 3. Custom tool — Fibonacci sequence (sync closure, no boilerplate)
    protocol.register_tool(
        ToolMetadata::new("fibonacci", "Return the Nth Fibonacci number")
            .with_parameter(
                ToolParameter::new("n", ToolParameterType::Number)
                    .with_description("Index (0-based)")
                    .required(),
            ),
        Arc::new(|params| {
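            let n = params["n"].as_u64().unwrap_or(0);
            // The loop below also computes fib(n + 1), and fib(94) overflows
            // u64, so reject anything above 92 instead of panicking.
            if n > 92 {
                return Ok(ToolResult::failure("n must be at most 92".to_string()));
            }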
            let mut a: u64 = 0;
            let mut b: u64 = 1;
            for _ in 0..n {
                let tmp = a + b;
                a = b;
                b = tmp;
            }
            Ok(ToolResult::success(serde_json::json!({ "fib": a })))
        }),
    ).await;

    // -- Build the registry and the agent -----------------------------------
    // Memory lives in its own protocol so multiple agents can share it
    let memory = Arc::new(Memory::new());
    let mut registry = ToolRegistry::empty();
    registry.add_protocol("tools",  protocol).await?;
    registry.add_protocol("memory", Arc::new(MemoryProtocol::new(memory))).await?;

    let agent = Agent::new("assistant", "Research Assistant", client)
        .with_expertise("Math, memory, image generation, and Fibonacci numbers")
        .with_personality("Thorough and methodical")
        .with_tools(registry);

    println!("Agent '{}' ready with {} tools", agent.name, 4);
    Ok(())
}

Key patterns shown above:

Pattern	Used For
`register_image_generation_tool()`	One-line built-in tool registration
`protocol.register_tool(metadata, closure)`	Sync custom tool (Fibonacci)
`protocol.register_async_tool(metadata, closure)`	Async tool wrapping a built-in (Calculator)
`MemoryProtocol::new(memory)`	Protocol wrapper for built-in Memory
`ToolRegistry::empty()` + `add_protocol()`	Multi-protocol registry
`agent.with_tools(registry)`	Attach tools to an agent

MentisDB: Persistent Agent Memory

MentisDB is a sibling project — a standalone durable-memory crate for AI agents. CloudLLM re-exports its public API so agents can use persistent, hash-chained memory without any extra setup:

use cloudllm::MentisDb;
use cloudllm::{ThoughtInput, ThoughtRole, ThoughtType};
use std::path::PathBuf;

fn main() -> std::io::Result<()> {
    let mut chain = MentisDb::open_with_key(PathBuf::from("chains"), "borganism-brain")?;

    chain.append_thought(
        "astro",
        ThoughtInput::new(ThoughtType::Decision, "Switched MentisDB to its own repo.")
            .with_agent_name("Astro")
            .with_agent_owner("@gubatron")
            .with_importance(0.95),
    )?;

    assert!(chain.verify_integrity());
    println!("{}", chain.to_memory_markdown(None));
    Ok(())
}

Agents can attach a chain via Agent::with_mentisdb or resume from one with Agent::resume_from_latest.

For the full API reference, daemon (mentisdbd) usage, MCP/REST contract, and versioned skill registry, see the MentisDB repository.
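For readers new to hash-chained storage, the sketch below shows the general technique in std-only Rust. It is not MentisDB's format or API: `Entry`, `Chain`, and `digest` are hypothetical names, and `DefaultHasher` is swapped in for SHA-256 purely to avoid an external dependency. It is not cryptographically secure.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One record in the chain. `hash` covers the previous hash and the content.
struct Entry {
    content: String,
    prev_hash: u64,
    hash: u64,
}

/// Stand-in digest. A real chain would use SHA-256; DefaultHasher is NOT
/// cryptographically secure and only keeps this sketch dependency-free.
fn digest(prev_hash: u64, content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    prev_hash.hash(&mut hasher);
    content.hash(&mut hasher);
    hasher.finish()
}

#[derive(Default)]
struct Chain {
    entries: Vec<Entry>,
}

impl Chain {
    /// Append a record linked to the hash of the previous one (0 for the first).
    fn append(&mut self, content: &str) {
        let prev_hash = self.entries.last().map_or(0, |e| e.hash);
        let hash = digest(prev_hash, content);
        self.entries.push(Entry { content: content.to_string(), prev_hash, hash });
    }

    /// Recompute every link; any edited record breaks the chain from there on.
    fn verify_integrity(&self) -> bool {
        let mut expected_prev = 0;
        for entry in &self.entries {
            if entry.prev_hash != expected_prev
                || entry.hash != digest(entry.prev_hash, &entry.content)
            {
                return false;
            }
            expected_prev = entry.hash;
        }
        true
    }
}

fn main() {
    let mut chain = Chain::default();
    chain.append("Decided to split memory into its own crate.");
    chain.append("Migrated existing thoughts.");
    println!("intact: {}", chain.verify_integrity());

    // Simulate tampering with a stored record.
    chain.entries[0].content.push_str(" (edited)");
    println!("after tampering: {}", chain.verify_integrity());
}
```

Because each hash covers the previous hash, an attacker who edits one record would have to recompute every later record as well, which is what makes the log tamper-evident rather than tamper-proof.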

Context Strategies: Managing Context Window Exhaustion

The ContextStrategy trait lets you plug in different policies for what happens when an agent's conversation history approaches its token budget.

Strategy	Trigger	Action
TrimStrategy (default)	Token ratio > 0.85	No-op — LLMSession's built-in trimming handles it
SelfCompressionStrategy	Token ratio > 0.80	LLM writes a structured save file; persisted to MentisDB
NoveltyAwareStrategy	High pressure always; moderate pressure + low novelty	Delegates to inner strategy (typically SelfCompression)

use cloudllm::Agent;
use cloudllm::context_strategy::{NoveltyAwareStrategy, SelfCompressionStrategy};
use cloudllm::clients::openai::OpenAIClient;
use std::sync::Arc;

let agent = Agent::new(
    "analyst", "Analyst",
    Arc::new(OpenAIClient::new_with_model_string("key", "gpt-4o")),
)
.context_collapse_strategy(Box::new(
    NoveltyAwareStrategy::new(Box::new(SelfCompressionStrategy::default()))
        .with_thresholds(0.85, 0.65, 0.25),
));

The strategy can also be swapped at runtime via agent.set_context_collapse_strategy(...).
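The NoveltyAware trigger in the table can be read as a two-stage test. The sketch below is one possible interpretation in std-only Rust, not the crate's logic: the `Thresholds` struct and `should_compact` function are hypothetical, and mapping the three numbers passed to `with_thresholds(0.85, 0.65, 0.25)` onto high pressure, moderate pressure, and low novelty (in that order) is an assumption.

```rust
/// Hypothetical thresholds. The field meanings and their order are assumed,
/// not taken from the NoveltyAwareStrategy source.
struct Thresholds {
    high_pressure: f32,
    moderate_pressure: f32,
    low_novelty: f32,
}

/// Decide whether to hand off to the inner strategy.
/// - High pressure: always compact.
/// - Moderate pressure: compact only when recent turns add little novelty.
fn should_compact(token_ratio: f32, novelty: f32, t: &Thresholds) -> bool {
    if token_ratio > t.high_pressure {
        return true;
    }
    token_ratio > t.moderate_pressure && novelty < t.low_novelty
}

fn main() {
    let t = Thresholds { high_pressure: 0.85, moderate_pressure: 0.65, low_novelty: 0.25 };
    for (ratio, novelty) in [(0.90, 0.80), (0.70, 0.10), (0.70, 0.60), (0.40, 0.05)] {
        println!(
            "ratio={ratio:.2} novelty={novelty:.2} -> compact={}",
            should_compact(ratio, novelty, &t)
        );
    }
}
```

The point of the second stage is to avoid compressing a conversation that is still producing new information, even when the window is filling up.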

Agent::fork() — Lightweight Copies for Parallel Execution

Agent is intentionally not Clone (its LLMSession contains a bumpalo arena). Instead, use fork() to create a lightweight copy that shares the same tool registry and thought chain (via Arc) but has a fresh, empty session:

use cloudllm::Agent;
use cloudllm::clients::openai::OpenAIClient;
use std::sync::Arc;

let agent = Agent::new(
    "analyst", "Analyst",
    Arc::new(OpenAIClient::new_with_model_string("key", "gpt-4o")),
).with_expertise("Cloud Architecture");

// Fork for parallel execution
let forked = agent.fork();
assert_eq!(forked.id, agent.id);
assert_eq!(forked.expertise, agent.expertise);
// forked has an empty session but shares tools and thought chain

Orchestration modes (Parallel, Hierarchical) use fork() internally when they need temporary per-task agents.
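The sharing model behind fork() can be shown with plain `Arc` fields. This is a std-only sketch under stated assumptions, not the Agent type itself: `SketchAgent` and its fields are hypothetical, with a `Vec<String>` standing in for both the tool registry and the session.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an agent: shared tools, private session.
struct SketchAgent {
    id: String,
    expertise: Option<String>,
    tools: Arc<Vec<String>>, // shared, reference-counted
    session: Vec<String>,    // private conversation history
}

impl SketchAgent {
    /// Copy identity, share the tools by bumping the Arc count,
    /// and start with an empty session.
    fn fork(&self) -> SketchAgent {
        SketchAgent {
            id: self.id.clone(),
            expertise: self.expertise.clone(),
            tools: Arc::clone(&self.tools),
            session: Vec::new(),
        }
    }
}

fn main() {
    let mut agent = SketchAgent {
        id: "analyst".into(),
        expertise: Some("Cloud Architecture".into()),
        tools: Arc::new(vec!["calculator".into()]),
        session: Vec::new(),
    };
    agent.session.push("user: size the cluster".into());

    let forked = agent.fork();
    println!(
        "same id: {}, same expertise: {}, shared tools: {}, forked session len: {}",
        forked.id == agent.id,
        forked.expertise == agent.expertise,
        Arc::ptr_eq(&forked.tools, &agent.tools),
        forked.session.len()
    );
}
```

Cloning an `Arc` copies a pointer and increments a counter, so a fork costs roughly the same regardless of how many tools are registered.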

Runtime Tool Hot-Swapping

The tool registry is wrapped in Arc<RwLock<ToolRegistry>>, allowing protocols to be added or removed while an agent is running:

use cloudllm::Agent;
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::clients::openai::OpenAIClient;
use std::sync::Arc;

# async {
let agent = Agent::new(
    "a1", "Agent",
    Arc::new(OpenAIClient::new_with_model_string("key", "gpt-4o")),
);

// Add a protocol at runtime
agent.add_protocol("custom", Arc::new(CustomToolProtocol::new())).await.unwrap();

// List available tools
let tools = agent.list_tools().await;
println!("Tools: {:?}", tools);

// Remove it later
agent.remove_protocol("custom").await;
# };

For sharing a mutable registry across agents, use with_shared_tools():

use cloudllm::Agent;
use cloudllm::tool_protocol::ToolRegistry;
use cloudllm::clients::openai::OpenAIClient;
use std::sync::Arc;
use tokio::sync::RwLock;

let shared = Arc::new(RwLock::new(ToolRegistry::empty()));
let client = Arc::new(OpenAIClient::new_with_model_string("key", "gpt-4o"));

let agent_a = Agent::new("a", "Agent A", client.clone())
    .with_shared_tools(shared.clone());
let agent_b = Agent::new("b", "Agent B", client)
    .with_shared_tools(shared.clone());
// Adding a protocol via agent_a is visible to agent_b
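Why a change made through one agent is visible to the other follows from both holding the same `Arc<RwLock<...>>`. The sketch below illustrates that with std types only. It is not the crate's registry: `Registry` and `SketchAgent` are hypothetical, and `std::sync::RwLock` replaces the tokio lock used above so the example needs no runtime.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical registry: protocol name -> tool names.
#[derive(Default)]
struct Registry {
    protocols: BTreeMap<String, Vec<String>>,
}

/// Hypothetical agent holding a handle to a shared registry.
struct SketchAgent {
    registry: Arc<RwLock<Registry>>,
}

impl SketchAgent {
    /// Takes the write lock briefly to register a protocol.
    fn add_protocol(&self, name: &str, tools: &[&str]) {
        let mut reg = self.registry.write().expect("registry lock poisoned");
        reg.protocols
            .insert(name.to_string(), tools.iter().map(|t| t.to_string()).collect());
    }

    /// Returns true when a protocol with that name was registered.
    fn remove_protocol(&self, name: &str) -> bool {
        let mut reg = self.registry.write().expect("registry lock poisoned");
        reg.protocols.remove(name).is_some()
    }

    /// Takes the read lock, so many agents can list tools concurrently.
    fn list_tools(&self) -> Vec<String> {
        let reg = self.registry.read().expect("registry lock poisoned");
        reg.protocols.values().flatten().cloned().collect()
    }
}

fn main() {
    let shared = Arc::new(RwLock::new(Registry::default()));
    let agent_a = SketchAgent { registry: Arc::clone(&shared) };
    let agent_b = SketchAgent { registry: Arc::clone(&shared) };

    agent_a.add_protocol("custom", &["fibonacci"]);
    println!("agent_b sees: {:?}", agent_b.list_tools());

    agent_b.remove_protocol("custom");
    println!("agent_a sees: {:?}", agent_a.list_tools());
}
```

A read-write lock suits this workload because tool lookups are frequent and protocol changes are rare.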

Event System: Real-Time Agent & Orchestration Observability

The event module provides a callback-based observability layer for agents and orchestrations. Implement the EventHandler trait to receive real-time notifications about LLM round-trips, tool calls, task completions, and more.

This replaces guessing what's happening during long-running orchestrations — you'll see exactly when each agent starts thinking, which tools it calls, and when the LLM responds.

EventHandler Trait

use cloudllm::event::{AgentEvent, EventHandler, OrchestrationEvent};
use async_trait::async_trait;

struct MyHandler;

#[async_trait]
impl EventHandler for MyHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        // Handle agent-level events (LLM calls, tool usage, etc.)
        println!("Agent: {:?}", event);
    }
    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        // Handle orchestration-level events (rounds, task completion, etc.)
        println!("Orchestration: {:?}", event);
    }
}

Both methods have default no-op implementations, so you only need to override the events you care about. For example, to only observe orchestration-level progress:

# use cloudllm::event::{EventHandler, OrchestrationEvent};
# use async_trait::async_trait;
struct ProgressLogger;

#[async_trait]
impl EventHandler for ProgressLogger {
    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RunCompleted { rounds, total_tokens, is_complete, .. } => {
                println!("Done! {} rounds, {} tokens, complete={}", rounds, total_tokens, is_complete);
            }
            _ => {}
        }
    }
}

AgentEvent Variants

Events emitted by an Agent during its lifecycle. Every variant carries agent_id and agent_name for identification.

Variant	Fields	When Emitted
`SendStarted`	`message_preview`	At the start of `send()` or `generate_with_tokens()`
`SendCompleted`	`tokens_used`, `tool_calls_made`, `response_length`	When `send()` or `generate_with_tokens()` finishes successfully
`LLMCallStarted`	`iteration`	Before each LLM round-trip (first call + each tool-loop follow-up)
`LLMCallCompleted`	`iteration`, `tokens_used`, `response_length`	After each LLM round-trip completes
`ToolCallDetected`	`tool_name`, `parameters`, `iteration`	When a tool call is parsed from the LLM response
`ToolExecutionCompleted`	`tool_name`, `parameters`, `success`, `error`, `result`, `iteration`	After a tool finishes executing
`ToolMaxIterationsReached`	(none extra)	When the tool loop hits its iteration cap
`ThoughtCommitted`	`thought_type`	After a thought is appended to the MentisDB
`ProtocolAdded`	`protocol_name`	When a new tool protocol is added to the agent
`ProtocolRemoved`	`protocol_name`	When a tool protocol is removed
`SystemPromptSet`	(none extra)	When the agent's system prompt is set or replaced
`MessageReceived`	(none extra)	When a message is injected into the agent's session
`Forked`	(none extra)	When `fork()` creates a lightweight copy (fresh session)
`ForkedWithContext`	(none extra)	When `fork_with_context()` copies the agent with history

The LLMCallStarted/LLMCallCompleted pair is especially useful for understanding latency — during orchestration you'll see exactly when each agent is waiting on the LLM and when the response arrives.

OrchestrationEvent Variants

Events emitted by an Orchestration during a run(). Each variant carries orchestration_id for identification.

Variant	Fields	When Emitted
`RunStarted`	`orchestration_name`, `mode`, `agent_count`	At the start of `run()`
`RunCompleted`	`orchestration_name`, `rounds`, `total_tokens`, `is_complete`	When `run()` finishes
`RoundStarted`	`round`	At the start of each round/iteration
`RoundCompleted`	`round`	At the end of each round/iteration
`AgentSelected`	`agent_id`, `agent_name`, `reason`	When an agent is chosen to respond (Moderated, Hierarchical modes)
`AgentResponded`	`agent_id`, `agent_name`, `tokens_used`, `response_length`	After an agent responds successfully
`AgentFailed`	`agent_id`, `agent_name`, `error`	When an agent encounters an error
`ConvergenceChecked`	`round`, `score`, `threshold`, `converged`	After similarity check in Debate mode
`RalphIterationStarted`	`iteration`, `max_iterations`, `tasks_completed`, `tasks_total`	At the start of each Ralph iteration
`RalphTaskCompleted`	`agent_id`, `agent_name`, `task_ids`, `tasks_completed_total`, `tasks_total`	When a Ralph task is completed by an agent

Registering an Event Handler

Wrap your handler in Arc and register it via the builder pattern:

On an Agent:

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::event::EventHandler;
use cloudllm::clients::openai::OpenAIClient;

# fn example(handler: Arc<dyn EventHandler>) {
let agent = Agent::new("a1", "Agent", Arc::new(
    OpenAIClient::new_with_model_string("key", "gpt-4o"),
))
.with_event_handler(handler);  // builder pattern
# }

You can also set or replace the handler at runtime:

# use std::sync::Arc;
# use cloudllm::Agent;
# use cloudllm::event::EventHandler;
# use cloudllm::clients::openai::OpenAIClient;
# fn example(handler: Arc<dyn EventHandler>) {
# let mut agent = Agent::new("a1", "Agent", Arc::new(
#     OpenAIClient::new_with_model_string("key", "gpt-4o"),
# ));
agent.set_event_handler(handler);  // runtime mutation
# }

On an Orchestration:

use std::sync::Arc;
use cloudllm::orchestration::{Orchestration, OrchestrationMode};
use cloudllm::event::EventHandler;

# fn example(handler: Arc<dyn EventHandler>) {
let orchestration = Orchestration::new("id", "Name")
    .with_mode(OrchestrationMode::RoundRobin)
    .with_event_handler(handler);  // auto-propagates to agents added later
# }

When you register an event handler on an Orchestration, it is automatically propagated to every agent added via add_agent(). This means agents emit their own AgentEvents through the same handler, giving you a unified stream of both agent-level and orchestration-level events.
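
The propagation described above can be pictured with a toy model. This is only a sketch under stated assumptions: `Handler`, `Recorder`, `ToyAgent`, and `ToyOrchestration` are hypothetical, synchronous stand-ins (the real EventHandler trait is async), and the event names are reduced to plain strings.

```rust
use std::sync::{Arc, Mutex};

/// Simplified synchronous stand-in for EventHandler.
pub trait Handler: Send + Sync {
    fn on_event(&self, source: &str, event: &str);
}

/// Collects every event it sees into one ordered log.
#[derive(Default)]
pub struct Recorder {
    pub log: Mutex<Vec<String>>,
}

impl Handler for Recorder {
    fn on_event(&self, source: &str, event: &str) {
        self.log.lock().unwrap().push(format!("{source}: {event}"));
    }
}

pub struct ToyAgent {
    name: String,
    handler: Option<Arc<dyn Handler>>,
}

impl ToyAgent {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), handler: None }
    }

    /// Emits an agent-level event if a handler has been attached.
    pub fn send(&self) {
        if let Some(handler) = &self.handler {
            handler.on_event(&self.name, "SendStarted");
        }
    }
}

pub struct ToyOrchestration {
    handler: Arc<dyn Handler>,
    agents: Vec<ToyAgent>,
}

impl ToyOrchestration {
    pub fn new(handler: Arc<dyn Handler>) -> Self {
        Self { handler, agents: Vec::new() }
    }

    /// Hands the orchestration's handler to the agent as it joins.
    pub fn add_agent(&mut self, mut agent: ToyAgent) {
        agent.handler = Some(self.handler.clone());
        self.agents.push(agent);
    }

    /// Emits an orchestration-level event, then lets each agent emit its own.
    pub fn run(&self) {
        self.handler.on_event("orchestration", "RunStarted");
        for agent in &self.agents {
            agent.send();
        }
    }
}
```

Because both layers hold a clone of the same Arc, a single handler instance sees the orchestration-level and agent-level events in the order they occur.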

Full Example: Real-Time Progress Display

This example (adapted from examples/breakout_game_ralph.rs) shows a handler that tracks elapsed time and pretty-prints events as they happen:

use async_trait::async_trait;
use cloudllm::event::{AgentEvent, EventHandler, OrchestrationEvent};
use std::time::Instant;
use std::sync::Arc;

struct ProgressHandler {
    start: Instant,
}

impl ProgressHandler {
    fn new() -> Self { Self { start: Instant::now() } }

    fn elapsed(&self) -> String {
        let secs = self.start.elapsed().as_secs();
        format!("{:02}:{:02}", secs / 60, secs % 60)
    }
}

#[async_trait]
impl EventHandler for ProgressHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        match event {
            AgentEvent::SendStarted { agent_name, message_preview, .. } => {
                // Truncate by characters; byte slicing could split a multi-byte character and panic.
                let preview: String = message_preview.chars().take(80).collect();
                println!("  [{}] >> {} thinking... ({}...)", self.elapsed(), agent_name, preview);
            }
            AgentEvent::SendCompleted { agent_name, tokens_used, response_length, tool_calls_made, .. } => {
                let tokens = tokens_used.as_ref().map(|u| u.total_tokens).unwrap_or(0);
                println!("  [{}] << {} responded ({} chars, {} tokens, {} tool calls)",
                    self.elapsed(), agent_name, response_length, tokens, tool_calls_made);
            }
            AgentEvent::LLMCallStarted { agent_name, iteration, .. } => {
                println!("  [{}]    {} sending to LLM (round {})...", self.elapsed(), agent_name, iteration);
            }
            AgentEvent::LLMCallCompleted { agent_name, iteration, tokens_used, response_length, .. } => {
                let tokens = tokens_used.as_ref()
                    .map(|u| format!("{} tokens", u.total_tokens))
                    .unwrap_or_else(|| "no token info".to_string());
                println!("  [{}]    {} LLM round {} complete ({} chars, {})",
                    self.elapsed(), agent_name, iteration, response_length, tokens);
            }
            AgentEvent::ToolCallDetected { agent_name, tool_name, parameters, iteration, .. } => {
                println!("  [{}]    {} calling tool '{}' (iter {}), params={}",
                    self.elapsed(), agent_name, tool_name, iteration,
                    serde_json::to_string(parameters).unwrap_or_default());
            }
            AgentEvent::ToolExecutionCompleted { agent_name, tool_name, success, error, result, .. } => {
                if *success {
                    let result_preview = result.as_ref().map(|r| {
                        let s = serde_json::to_string(r).unwrap_or_default();
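                        // Truncate by characters so the cut never lands inside a multi-byte character.
                        if s.chars().count() > 200 { format!("{}...", s.chars().take(200).collect::<String>()) } else { s }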
                    }).unwrap_or_default();
                    println!("  [{}]    {} tool '{}' succeeded → {}", self.elapsed(), agent_name, tool_name, result_preview);
                } else {
                    println!("  [{}]    {} tool '{}' FAILED: {}",
                        self.elapsed(), agent_name, tool_name, error.as_deref().unwrap_or("unknown"));
                }
            }
            _ => {}
        }
    }

    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RunStarted { orchestration_name, mode, agent_count, .. } => {
                println!("\n{}\n  {} — mode={}, agents={}\n{}",
                    "=".repeat(70), orchestration_name, mode, agent_count, "=".repeat(70));
            }
            OrchestrationEvent::RalphIterationStarted { iteration, max_iterations, tasks_completed, tasks_total, .. } => {
                println!("\n  RALPH Iteration {}/{} — {}/{} tasks complete",
                    iteration, max_iterations, tasks_completed, tasks_total);
            }
            OrchestrationEvent::RalphTaskCompleted { agent_name, task_ids, tasks_completed_total, tasks_total, .. } => {
                println!("  [{}] *** {} completed tasks: [{}] — progress: {}/{}",
                    self.elapsed(), agent_name, task_ids.join(", "), tasks_completed_total, tasks_total);
            }
            OrchestrationEvent::AgentFailed { agent_name, error, .. } => {
                println!("  [{}] !!! {} FAILED: {}", self.elapsed(), agent_name, error);
            }
            OrchestrationEvent::RunCompleted { rounds, total_tokens, is_complete, .. } => {
                println!("\n{}\n  Run complete — {} rounds, {} tokens, complete={}\n{}",
                    "=".repeat(70), rounds, total_tokens, is_complete, "=".repeat(70));
            }
            _ => {}
        }
    }
}

// Register on an orchestration (auto-propagates to all agents):
// let handler = Arc::new(ProgressHandler::new());
// let orchestration = Orchestration::new("id", "Name")
//     .with_event_handler(handler);

Sample output during a RALPH run:

======================================================================
  Breakout Game RALPH Orchestration — mode=Ralph, agents=4
======================================================================

  RALPH Iteration 1/5 — 0/10 tasks complete
  [00:00] >> Game Architect thinking... (Build a complete Atari Breakout game...)
  [00:00]    Game Architect sending to LLM (round 1)...
  [00:22]    Game Architect LLM round 1 complete (8923 chars, 3241 tokens)
  [00:22]    Game Architect calling tool 'write_game_file' (iter 1), params={"filename":"breakout_game.html",...}
  [00:22]    Game Architect tool 'write_game_file' succeeded
  [00:22]    Game Architect sending to LLM (round 2)...
  [00:35]    Game Architect LLM round 2 complete (412 chars, 158 tokens)
  [00:35] << Game Architect responded (412 chars, 3399 tokens, 1 tool calls)
  [00:35] *** Game Architect completed tasks: [html_structure, game_loop] — progress: 2/10
  [00:35] >> Game Programmer thinking... (Build a complete Atari Breakout game...)
  ...
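
When adapting the handler, note that message and result previews may contain non-ASCII text, and slicing a &str at an arbitrary byte index panics if the index falls inside a multi-byte character. A small helper along these lines keeps truncation safe; `preview` is a hypothetical name, not part of CloudLLM.

```rust
/// Returns at most `max_chars` characters of `text`, appending "..." when
/// anything was cut off. Counts characters, not bytes, so it never splits a
/// multi-byte UTF-8 sequence.
pub fn preview(text: &str, max_chars: usize) -> String {
    match text.char_indices().nth(max_chars) {
        // `idx` is the byte offset of the first character past the limit,
        // which is always a valid char boundary.
        Some((idx, _)) => format!("{}...", &text[..idx]),
        None => text.to_string(),
    }
}
```

Using `char_indices().nth(..)` avoids allocating an intermediate String when the text is already short enough.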

Tool Registry: Multi-Protocol Tool Access

Agents access tools through the ToolRegistry, which supports multiple simultaneous protocols. Local tools, remote MCP servers, persistent Memory, and custom implementations can all sit behind one registry, and the agent calls them the same way.

Adding Tools to a Registry

use std::sync::Arc;
use cloudllm::tool_protocol::ToolRegistry;
use cloudllm::tool_protocols::{CustomToolProtocol, McpClientProtocol};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create empty registry for multiple protocols
    let mut registry = ToolRegistry::empty();

    // Add local tools (Rust closures)
    let local = Arc::new(CustomToolProtocol::new());
    registry.add_protocol("local", local).await?;

    // Add remote MCP servers
    let github = Arc::new(McpClientProtocol::new("http://localhost:8081".to_string()));
    registry.add_protocol("github", github).await?;

    let calculator = Arc::new(McpClientProtocol::new("http://localhost:8082".to_string()));
    registry.add_protocol("calculator", calculator).await?;

    // Agent using this registry accesses all tools transparently!
    Ok(())
}

Key Benefits:

Local + Remote: Mix tools from different sources in a single agent
Transparent Routing: Registry automatically routes calls to the correct protocol
Dynamic Management: Add/remove protocols at runtime
Backward Compatible: Existing single-protocol code still works
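
To make "transparent routing" concrete, here is a toy model of the idea: each tool name maps to the protocol that exposed it, so a caller only needs the tool name. `RoutingTable` is a hypothetical type for illustration, not the crate's ToolRegistry, and its handling of duplicate tool names is an assumption of the sketch.

```rust
use std::collections::HashMap;

/// Toy model of multi-protocol routing: tool name -> owning protocol name.
#[derive(Default)]
pub struct RoutingTable {
    tool_to_protocol: HashMap<String, String>,
}

impl RoutingTable {
    /// Registers every tool exposed by `protocol`. In this sketch, a later
    /// protocol that reuses a tool name replaces the earlier owner.
    pub fn add_protocol(&mut self, protocol: &str, tools: &[&str]) {
        for tool in tools {
            self.tool_to_protocol
                .insert((*tool).to_string(), protocol.to_string());
        }
    }

    /// Drops all tools owned by `protocol`.
    pub fn remove_protocol(&mut self, protocol: &str) {
        self.tool_to_protocol
            .retain(|_, owner| owner.as_str() != protocol);
    }

    /// Looks up which protocol should execute `tool`.
    pub fn route(&self, tool: &str) -> Option<&str> {
        self.tool_to_protocol.get(tool).map(String::as_str)
    }
}
```

Adding or removing a protocol only touches the map, which is why runtime changes become visible to the next tool call without rebuilding the agent.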

Registry Modes

Multi-Protocol (New agents):

let mut registry = ToolRegistry::empty();
registry.add_protocol("name", protocol).await?;

Single-Protocol (Existing code):

let protocol = Arc::new(CustomToolProtocol::new());
let registry = ToolRegistry::new(protocol);

Native Tool Calling (v0.11.1)

Starting with v0.11.1, agents route tool calls through the provider's native function-calling API rather than relying solely on text parsing. OpenAI, Claude, Grok, and Gemini are all supported.

How It Works

ToolRegistry::to_tool_definitions() converts all registered tools into a Vec<ToolDefinition> (JSON Schema format) that the provider understands.
These definitions are passed to send_message via the tools: Option<Vec<ToolDefinition>> parameter (replacing the old provider-specific grok/openai tool params).
The provider returns structured NativeToolCall objects instead of plain text markers.
A text-parsing fallback remains active for providers or models that do not support native function calling, ensuring backward compatibility.

New Types

Type	Description
`ToolDefinition`	JSON Schema description of a tool (name, description, parameters)
`NativeToolCall`	A structured tool invocation returned by the provider (id, name, arguments)
`Role::Tool { call_id }`	Conversation role for tool result messages, carrying the originating call id

Example

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::clients::openai::{OpenAIClient, Model};
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::tool_protocol::{ToolMetadata, ToolParameter, ToolParameterType, ToolRegistry, ToolResult};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("OPEN_AI_SECRET")?;

    // 1. Register tools with parameter schemas so the provider understands them
    let protocol = Arc::new(CustomToolProtocol::new());
    protocol.register_tool(
        ToolMetadata::new("add", "Add two numbers and return the sum")
            .with_parameter(ToolParameter::new("a", ToolParameterType::Number)
                .with_description("First number").required())
            .with_parameter(ToolParameter::new("b", ToolParameterType::Number)
                .with_description("Second number").required()),
        Arc::new(|params| {
            let a = params["a"].as_f64().unwrap_or(0.0);
            let b = params["b"].as_f64().unwrap_or(0.0);
            Ok(ToolResult::success(serde_json::json!({ "result": a + b })))
        }),
    ).await;

    // 2. Build registry and attach it to the agent
    let mut registry = ToolRegistry::new(protocol);
    registry.discover_tools_from_primary().await?;

    // to_tool_definitions() is a synchronous method — no .await needed
    let defs = registry.to_tool_definitions();
    println!("{} tool(s) will be sent to the provider as JSON Schema", defs.len());

    // 3. Agent.send() calls registry.to_tool_definitions() automatically and passes
    //    the resulting Vec<ToolDefinition> to send_message().  The provider returns
    //    structured NativeToolCall objects; the agent executes them and feeds results
    //    back as Role::Tool { call_id } messages — all without manual wiring.
    let mut agent = Agent::new(
        "calculator",
        "Calculator Agent",
        Arc::new(OpenAIClient::new_with_model_enum(&api_key, Model::GPT41Mini)),
    )
    .with_tools(registry);

    // Ask something the registered `add` tool can actually answer.
    let response = agent.send("What is 123 plus 456?").await?;
    println!("{}", response.content);
    Ok(())
}

What the agent loop does automatically:

Calls registry.to_tool_definitions() to build the JSON Schema tools array.
Passes the definitions to send_message(messages, Some(tool_defs)).
Checks response.tool_calls — if the provider returned a NativeToolCall, executes the matching tool and injects the result as a Role::Tool { call_id } message.
Calls the LLM again with the updated history until the provider returns a final text response (no more tool calls).
Falls back to text-parsing ({"tool_call": {...}} in the response body) for any provider or model that does not support native function calling.
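
The control flow of those steps can be sketched without any provider at all. The model below is an illustration under stated assumptions: `Turn`, `Entry`, and `run_loop` are hypothetical, the provider is a plain closure, the only tool is a hard-coded `add`, the `max_iterations` cap is an assumption of the sketch, and the text-parsing fallback is not modeled.

```rust
/// What a provider turn can produce in this toy model.
pub enum Turn {
    ToolCall { id: String, name: String, a: f64, b: f64 },
    Final(String),
}

/// One entry in the conversation history.
#[derive(Debug, PartialEq)]
pub enum Entry {
    User(String),
    ToolResult { call_id: String, output: String },
}

/// Asks the provider, executes any requested tool, appends the result to the
/// history, and repeats until a final answer arrives or the budget runs out.
pub fn run_loop(
    mut provider: impl FnMut(&[Entry]) -> Turn,
    prompt: &str,
    max_iterations: usize,
) -> Result<(String, Vec<Entry>), String> {
    let mut history = vec![Entry::User(prompt.to_string())];
    for _ in 0..max_iterations {
        match provider(&history) {
            Turn::Final(text) => return Ok((text, history)),
            Turn::ToolCall { id, name, a, b } => {
                // Failures are reported back as tool output so the model can recover.
                let output = match name.as_str() {
                    "add" => (a + b).to_string(),
                    other => format!("error: unknown tool '{other}'"),
                };
                // The call id ties the result back to the request that caused it.
                history.push(Entry::ToolResult { call_id: id, output });
            }
        }
    }
    Err("iteration budget exhausted without a final answer".to_string())
}
```

Carrying the call id on each result is what lets a provider match results to requests when it issues several tool calls.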

Deploying Tool Servers with MCPServerBuilder

Create standalone MCP servers exposing tools over HTTP. Perfect for microservices, integration testing, or sharing tools across your infrastructure:

use std::sync::Arc;
use cloudllm::mcp_server::MCPServerBuilder;
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::tool_protocol::{ToolMetadata, ToolResult};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let protocol = Arc::new(CustomToolProtocol::new());

    // Register tools
    protocol.register_tool(
        ToolMetadata::new("calculator", "Evaluate math expressions"),
        Arc::new(|params| {
            // Placeholder: a real tool would parse and evaluate `expr` here.
            let _expr = params["expr"].as_str().unwrap_or("0");
            Ok(ToolResult::success(serde_json::json!({"result": 42.0})))
        }),
    ).await;

    // Deploy with security options
    MCPServerBuilder::new()
        .with_protocol("tools", protocol)
        .with_port(8080)
        .with_localhost_only()  // Only accept localhost
        .with_bearer_token("your-secret-token")  // Optional auth
        .build_and_serve()
        .await?;

    Ok(())
}

Requires the mcp-server Cargo feature. Other agents connect via McpClientProtocol::new("http://localhost:8080").
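
As a rough picture of what the two security options mean for an incoming request, the sketch below combines a loopback check with a bearer-token check. `is_authorized` is a hypothetical helper written for illustration; it is not the server's actual middleware, and it uses a plain string comparison rather than a constant-time one.

```rust
use std::net::IpAddr;

/// Decides whether a request may proceed under the two checks.
pub fn is_authorized(
    remote: IpAddr,
    authorization_header: Option<&str>,
    expected_token: Option<&str>,
    localhost_only: bool,
) -> bool {
    // IP filter: loopback covers 127.0.0.0/8 and ::1.
    if localhost_only && !remote.is_loopback() {
        return false;
    }
    // Bearer auth: when a token is configured, the header must match exactly.
    match expected_token {
        None => true,
        Some(token) => authorization_header
            .and_then(|value| value.strip_prefix("Bearer "))
            .map_or(false, |presented| presented == token),
    }
}
```

Running the IP filter first means requests from disallowed addresses are rejected before any credential is examined.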

Creating Tools: Simple to Advanced

CloudLLM provides a powerful, protocol-agnostic tool system that works seamlessly with agents and orchestrations. Tools enable agents to take actions beyond conversation—calculate values, query databases, call APIs, or maintain state across sessions.

Simple Tool Creation: Rust Closures

Register Rust functions or closures as tools. Add ToolParameter descriptions to unlock native function-calling on all providers (OpenAI, Claude, Grok, Gemini):

use std::sync::Arc;
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::tool_protocol::{ToolMetadata, ToolParameter, ToolParameterType, ToolResult};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let protocol = Arc::new(CustomToolProtocol::new());

    // Synchronous tool — describe parameters for native function calling
    protocol.register_tool(
        ToolMetadata::new("add", "Add two numbers and return the sum")
            .with_parameter(
                ToolParameter::new("a", ToolParameterType::Number)
                    .with_description("First number").required()
            )
            .with_parameter(
                ToolParameter::new("b", ToolParameterType::Number)
                    .with_description("Second number").required()
            ),
        Arc::new(|params| {
            let a = params["a"].as_f64().unwrap_or(0.0);
            let b = params["b"].as_f64().unwrap_or(0.0);
            Ok(ToolResult::success(serde_json::json!({"result": a + b})))
        }),
    ).await;

    // Asynchronous tool
    protocol.register_async_tool(
        ToolMetadata::new("fetch_url", "Fetch the content of a URL")
            .with_parameter(
                ToolParameter::new("url", ToolParameterType::String)
                    .with_description("The URL to fetch").required()
            ),
        Arc::new(|params| {
            // `async move` lets the future own `params` instead of borrowing a closure local.
            Box::pin(async move {
                let url = params["url"].as_str().unwrap_or("");
                Ok(ToolResult::success(serde_json::json!({"url": url, "status": "ok"})))
            })
        }),
    ).await;

    Ok(())
}

Tip: Always add .with_description() to parameters. The JSON Schema the provider receives is built directly from ToolParameter — richer descriptions improve tool-selection accuracy and reduce hallucinated parameter names.

Advanced Tool Creation: Custom Protocol Implementation

For complex tools or external system integration, implement the ToolProtocol trait:

use async_trait::async_trait;
use cloudllm::tool_protocol::{ToolMetadata, ToolProtocol, ToolResult};
use std::error::Error;

pub struct DatabaseAdapter;

#[async_trait]
impl ToolProtocol for DatabaseAdapter {
    async fn execute(
        &self,
        tool_name: &str,
        parameters: serde_json::Value,
    ) -> Result<ToolResult, Box<dyn Error + Send + Sync>> {
        match tool_name {
            "query" => {
                let sql = parameters["sql"].as_str().unwrap_or("");
                // Execute actual database query
                Ok(ToolResult::success(serde_json::json!({"result": "data"})))
            }
            _ => Ok(ToolResult::failure("Unknown tool".to_string()))
        }
    }

    async fn list_tools(&self) -> Result<Vec<ToolMetadata>, Box<dyn Error + Send + Sync>> {
        Ok(vec![ToolMetadata::new("query", "Execute SQL query")])
    }

    async fn get_tool_metadata(
        &self,
        tool_name: &str,
    ) -> Result<ToolMetadata, Box<dyn Error + Send + Sync>> {
        Ok(ToolMetadata::new(tool_name, "Database query tool"))
    }

    fn protocol_name(&self) -> &str {
        "database"
    }
}

Using Tools with Agents

Agents use tools through a registry. Since v0.11.1, tool schemas are automatically converted to ToolDefinition (JSON Schema) and forwarded to the provider's native function-calling API — no manual wiring required. Add ToolParameter descriptions so the LLM knows how to call each tool correctly:

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::clients::openai::{OpenAIClient, Model};
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::tool_protocol::{ToolMetadata, ToolParameter, ToolParameterType, ToolRegistry, ToolResult};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Describe parameters so the provider builds a valid function call
    let protocol = Arc::new(CustomToolProtocol::new());
    protocol.register_tool(
        ToolMetadata::new("add", "Add two numbers and return the sum")
            .with_parameter(
                ToolParameter::new("a", ToolParameterType::Number)
                    .with_description("First operand").required()
            )
            .with_parameter(
                ToolParameter::new("b", ToolParameterType::Number)
                    .with_description("Second operand").required()
            ),
        Arc::new(|params| {
            let a = params["a"].as_f64().unwrap_or(0.0);
            let b = params["b"].as_f64().unwrap_or(0.0);
            Ok(ToolResult::success(serde_json::json!({"result": a + b})))
        }),
    ).await;

    let mut registry = ToolRegistry::new(protocol);
    registry.discover_tools_from_primary().await?;

    // Attach registry — agent.send() automatically uses native tool calling
    let mut agent = Agent::new(
        "calculator",
        "Calculator Agent",
        Arc::new(OpenAIClient::new_with_model_enum(
            &std::env::var("OPEN_AI_SECRET")?,
            Model::GPT41Mini
        )),
    )
    .with_expertise("Mathematical calculations")
    .with_tools(registry);

    let result = agent.send("What is 17 plus 29?").await?;
    println!("{}", result.content); // e.g. "The sum of 17 and 29 is 46." (exact wording varies by model)
    Ok(())
}

Protocol Implementations

1. CustomToolProtocol (Local Rust Functions)

Register local Rust closures or async functions as tools. Covered above under "Simple Tool Creation".

2. McpClientProtocol (Remote MCP Servers)

Connect to remote MCP servers:

use std::sync::Arc;
use cloudllm::tool_protocols::McpClientProtocol;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to an MCP server
    let protocol = Arc::new(McpClientProtocol::new("http://localhost:8080".to_string()));

    // List available tools from the MCP server
    let tools = protocol.list_tools().await?;
    println!("Available tools: {}", tools.len());

    Ok(())
}

3. MemoryProtocol (Persistent Agent State)

For maintaining state across sessions within a single process:

use std::sync::Arc;
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use cloudllm::tool_protocol::ToolRegistry;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create shared memory for persistence
    let memory = Arc::new(Memory::new());
    let protocol = Arc::new(MemoryProtocol::new(memory));
    let registry = ToolRegistry::new(protocol);

    // Execute memory operations
    let result = registry.execute_tool(
        "memory",
        serde_json::json!({"command": "P task_name ImportantTask 3600"}),
    ).await?;

    println!("Stored: {}", result.output);
    Ok(())
}

Built-in Tools

CloudLLM includes several production-ready tools that agents can use directly:

Calculator Tool

A fast, reliable scientific calculator for mathematical operations and statistical analysis. Perfect for agents that need to perform computations.

Features:

Comprehensive arithmetic operations (+, -, *, /, ^, %)
Trigonometric functions (sin, cos, tan, csc, sec, cot, asin, acos, atan)
Hyperbolic functions (sinh, cosh, tanh, csch, sech, coth)
Logarithmic and exponential functions (ln, log, log2, exp)
Statistical operations (mean, median, mode, std, stdpop, var, varpop, sum, count, min, max)
Mathematical constants (pi, e)

Usage Example:

use cloudllm::tools::Calculator;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let calc = Calculator::new();

    // Arithmetic — evaluate() returns Result<f64, CalculatorError>
    let result = calc.evaluate("2 + 2 * 3").await?;
    println!("{result}"); // 8

    // Trigonometry (radians)
    let sin_val = calc.evaluate("sin(pi/2)").await?;
    println!("{sin_val}"); // 1

    // Statistical functions
    let mean = calc.evaluate("mean([1, 2, 3, 4, 5])").await?;
    println!("{mean}"); // 3

    Ok(())
}

More Examples:

sqrt(16) -> 4.0
log(100) -> 2.0 (base 10)
std([1, 2, 3, 4, 5]) -> 1.581 (sample standard deviation)
floor(3.7) -> 3.0
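
The 1.581 figure follows from the sample formula, which divides by n - 1. The standalone sketch below reproduces that arithmetic and contrasts it with the population form, assuming `stdpop` is the population variant as its name suggests. The function names are hypothetical and this is not the Calculator's implementation.

```rust
/// Arithmetic mean of a non-empty slice.
pub fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Sum of squared deviations from the mean.
fn squared_deviations(values: &[f64]) -> f64 {
    let m = mean(values);
    values.iter().map(|v| (v - m).powi(2)).sum()
}

/// Sample standard deviation: divides by n - 1 (Bessel's correction).
pub fn std_sample(values: &[f64]) -> f64 {
    (squared_deviations(values) / (values.len() as f64 - 1.0)).sqrt()
}

/// Population standard deviation: divides by n.
pub fn std_population(values: &[f64]) -> f64 {
    (squared_deviations(values) / values.len() as f64).sqrt()
}
```

For [1, 2, 3, 4, 5] the squared deviations sum to 10, so the sample form gives sqrt(10 / 4) ≈ 1.581 and the population form gives sqrt(10 / 5) ≈ 1.414.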

For comprehensive documentation, see Calculator API docs.

Memory Tool

A persistent, TTL-aware key-value store for maintaining agent state across sessions. Perfect for single agents to track progress or multi-agent orchestrations to coordinate decisions.

Features:

Key-value storage with optional TTL (time-to-live) expiration
Automatic background expiration of stale entries (1-second cleanup)
Metadata tracking (creation timestamp, expiration time)
Succinct protocol for LLM communication (token-efficient)
Thread-safe shared access across agents
Designed specifically for agent communication (not a general database)
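
The TTL behaviour can be modelled in a few lines. This is a minimal sketch under stated assumptions: `TtlStore` is a hypothetical type, the current time is passed in explicitly to keep it deterministic, and it leaves out the metadata, background cleanup task, and thread-safe sharing that the list above describes.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Minimal TTL-aware key-value store.
#[derive(Default)]
pub struct TtlStore {
    entries: HashMap<String, (String, Option<Instant>)>,
}

impl TtlStore {
    /// Stores `value`; a `ttl` of `None` means the entry never expires.
    pub fn put(&mut self, key: &str, value: &str, ttl: Option<Duration>, now: Instant) {
        let expires_at = ttl.map(|ttl| now + ttl);
        self.entries
            .insert(key.to_string(), (value.to_string(), expires_at));
    }

    /// Returns the value if it exists and has not expired as of `now`.
    pub fn get(&self, key: &str, now: Instant) -> Option<&str> {
        match self.entries.get(key) {
            Some(entry) if entry.1.map_or(true, |t| now < t) => Some(entry.0.as_str()),
            _ => None,
        }
    }

    /// Removes expired entries; a periodic background task could call this.
    pub fn evict_expired(&mut self, now: Instant) {
        self.entries
            .retain(|_, entry| entry.1.map_or(true, |t| now < t));
    }

    /// Number of entries currently held, expired or not.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Checking expiry on read as well as in the cleanup pass means a stale value is never returned, even if cleanup has not run yet.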

Basic Usage Example:

use cloudllm::tools::Memory;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let memory = Memory::new();

    // Store data with 1-hour TTL
    memory.put("research_progress".to_string(), "Found 3 relevant papers".to_string(), Some(3600));

    // Retrieve data
    if let Some((value, metadata)) = memory.get("research_progress", true) {
        println!("Progress: {}", value);
        println!("Stored at: {:?}", metadata.unwrap().added_utc);
    }

    // List all stored keys
    let keys = memory.list_keys();
    println!("Active memories: {:?}", keys);

    // Store without expiration (permanent)
    memory.put("important_decision".to_string(), "Use approach A".to_string(), None);

    // Delete specific memory
    memory.delete("research_progress");

    // Clear all memories
    memory.clear();

    Ok(())
}

Using with Agents via Tool Protocol:

use std::sync::Arc;
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use cloudllm::tool_protocol::ToolRegistry;
use cloudllm::Agent;
use cloudllm::clients::openai::{OpenAIClient, Model};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create shared memory for agents
    let memory = Arc::new(Memory::new());

    // Wrap with protocol for agent usage
    let protocol = Arc::new(MemoryProtocol::new(memory.clone()));
    let registry = ToolRegistry::new(protocol);

    // Create agent with memory access
    let agent = Agent::new(
        "researcher",
        "Research Agent",
        Arc::new(OpenAIClient::new_with_model_enum(
            &std::env::var("OPEN_AI_SECRET")?,
            Model::GPT41Mini
        )),
    )
    .with_tools(registry);

    // Agent can now use memory via commands like:
    // "P research_state Gathering data 7200"
    // "G research_state META"
    // "L"

    Ok(())
}

Memory Protocol Commands (for agents):

The Memory tool uses a token-efficient protocol designed for LLM communication:

Command	Syntax	Example	Use Case
Put	`P <key> <value> [ttl_seconds]`	`P task_status InProgress 3600`	Store state with 1-hour expiration
Get	`G <key> [META]`	`G task_status META`	Retrieve value + metadata
List	`L [META]`	`L META`	List all keys with metadata
Delete	`D <key>`	`D task_status`	Remove specific memory
Clear	`C`	`C`	Wipe all memories
Spec	`SPEC`	`SPEC`	Get protocol specification
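
To show how compact the command grammar is, here is a sketch of a parser for the table above. `Command` and `parse_command` are hypothetical names, and one rule is an assumption of the sketch rather than documented behaviour: in a Put, a trailing integer is read as the TTL and everything between the key and the TTL is the value.

```rust
/// Parsed form of one Memory protocol command line.
#[derive(Debug, PartialEq)]
pub enum Command {
    Put { key: String, value: String, ttl_seconds: Option<u64> },
    Get { key: String, with_meta: bool },
    List { with_meta: bool },
    Delete { key: String },
    Clear,
    Spec,
}

/// Parses a command line such as `P task_status InProgress 3600`.
pub fn parse_command(line: &str) -> Result<Command, String> {
    let tokens: Vec<&str> = line.split_whitespace().collect();
    match tokens.as_slice() {
        ["P", key, rest @ ..] if !rest.is_empty() => {
            // Assumption: a trailing integer is the TTL, unless it is the only token.
            let (value_tokens, ttl_seconds) = match rest.split_last() {
                Some((last, head)) if !head.is_empty() => match last.parse::<u64>() {
                    Ok(ttl) => (head, Some(ttl)),
                    Err(_) => (rest, None),
                },
                _ => (rest, None),
            };
            Ok(Command::Put {
                key: key.to_string(),
                value: value_tokens.join(" "),
                ttl_seconds,
            })
        }
        ["G", key] => Ok(Command::Get { key: key.to_string(), with_meta: false }),
        ["G", key, "META"] => Ok(Command::Get { key: key.to_string(), with_meta: true }),
        ["L"] => Ok(Command::List { with_meta: false }),
        ["L", "META"] => Ok(Command::List { with_meta: true }),
        ["D", key] => Ok(Command::Delete { key: key.to_string() }),
        ["C"] => Ok(Command::Clear),
        ["SPEC"] => Ok(Command::Spec),
        _ => Err(format!("unrecognized command: {line}")),
    }
}
```

Single-letter verbs and positional arguments keep each command to a handful of tokens, which is the point of a token-efficient protocol.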

Multi-Agent Memory Sharing:

use std::sync::Arc;
use cloudllm::clients::openai::{Model, OpenAIClient};
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use cloudllm::tool_protocol::ToolRegistry;
use cloudllm::{Agent, orchestration::{Orchestration, OrchestrationMode}};
use tokio::sync::RwLock;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("OPEN_AI_SECRET")?;
    let make_client = || Arc::new(OpenAIClient::new_with_model_enum(&api_key, Model::GPT41Mini));

    // Create shared memory — all agents read and write the same store
    let shared_memory = Arc::new(Memory::new());
    let protocol = Arc::new(MemoryProtocol::new(shared_memory));
    let shared_registry = Arc::new(RwLock::new(ToolRegistry::new(protocol)));

    // Both agents share the same registry; mutations are immediately visible to both
    let agent1 = Agent::new("researcher-a", "Researcher A", make_client())
        .with_shared_tools(shared_registry.clone());

    let agent2 = Agent::new("researcher-b", "Researcher B", make_client())
        .with_shared_tools(shared_registry.clone());

    let mut orchestration = Orchestration::new("research", "Collaborative Research");
    orchestration.add_agent(agent1)?;
    orchestration.add_agent(agent2)?;

    // Agents coordinate via memory:
    // 1. Agent A stores findings:  P research_findings "Found 5 papers" 7200
    // 2. Agent B retrieves them:   G research_findings META
    // 3. Either agent lists state: L

    Ok(())
}

For comprehensive documentation and patterns, see Memory API docs.

HTTP Client Tool

A secure REST API client for calling external services with domain allowlist/blocklist protection. Perfect for agents that need to make HTTP requests to external APIs.

Features:

All HTTP methods (GET, POST, PUT, DELETE, PATCH, HEAD)
Domain security with allowlist/blocklist (blocklist takes precedence)
Basic authentication and bearer token support
Custom headers and query parameters with automatic URL encoding
JSON response parsing
Configurable request timeout and response size limits
Thread-safe with connection pooling
Builder pattern for chainable configuration

Usage Example:

use cloudllm::tools::HttpClient;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HttpClient::new();

    // Security: only allow api.example.com
    client.allow_domain("api.example.com");

    // Configuration via builder pattern
    client
        .with_header("Authorization", "Bearer token123")
        .with_query_param("format", "json")
        .with_timeout(Duration::from_secs(30));

    // Make request
    let response = client.get("https://api.example.com/data").await?;

    // Check status and parse JSON
    if response.is_success() {
        let json_data = response.json()?;
        println!("Data: {}", json_data);
    }

    Ok(())
}

Security Best Practices:

Domain Allowlist: client.allow_domain("api.trusted-service.com")
Deny Malicious Domains: client.deny_domain("malicious.attacker.com")
Timeout Protection: client.with_timeout(Duration::from_secs(30))
Size Limits: client.with_max_response_size(10 * 1024 * 1024) (10MB)
Authentication: client.with_basic_auth("user", "pass") or client.with_header("Authorization", "Bearer token")
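
The "blocklist takes precedence" rule can be sketched as a small policy check. `DomainPolicy` is a hypothetical type for illustration, not the HttpClient internals. Two behaviours are assumptions of the sketch: an empty allowlist permits any domain that is not denied, and domains are compared exactly (case-insensitively) with no subdomain matching.

```rust
use std::collections::HashSet;

/// Allowlist/blocklist policy sketch.
#[derive(Default)]
pub struct DomainPolicy {
    allowed: HashSet<String>,
    denied: HashSet<String>,
}

impl DomainPolicy {
    pub fn allow_domain(&mut self, domain: &str) {
        self.allowed.insert(domain.to_ascii_lowercase());
    }

    pub fn deny_domain(&mut self, domain: &str) {
        self.denied.insert(domain.to_ascii_lowercase());
    }

    /// Blocklist wins; an empty allowlist means "allow anything not denied".
    pub fn is_permitted(&self, domain: &str) -> bool {
        let domain = domain.to_ascii_lowercase();
        if self.denied.contains(&domain) {
            return false;
        }
        self.allowed.is_empty() || self.allowed.contains(&domain)
    }
}
```

Checking the blocklist first means a domain that lands on both lists is still refused, which is the safer failure mode.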

For comprehensive documentation, see HttpClient API docs and examples/http_client_example.rs.

Bash Tool

Secure command execution on Linux and macOS with timeout and security controls. See BashTool API docs.

File System Tool

Safe file and directory operations with path traversal protection and optional extension filtering. Perfect for agents that need to read, write, and manage files within designated directories.

Key Features:

Read, write, append, and delete files
Directory creation, listing, and recursive deletion
File metadata retrieval (size, modification time, is_directory)
File search with pattern matching
Path traversal prevention (../../../etc/passwd is blocked)
Optional file extension filtering for security
Root path restriction for sandboxing

Basic Usage:

use cloudllm::tools::FileSystemTool;
use std::path::PathBuf;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create tool with root path restriction
    let fs = FileSystemTool::new()
        .with_root_path(PathBuf::from("/home/user/documents"))
        .with_allowed_extensions(vec!["txt".to_string(), "md".to_string()]);

    // Write a file
    fs.write_file("notes.txt", "Important information").await?;

    // Read a file
    let content = fs.read_file("notes.txt").await?;
    println!("{content}");

    // List directory contents
    let entries = fs.read_directory(".", false).await?;
    for entry in entries {
        println!("{}: {} bytes", entry.name, entry.size);
    }

    // Get metadata
    let metadata = fs.get_file_metadata("notes.txt").await?;
    println!("Size: {} bytes, Modified: {}", metadata.size, metadata.modified);

    Ok(())
}

For comprehensive documentation, see the FileSystemTool API docs and examples/filesystem_example.rs.
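
For intuition about the path traversal protection, the sketch below resolves a requested path against a root and refuses anything that would climb out of it. `resolve_in_root` is a hypothetical helper, not the FileSystemTool implementation. It works purely lexically and does not resolve symlinks, which a production sandbox would also need to consider, and the behaviour described assumes Unix-style paths.

```rust
use std::path::{Component, Path, PathBuf};

/// Joins `requested` onto `root`, rejecting anything that could escape it.
pub fn resolve_in_root(root: &Path, requested: &str) -> Result<PathBuf, String> {
    let mut resolved = root.to_path_buf();
    // How many directory levels below the root we currently are.
    let mut depth = 0usize;
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                // Climbing above the root is the traversal attack.
                if depth == 0 {
                    return Err(format!("path escapes root: {requested}"));
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths and Windows prefixes would replace the root.
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("absolute paths are not allowed: {requested}"));
            }
        }
    }
    Ok(resolved)
}
```

Tracking depth, rather than comparing string prefixes, also handles inputs such as `a/../notes.txt`, which stay inside the root even though they contain `..`.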

Creating Custom Protocol Adapters

Implement the ToolProtocol trait to support new protocols:

use async_trait::async_trait;
use cloudllm::tool_protocol::{ToolMetadata, ToolProtocol, ToolResult};
use std::error::Error;

/// Example: Custom protocol adapter for a hypothetical service
pub struct MyCustomAdapter {
    // Your implementation
}

#[async_trait]
impl ToolProtocol for MyCustomAdapter {
    async fn execute(
        &self,
        tool_name: &str,
        parameters: serde_json::Value,
    ) -> Result<ToolResult, Box<dyn Error + Send + Sync>> {
        // Implement tool execution logic
        Ok(ToolResult::success(serde_json::json!({})))
    }

    async fn list_tools(&self) -> Result<Vec<ToolMetadata>, Box<dyn Error + Send + Sync>> {
        // Return available tools
        Ok(vec![])
    }

    async fn get_tool_metadata(
        &self,
        tool_name: &str,
    ) -> Result<ToolMetadata, Box<dyn Error + Send + Sync>> {
        // Return specific tool metadata
        Ok(ToolMetadata::new(tool_name, "Tool description"))
    }

    fn protocol_name(&self) -> &str {
        "my-custom-protocol"
    }
}

Best Practices for Tools

Clear Names & Descriptions: Make tool purposes obvious to LLMs — names and descriptions are included verbatim in the JSON Schema sent to the provider.
Parameter Schemas: Always add ToolParameter entries with .with_description() and mark required parameters with .required(). The provider uses this schema to construct valid function calls; missing descriptions lead to hallucinated parameter names.
Type Accuracy: Use the most specific ToolParameterType — prefer Integer over Number when the argument must be whole, and Object/Array with nested properties for complex inputs.
Error Handling: Return ToolResult::failure("...") with a clear message — the agent feeds this back to the LLM so it can retry or explain the problem.
Atomicity: Each tool should do one thing well. Compose multi-step operations in the agent loop, not inside individual tools.
Testing: Test ToolProtocol::execute() in isolation (see tests/tool_integration_tests.rs for patterns) before wiring to an agent.
Discovery: After creating the registry with ToolRegistry::new(protocol), call registry.discover_tools_from_primary().await? to populate the registry's tool map with the tools registered on that protocol.
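The type-accuracy, error-handling, and atomicity points above can be illustrated with a small standard-library sketch. `Outcome`, `Args`, and `repeat_text` are hypothetical stand-ins invented for this example; in CloudLLM the tool body would receive `serde_json::Value` parameters and return `ToolResult::success(...)` or `ToolResult::failure(...)`.

```rust
use std::collections::HashMap;

/// Stand-in for a tool result (assumption: not CloudLLM's `ToolResult`).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Success(String),
    Failure(String),
}

/// Stand-in for JSON arguments: parameter name mapped to its raw string value.
pub type Args = HashMap<String, String>;

/// One tool, one job: repeat `text` exactly `count` times.
pub fn repeat_text(args: &Args) -> Outcome {
    // Required parameter: say exactly which name is missing.
    let Some(text) = args.get("text") else {
        return Outcome::Failure("missing required parameter 'text'".to_string());
    };

    // Integer rather than Number: fractional or non-numeric input is rejected
    // with a message the LLM can act on when it retries.
    let count = match args.get("count").map(|c| c.parse::<u32>()) {
        Some(Ok(n)) if n <= 100 => n,
        Some(Ok(n)) => {
            return Outcome::Failure(format!("'count' must be at most 100, got {n}"));
        }
        Some(Err(_)) => {
            return Outcome::Failure("'count' must be a non-negative integer".to_string());
        }
        None => {
            return Outcome::Failure("missing required parameter 'count'".to_string());
        }
    };

    Outcome::Success(text.repeat(count as usize))
}

fn main() {
    let mut args = Args::new();
    args.insert("text".to_string(), "ab".to_string());

    // A fractional count fails validation instead of being silently rounded.
    args.insert("count".to_string(), "2.5".to_string());
    println!("{:?}", repeat_text(&args));

    // A whole-number count succeeds.
    args.insert("count".to_string(), "3".to_string());
    println!("{:?}", repeat_text(&args));
}
```

Each failure message names the offending parameter and the expected form, which gives the model enough information to correct the call on its next attempt.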

For more examples, see the examples/ directory and run cargo doc --open for complete API documentation.

Image Generation

CloudLLM provides unified image generation across OpenAI, Grok, and Google Gemini. The register_image_generation_tool() helper registers a ready-made image generation tool on a CustomToolProtocol, so an agent gains the capability without a hand-written tool implementation.

Quick Start: Image Generation Tool

Register an image generation tool with a single line:

use std::sync::Arc;
use cloudllm::Agent;
use cloudllm::clients::openai::{OpenAIClient, Model};
use cloudllm::cloudllm::image_generation::register_image_generation_tool;
use cloudllm::cloudllm::{ImageGenerationProvider, new_image_generation_client};
use cloudllm::tool_protocols::CustomToolProtocol;
use cloudllm::tool_protocol::ToolRegistry;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("OPEN_AI_SECRET")?;

    // Create image generation client (choose provider: OpenAI, Grok, or Gemini)
    let image_client = new_image_generation_client(
        ImageGenerationProvider::OpenAI,
        &api_key,
    )?;

    // Create a tool protocol
    let protocol = Arc::new(CustomToolProtocol::new());

    // Register the image generation tool (much simpler than a manual implementation).
    // main is already async under #[tokio::main], so await the helper directly;
    // creating a second runtime and calling block_on here would panic.
    register_image_generation_tool(&protocol, image_client.clone()).await?;

    // Create agent with image generation capability
    let registry = ToolRegistry::new(protocol);

    let agent = Agent::new(
        "designer",
        "Creative Designer",
        Arc::new(OpenAIClient::new_with_model_enum(&api_key, Model::GPT41Mini)),
    )
    .with_tools(registry)
    .with_expertise("Creating visual content")
    .with_personality("Creative and detailed");

    println!("Agent created with image generation capability");
    Ok(())
}

Supported Providers

Provider	Model	Supported Ratios
OpenAI (DALL-E 3)	`gpt-image-1.5`	1:1, 16:9, 4:3, 3:2, 9:16, 3:4, 2:3
Grok Imagine	`grok-imagine-image`	1:1, 16:9, 4:3, 3:2, 9:16, 3:4, 2:3, and more
Google Gemini	`gemini-2.5-flash-image`	1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Using Different Providers

use cloudllm::cloudllm::{ImageGenerationProvider, new_image_generation_client};

// OpenAI (realistic, high-quality)
let client = new_image_generation_client(
    ImageGenerationProvider::OpenAI,
    &std::env::var("OPEN_AI_SECRET")?,
)?;

// Grok (fast, creative)
let client = new_image_generation_client(
    ImageGenerationProvider::Grok,
    &std::env::var("XAI_KEY")?,
)?;

// Gemini (flexible aspect ratios)
let client = new_image_generation_client(
    ImageGenerationProvider::Gemini,
    &std::env::var("GEMINI_API_KEY")?,
)?;

Parsing from Strings with FromStr

For dynamic provider selection from strings, use the FromStr trait:

use cloudllm::cloudllm::{ImageGenerationProvider, new_image_generation_client};
use std::str::FromStr;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider_name = "grok";  // From config, user input, etc.

    // Parse string to enum using FromStr trait
    let provider = ImageGenerationProvider::from_str(provider_name)?;

    // Create client with parsed provider
    let client = new_image_generation_client(
        provider,
        &std::env::var("XAI_KEY")?,
    )?;

    println!("Using provider: {}", provider.display_name());
    Ok(())
}

Supported provider strings (case-insensitive):

"openai" -> OpenAI (DALL-E 3)
"grok" -> Grok Imagine
"gemini" -> Google Gemini

For comprehensive documentation, see the image_generation module docs.
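To make the case-insensitive parsing of the provider strings listed above concrete, here is a minimal standard-library sketch of how such a `FromStr` implementation could look. The `Provider` enum and `UnknownProvider` error are hypothetical stand-ins; this is not the actual implementation behind `ImageGenerationProvider`, whose error type and variants may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Stand-in provider enum (assumption: not CloudLLM's `ImageGenerationProvider`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    OpenAI,
    Grok,
    Gemini,
}

/// Hypothetical error returned when a string names no known provider.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownProvider(pub String);

impl fmt::Display for UnknownProvider {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown image provider: {}", self.0)
    }
}

impl std::error::Error for UnknownProvider {}

impl FromStr for Provider {
    type Err = UnknownProvider;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Lowercase once so "Grok", "GROK", and "grok" all match the same arm.
        match s.to_ascii_lowercase().as_str() {
            "openai" => Ok(Provider::OpenAI),
            "grok" => Ok(Provider::Grok),
            "gemini" => Ok(Provider::Gemini),
            _ => Err(UnknownProvider(s.to_string())),
        }
    }
}

fn main() {
    // `str::parse` calls the FromStr implementation above.
    for name in ["openai", "GROK", "Gemini", "midjourney"] {
        match name.parse::<Provider>() {
            Ok(provider) => println!("{name} -> {provider:?}"),
            Err(err) => println!("{err}"),
        }
    }
}
```

Implementing `std::error::Error` on the error type lets the `?` operator convert it into `Box<dyn std::error::Error>`, which is the pattern the example above relies on when it calls `from_str(provider_name)?` inside `main`.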

Examples

Clone the repository and run the provided examples:

export OPEN_AI_SECRET=...
export ANTHROPIC_KEY=...
export GEMINI_KEY=...
export XAI_KEY=...

cargo run --example interactive_session
cargo run --example streaming_session
cargo run --example orchestration_demo
cargo run --example breakout_game_ralph

Each example corresponds to a module in the documentation so you can cross-reference the code with explanations.

Support & contributing

Issues and pull requests are welcome via GitHub. Please open focused pull requests against main and include tests or doc updates where relevant.

CloudLLM is released under the MIT License.

Happy orchestration!
