The tool surface every agentic AI framework reimplements, but done once and done right.
bash / read / write / edit / glob / grep / http behind a
single Tool trait, with JSON Schema generation, streaming output,
cancellation, and four reference agent loops wiring it into OpenAI,
Anthropic, Google Gemini, and OpenRouter. Drop it into your harness and
focus on the loop, auth, retry, and sandboxing that's actually your
product.
If you're building an agentic harness in Rust — your own Claude Code, your own codex, a custom copilot, an eval runner, a sandboxed notebook — you will reimplement the same seven tools that everyone else has implemented, and you will get the same details subtly wrong:
- Bash. Naive implementations spawn a fresh `tokio::process::Command` per call and lose every `cd`, every `export`, every shell function the model sets up. A correct one opens a PTY, keeps one bash alive across calls, streams stdout line-by-line as it arrives, handles cancellation via Ctrl-C mid-command without killing the session, and survives `set -m` job-control shenanigans.
- Tool definitions. Each LLM provider wants a different shape. OpenAI Chat Completions nests under `function`. OpenAI Responses API is flat. Anthropic calls it `input_schema`. Gemini Code Assist wraps everything in `functionDeclarations` and wants OpenAPI-3 schema types. Keeping four flavors of tool metadata in sync by hand is not fun.
- Streaming, cancellation, typed errors. None of these are free. A `ProgressSink` trait, a `CancellationToken` threaded into every call, and a `ToolError` enum that separates protocol-level failures (cancelled, timeout, invalid args) from execution errors take more design work than most ad-hoc agent code ever gets.
opentools is the result of getting all of that right once so your
harness doesn't have to.
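To make the protocol-versus-execution split concrete, here is a minimal std-only sketch of such an error type. The name `SketchToolError`, its variant names other than `Cancelled` and `Execution`, and the `is_protocol` helper are assumptions for illustration, not the crate's actual definitions.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical mirror of a tool error type; not the crate's real enum.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchToolError {
    /// The caller's cancellation token fired.
    Cancelled,
    /// The per-call deadline elapsed.
    Timeout(Duration),
    /// The model's arguments did not match the input schema.
    InvalidArgs(String),
    /// The tool ran and failed on its own terms.
    Execution(String),
}

impl SketchToolError {
    /// True for failures raised by the calling protocol rather than by the
    /// tool's own work.
    pub fn is_protocol(&self) -> bool {
        match self {
            SketchToolError::Execution(_) => false,
            _ => true,
        }
    }
}

impl fmt::Display for SketchToolError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            SketchToolError::Cancelled => write!(f, "cancelled"),
            SketchToolError::Timeout(d) => write!(f, "timed out after {}s", d.as_secs()),
            SketchToolError::InvalidArgs(msg) => write!(f, "invalid arguments: {}", msg),
            SketchToolError::Execution(msg) => write!(f, "execution failed: {}", msg),
        }
    }
}

impl std::error::Error for SketchToolError {}
```

With a split like this, a harness could decide to retry or abort on protocol failures while feeding execution failures back to the model as ordinary tool output.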
Three primitives: a Tool trait, a ToolRegistry, and a ToolContext.
```rust
#[async_trait]
pub trait Tool: Send + Sync + 'static {
    type Input: DeserializeOwned + JsonSchema + Send + Sync;
    type Output: Serialize + Send + Sync;

    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;

    async fn execute(&self, input: Self::Input, ctx: ToolContext)
        -> Result<Self::Output, ToolError>;
}
```

Input/output are typed, not `serde_json::Value`. You write strongly-typed
structs, `schemars` derives the JSON Schema, and a blanket impl
erases the associated types into an object-safe `DynTool` so a
`ToolRegistry` can hold heterogeneous tools behind `Arc<dyn DynTool>`.
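The type-erasure step can be sketched with nothing but the standard library. In the sketch below, `Decode` and `Encode` are stand-ins for serde's traits, raw strings stand in for JSON payloads, and `TypedTool`, `Double`, `Shout`, `build_registry`, and `dispatch` are hypothetical names. It illustrates the blanket-impl pattern, not the crate's actual implementation.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-ins for serde's DeserializeOwned / Serialize (assumption).
pub trait Decode: Sized {
    fn decode(raw: &str) -> Result<Self, String>;
}
pub trait Encode {
    fn encode(&self) -> String;
}

impl Decode for u64 {
    fn decode(raw: &str) -> Result<Self, String> {
        raw.trim().parse::<u64>().map_err(|e| format!("invalid u64: {}", e))
    }
}
impl Encode for u64 {
    fn encode(&self) -> String { self.to_string() }
}
impl Decode for String {
    fn decode(raw: &str) -> Result<Self, String> { Ok(raw.to_string()) }
}
impl Encode for String {
    fn encode(&self) -> String { self.clone() }
}

/// Typed trait: associated types make it non-object-safe in practice.
pub trait TypedTool: Send + Sync + 'static {
    type Input: Decode;
    type Output: Encode;
    fn name(&self) -> &'static str;
    fn run(&self, input: Self::Input) -> Result<Self::Output, String>;
}

/// Object-safe trait: no associated types, raw payloads only.
pub trait DynTool: Send + Sync {
    fn name(&self) -> &'static str;
    fn call_raw(&self, raw: &str) -> Result<String, String>;
}

/// The blanket impl: every typed tool is automatically a dynamic tool.
impl<T: TypedTool> DynTool for T {
    fn name(&self) -> &'static str {
        TypedTool::name(self)
    }
    fn call_raw(&self, raw: &str) -> Result<String, String> {
        let input = <T::Input as Decode>::decode(raw)?;
        let output = self.run(input)?;
        Ok(output.encode())
    }
}

pub struct Double;
impl TypedTool for Double {
    type Input = u64;
    type Output = u64;
    fn name(&self) -> &'static str { "double" }
    fn run(&self, input: u64) -> Result<u64, String> {
        input.checked_mul(2).ok_or_else(|| "overflow".to_string())
    }
}

pub struct Shout;
impl TypedTool for Shout {
    type Input = String;
    type Output = String;
    fn name(&self) -> &'static str { "shout" }
    fn run(&self, input: String) -> Result<String, String> {
        Ok(input.to_uppercase())
    }
}

/// Heterogeneous tools stored behind one trait object type, keyed by name.
pub fn build_registry() -> HashMap<&'static str, Arc<dyn DynTool>> {
    let tools: Vec<Arc<dyn DynTool>> = vec![Arc::new(Double), Arc::new(Shout)];
    tools.into_iter().map(|t| (t.name(), t)).collect()
}

/// Dispatch by name; unknown names are an error, never a fallback.
pub fn dispatch(
    reg: &HashMap<&'static str, Arc<dyn DynTool>>,
    name: &str,
    raw: &str,
) -> Result<String, String> {
    match reg.get(name) {
        Some(tool) => tool.call_raw(raw),
        None => Err(format!("unknown tool: {}", name)),
    }
}
```

The design choice worth noting is that tool authors only ever see the typed trait; decoding, encoding, and the trait-object plumbing live in one blanket impl.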
- `Tool`: one trait, every tool.
- `ToolRegistry`: register by value, dispatch by name with a JSON payload: `reg.call("bash", json!({"command": "ls"}), ctx).await?`. `reg.list()` returns `{name, description, input_schema}` entries ready to convert into any provider's tool-call format.
- `ToolContext`: per-call `cwd`, `CancellationToken`, and `Arc<dyn ProgressSink>` for streaming. Nothing about providers, loops, auth, or retry; you bring those.
The typical integration is ~50 lines of glue. Here's the skeleton.
```rust
use opentools::{ToolContext, ToolRegistry};
use opentools::tools::{Bash, Edit, Glob, Grep, Http, Read, Write};

let bash = Bash::spawn().await?; // PTY spawned here

let mut reg = ToolRegistry::new();
reg.register(bash)
    .register(Read)
    .register(Write)
    .register(Edit)
    .register(Glob)
    .register(Grep)
    .register(Http::new());

// Add your own tools alongside the built-ins.
// reg.register(SqlQuery::new(pool));
// reg.register(SlackPost::new(webhook));
```

Every provider has a different shape. Here are the four you need to know, lifted straight from the demos and known to work.
OpenAI Chat Completions / OpenRouter / most OpenAI-compatible
endpoints, nested under `function`:

```rust
let tools: Vec<Value> = reg.list().into_iter().map(|t| json!({
    "type": "function",
    "function": {
        "name": t.name,
        "description": t.description,
        "parameters": t.input_schema,
    }
})).collect();
```

OpenAI Responses API (codex endpoint / gpt-5 family), flat:
```rust
let tools: Vec<Value> = reg.list().into_iter().map(|t| json!({
    "type": "function",
    "name": t.name,
    "description": t.description,
    "parameters": t.input_schema,
    "strict": false,
})).collect();
```

Anthropic Messages API, no `type` wrapper, uses `input_schema`:
```rust
let tools: Vec<Value> = reg.list().into_iter().map(|t| json!({
    "name": t.name,
    "description": t.description,
    "input_schema": t.input_schema,
})).collect();
```

Google Gemini (Vertex / Code Assist): wrap in `functionDeclarations`,
and the schema needs light sanitization (strip `$schema`, `title`,
`format`; flatten `type: ["X","null"]` to `type: "X"` plus `nullable: true`).
`demo_gemini.rs` has a 25-line `sanitize_schema`
helper you can copy verbatim:
```rust
let declarations: Vec<Value> = reg.list().into_iter().map(|t| {
    let mut params = t.input_schema;
    sanitize_schema_for_gemini(&mut params);
    json!({
        "name": t.name,
        "description": t.description,
        "parameters": params,
    })
}).collect();
let tools = json!([{ "functionDeclarations": declarations }]);
```

When the model emits a tool call, you get a name and a JSON arguments string. Parse and dispatch through the registry:
```rust
let ctx = ToolContext::default();

// ... in your agent loop, after parsing the model's tool_call:
let args: Value = serde_json::from_str(&tool_call_arguments_string)?;
let result = reg.call(&tool_call_name, args, ctx.clone()).await;

let content = match result {
    Ok(v) => serde_json::to_string(&v)?,
    Err(e) => json!({"error": e.to_string()}).to_string(),
};
// Feed `content` back to the model on the next turn as a tool_result /
// function_call_output / functionResponse (whatever your provider calls it).
```

Anything not in the registry errors with `ToolError::Execution("unknown tool: ...")`. The model cannot bypass the registered set: every call is resolved by name against the registry.
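Returning to the Gemini conversion described earlier: the sanitization rules (strip `$schema`, `title`, `format`; flatten a nullable `type` array) can be sketched as a small recursive walk. This is not the helper from `demo_gemini.rs`. The `Json` enum is a dependency-free stand-in for `serde_json::Value`, and `sanitize_schema_sketch`, `obj`, and `s` are hypothetical names. One assumption beyond the stated rules is that keys directly under `properties` are treated as user-chosen field names and are never deleted.

```rust
use std::collections::BTreeMap;

/// Tiny JSON stand-in so the sketch needs no crates (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Convenience constructors for building test schemas.
pub fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
pub fn s(text: &str) -> Json {
    Json::Str(text.to_string())
}

fn is_null_type(j: &Json) -> bool {
    match j {
        Json::Str(t) => t == "null",
        _ => false,
    }
}

pub fn sanitize_schema_sketch(node: &mut Json) {
    match node {
        Json::Obj(map) => {
            // Drop the schema keywords named in the text above.
            for key in ["$schema", "title", "format"].iter() {
                map.remove(*key);
            }
            // ["X", "null"] becomes "X" plus nullable: true.
            let flattened = match map.get("type") {
                Some(Json::Arr(items))
                    if items.len() == 2 && items.iter().any(is_null_type) =>
                {
                    items.iter().find(|t| !is_null_type(t)).cloned()
                }
                _ => None,
            };
            if let Some(ty) = flattened {
                map.insert("type".to_string(), ty);
                map.insert("nullable".to_string(), Json::Bool(true));
            }
            for (key, child) in map.iter_mut() {
                if key == "properties" {
                    // Property names are user data: sanitize each property's
                    // schema, but never delete a field called "title".
                    if let Json::Obj(props) = child {
                        for prop_schema in props.values_mut() {
                            sanitize_schema_sketch(prop_schema);
                        }
                    }
                } else {
                    sanitize_schema_sketch(child);
                }
            }
        }
        Json::Arr(items) => {
            for child in items.iter_mut() {
                sanitize_schema_sketch(child);
            }
        }
        _ => {}
    }
}
```

The same walk translates almost line for line to `serde_json::Value`, whose object and array variants expose comparable map and vector operations.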
Attach a ProgressSink to the context. bash (and any custom tool you
write) will emit events as they happen:
```rust
use opentools::{ProgressSink, ToolEvent};
use std::sync::Arc;

struct TerminalSink;

impl ProgressSink for TerminalSink {
    fn send(&self, event: ToolEvent) {
        match event {
            ToolEvent::StdoutLine(line) => println!("│ {line}"),
            ToolEvent::StderrLine(line) => eprintln!("│ {line}"),
            ToolEvent::Progress { message, .. } => println!("… {message}"),
        }
    }
}

let ctx = ToolContext::default().with_progress(Arc::new(TerminalSink));
```

Now `reg.call("bash", ...)` streams bash output through `TerminalSink`
as commands produce it, not all at once at the end. The sink is behind
`Arc<dyn ProgressSink>`, so the same context can be cloned across tool
calls in an agent loop. `NoopProgress` is provided for "I don't care
about streaming" contexts.
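A sink does not have to print. As a sketch, a buffering sink can capture events so a test can assert on what a tool streamed. The `ToolEvent` and `ProgressSink` definitions below are local stand-ins that mirror the shapes used above (the real `Progress` variant has additional fields), and `CollectingSink` and `fake_tool` are hypothetical.

```rust
use std::sync::{Arc, Mutex};

/// Local stand-in mirroring the event shapes shown above (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum ToolEvent {
    StdoutLine(String),
    StderrLine(String),
    Progress { message: String },
}

/// Local stand-in for the sink trait (assumption).
pub trait ProgressSink: Send + Sync {
    fn send(&self, event: ToolEvent);
}

/// Buffers every event behind a mutex so it can be shared via Arc.
#[derive(Default)]
pub struct CollectingSink {
    events: Mutex<Vec<ToolEvent>>,
}

impl CollectingSink {
    /// Copy of everything received so far, in arrival order.
    pub fn snapshot(&self) -> Vec<ToolEvent> {
        self.events.lock().unwrap().clone()
    }

    /// Only the stdout lines, in arrival order.
    pub fn stdout_lines(&self) -> Vec<String> {
        self.events
            .lock()
            .unwrap()
            .iter()
            .filter_map(|e| match e {
                ToolEvent::StdoutLine(line) => Some(line.clone()),
                _ => None,
            })
            .collect()
    }
}

impl ProgressSink for CollectingSink {
    fn send(&self, event: ToolEvent) {
        self.events.lock().unwrap().push(event);
    }
}

/// Hypothetical tool body that streams through whatever sink it is given.
pub fn fake_tool(sink: &Arc<dyn ProgressSink>) {
    sink.send(ToolEvent::Progress { message: "starting".to_string() });
    sink.send(ToolEvent::StdoutLine("hello".to_string()));
    sink.send(ToolEvent::StderrLine("warn: x".to_string()));
    sink.send(ToolEvent::StdoutLine("world".to_string()));
}
```

Because `send` takes `&self`, interior mutability (here a `Mutex`) is what lets one sink be shared across cloned contexts.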
Thread a CancellationToken through and cancel it when the user hits
Ctrl-C (or a higher-level deadline fires, or whatever):
```rust
use tokio_util::sync::CancellationToken;

let cancel = CancellationToken::new();
let ctx = ToolContext::default().with_cancel(cancel.clone());

// Spawn a handler that cancels on Ctrl-C
tokio::spawn({
    let cancel = cancel.clone();
    async move {
        let _ = tokio::signal::ctrl_c().await;
        cancel.cancel();
    }
});

// Long-running bash command will return ToolError::Cancelled promptly
reg.call("bash", json!({"command": "sleep 600"}), ctx).await;
```

The bash tool sends SIGINT to the current foreground process when the
token fires, waits for the sentinel (up to 2 seconds), and returns
`ToolError::Cancelled`. The bash session itself stays alive for the next call
(`set -m` enables job control so only the foreground command is killed).
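The text does not spell out how the sentinel works, so the following is only a sketch of one common approach for persistent shell sessions: append a marker line that reports `$?` after each command, then read output until that marker appears. The names `wrap_with_sentinel`, `collect_until_sentinel`, `CommandResult`, and the marker format are all assumptions, not the crate's internals.

```rust
/// Output of one command as recovered from the session's line stream.
#[derive(Debug, PartialEq)]
pub struct CommandResult {
    pub output: Vec<String>,
    pub exit_code: i32,
}

/// Builds the text written to the shell: the command, then a printf that
/// emits "<marker>:<exit status>" on its own line once the command is done.
pub fn wrap_with_sentinel(command: &str, marker: &str) -> String {
    format!("{}\nprintf '%s:%d\\n' '{}' \"$?\"\n", command, marker)
}

/// Consumes lines until the sentinel shows up. Returns None if the stream
/// ends first, which a caller could treat as a timeout or a dead session.
pub fn collect_until_sentinel<I>(lines: I, marker: &str) -> Option<CommandResult>
where
    I: IntoIterator<Item = String>,
{
    let prefix = format!("{}:", marker);
    let mut output = Vec::new();
    for line in lines {
        if line.starts_with(prefix.as_str()) {
            if let Ok(exit_code) = line[prefix.len()..].trim().parse::<i32>() {
                return Some(CommandResult { output, exit_code });
            }
        }
        // Anything that is not a well-formed sentinel is ordinary output.
        output.push(line);
    }
    None
}
```

In a scheme like this the marker would need to be unpredictable (for example, random per session) so that command output cannot forge it. After a SIGINT the shell still runs the `printf` line, which is what would let the caller resynchronize with a session that survived the interrupt.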
Implementing Tool is three structs and a trait impl. schemars derives
the JSON Schema from your Input type automatically, including field
descriptions from /// doc comments.
```rust
use opentools::{Tool, ToolContext, ToolError};
use async_trait::async_trait;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, JsonSchema)]
pub struct SqlQueryInput {
    /// The SELECT statement to run. Must be read-only.
    pub query: String,
    /// Maximum number of rows to return.
    #[serde(default)]
    pub limit: Option<u32>,
}

#[derive(Debug, Serialize, JsonSchema)]
pub struct SqlQueryOutput {
    pub rows: Vec<serde_json::Value>,
    pub row_count: usize,
}

pub struct SqlQuery {
    pool: sqlx::PgPool,
}

#[async_trait]
impl Tool for SqlQuery {
    type Input = SqlQueryInput;
    type Output = SqlQueryOutput;

    fn name(&self) -> &'static str { "sql_query" }

    fn description(&self) -> &'static str {
        "Run a read-only SELECT query against the production replica. \
         Use this when the user asks for data that's stored in Postgres. \
         Returns at most `limit` rows as JSON objects."
    }

    async fn execute(&self, input: SqlQueryInput, _ctx: ToolContext)
        -> Result<SqlQueryOutput, ToolError>
    {
        let rows = sqlx::query(&input.query)
            .fetch_all(&self.pool)
            .await
            .map_err(|e| ToolError::Execution(e.to_string()))?;

        // Honor the advertised `limit`; with no limit, return every row.
        let limit = input.limit.map_or(usize::MAX, |n| n as usize);
        // `row_to_json` is a helper you supply to turn a row into a JSON value.
        let rows: Vec<serde_json::Value> =
            rows.into_iter().take(limit).map(row_to_json).collect();

        Ok(SqlQueryOutput { row_count: rows.len(), rows })
    }
}
```

Now just `reg.register(SqlQuery { pool })` and it shows up in
`reg.list()` alongside the built-ins, with a derived schema the model
can use.
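Note that the "Must be read-only" doc comment in the example is advice to the model, not enforcement: `execute` runs whatever string it receives. A tool like this would typically validate its input first. Below is a deliberately naive, hypothetical guard (`check_read_only` is not part of opentools). It rejects some legitimate queries, such as CTEs or a semicolon inside a string literal, and it is not a security boundary. A read-only database role remains the real protection.

```rust
/// Naive pre-flight check: accept a single statement that starts with SELECT.
pub fn check_read_only(query: &str) -> Result<(), String> {
    // Allow one trailing semicolon, then insist on a single statement.
    let trimmed = query.trim().trim_end_matches(';').trim();
    if trimmed.is_empty() {
        return Err("empty query".to_string());
    }
    if trimmed.contains(';') {
        return Err("multiple statements are not allowed".to_string());
    }
    let first_word = trimmed.split_whitespace().next().unwrap_or("");
    if !first_word.eq_ignore_ascii_case("select") {
        return Err(format!("only SELECT is allowed, got `{}`", first_word));
    }
    Ok(())
}
```

Inside a tool's `execute`, a failed check maps naturally onto an invalid-arguments style error so the model can correct itself on the next turn.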
| Tool | Description |
|---|---|
| `bash` | Persistent PTY-backed bash session. `cd`, env vars, aliases, shell functions persist across calls. Streams stdout line-by-line, cancellable via `CancellationToken`, per-call timeout, session survives SIGINT'd commands via `set -m`. |
| `read` | Read a UTF-8 file with optional `offset` / `limit`. |
| `write` | Write a file, `mkdir -p`'ing parent directories as needed. |
| `edit` | Exact-string find/replace with a unique-match guard (opt-in `replace_all`). |
| `glob` | `**/*.rs`-style pattern walker that honors `.gitignore`. |
| `grep` | Regex search with path + line number + matching text. |
| `http` | HTTP request via `reqwest`. Any method (GET/POST/PUT/PATCH/DELETE/HEAD/OPTIONS), custom headers, optional body. Returns status, final URL after redirects, response headers, content-type, and body (capped at 1 MiB by default, truncated at a UTF-8 boundary). `headers_only: true` skips the body entirely for cheap probes (like `curl -I`). |
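Two of the behaviors in the table are easy to get subtly wrong, so here is a std-only sketch of each: the unique-match guard described for `edit`, and truncating a body at a UTF-8 boundary as described for `http`. The function names and error messages are hypothetical, and the built-in tools may implement these differently.

```rust
/// Exact-string replace that refuses ambiguous edits unless `replace_all`
/// is set, so a short `old` string cannot silently hit the wrong spot.
pub fn apply_edit(
    text: &str,
    old: &str,
    new: &str,
    replace_all: bool,
) -> Result<String, String> {
    if old.is_empty() {
        return Err("old string must not be empty".to_string());
    }
    // Counts non-overlapping occurrences.
    let count = text.matches(old).count();
    match count {
        0 => Err("old string not found".to_string()),
        1 => Ok(text.replacen(old, new, 1)),
        _ if replace_all => Ok(text.replace(old, new)),
        n => Err(format!(
            "old string matches {} times; add context or set replace_all",
            n
        )),
    }
}

/// Cuts `body` to at most `max_bytes` without splitting a multi-byte
/// character, by backing up to the nearest char boundary.
pub fn truncate_at_char_boundary(body: &str, max_bytes: usize) -> &str {
    if body.len() <= max_bytes {
        return body;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !body.is_char_boundary(end) {
        end -= 1;
    }
    &body[..end]
}
```

Slicing a `&str` at a non-boundary index panics in Rust, which is why the cap has to be adjusted rather than applied as a raw byte offset.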
examples/ has four full working agent loops, ~200–300 LOC
each. They're templates you can copy-paste when building your own harness —
each shows the complete integration for one provider.
| Demo | Provider | Subscription source | API key env var | Default model |
|---|---|---|---|---|
| `demo_openai` | OpenAI Responses API | `~/.codex/auth.json` | `OPENAI_API_KEY` | `gpt-5.4` |
| `demo_claude` | Anthropic Messages API | macOS Keychain / `~/.claude/.credentials.json` | `ANTHROPIC_API_KEY` | `claude-opus-4-6` |
| `demo_gemini` | Google Code Assist | `~/.gemini/oauth_creds.json` | `GEMINI_API_KEY` | `gemini-3-flash-preview` |
| `demo_openrouter` | OpenRouter Chat Completions | (API key only) | `OPENROUTER_API_KEY` | `openai/gpt-5` |
Each demo supports:
```bash
# Auto-detect: use subscription creds if present, else env var API key
cargo run --example demo_claude

# Force an auth mode
cargo run --example demo_openai -- --auth sub
cargo run --example demo_gemini -- --auth api

# Override the default model
cargo run --example demo_claude -- --model claude-sonnet-4-5

# Remove tools from the registry so the model has to work harder
cargo run --example demo_openai -- --exclude-tool bash
cargo run --example demo_claude -- --exclude-tool bash --exclude-tool http
```

A typical run prints the registered tools, each tool call with args, the
streamed bash output through a `PrintingSink`, and the final model
summary:
```text
opentools v0.1.0 registered: bash, edit, glob, grep, http, read, write
model: claude-opus-4-6 (via Keychain → api.anthropic.com (OAuth))
🚀 Task: Create a new directory at /tmp/demo-project-claude, initialize a git
repository inside it, and write a README.md...
│ I'll accomplish this with parallel operations where possible.
🔧 bash {"command":"mkdir -p /tmp/demo-project-claude && cd ... && git init"}
┆ Initialized empty Git repository in /private/tmp/demo-project-claude/.git/
↳ {"exit_code":0, "output":"Initialized empty Git repository..."}
🔧 write {"path":"/tmp/demo-project-claude/README.md","content":"# Demo Project..."}
↳ {"bytes_written":185}
│ Created /tmp/demo-project-claude, initialized a git repository, and wrote a README.md.
✓ Done
```
Dispatch always goes through `ToolRegistry::call(name, args, ctx)`; no
demo ever invokes a provider-native tool. Every tool call the model makes
is resolved by name against the registry, which is the whole point.
Three layers — see TESTING.md for the full coverage matrices.
```bash
# 1. Unit + integration tests (no network, ~3s, 17 tests)
cargo test

# 2. Preflight e2e tests for each demo: flag parsing, auth errors,
#    fake credentials, etc. (no network, no creds, ~5s)
bash tests/e2e/run.sh                         # 39 tests on macOS, 44 in Docker

# 3. Full e2e against real APIs with your real credentials
#    (costs provider quota, mounts creds into a Docker sandbox)
bash tests/e2e/docker_full.sh                 # all providers
bash tests/e2e/docker_full.sh --only claude   # just one (save quota)
```

Integration tests exercise the PTY bash path end-to-end (cwd
persistence, env persistence, streaming, timeout, cancellation, session
survival after SIGINT), plus round-trip tests for the file tools.
- Not affiliated with OpenAI, Anthropic, or Google. The subscription auth paths reuse credentials from the respective first-party CLIs (`codex`, `claude`, `gemini`). If those projects change their credential storage formats or providers tighten server-side checks on allowed clients, the demos will break. For production harnesses, prefer API keys via `--auth api`.
- No sandboxing. The `bash` tool runs commands on your real filesystem as your real user. Wrap it in whatever sandbox your harness needs (bubblewrap, seatbelt, Docker, firejail). The `Tool` trait makes no security claims; it's just a dispatch layer.
- Gemini preview model gating. `gemini-3.1-pro-preview-customtools` returns 404 on most accounts; the default is `gemini-3-flash-preview`. See TESTING.md for the full model availability table.
- No OAuth token refresh in the OpenAI and Claude demos. If your subscription token has expired, re-run the CLI's login command. The Gemini demo does auto-refresh because its tokens are shorter-lived.
Dual-licensed under MIT or Apache-2.0, at your option.