Agents that ship. No surprise bills.
Python library for AI agents with budget control, memory, and observability built-in.
Website · Docs · Discord · Twitter
```bash
# Install Syrin with OpenAI support (default)
pip install syrin

# Install with Anthropic support
pip install "syrin[anthropic]"

# Install with voice capabilities
pip install "syrin[voice]"
```

You built an AI agent. It worked perfectly in testing. Then came the bill: a surprise invoice for thousands of dollars with zero warning.
This is the #1 reason AI agents never make it to production: not because they don't work, but because they're financially reckless.
What developers tell us:
- "I had no idea when my agent hit the budget."
- "My logs don't show where tokens went."
- "I spent 3 weeks building memory from scratch."
- "My agent crashed after 2 hours with no way to resume."
- "I needed 8 libraries just to make one agent."
Syrin solves this. One library. Zero surprises. Production-ready from day one.
```bash
pip install syrin
```

```python
from syrin import Agent, Model, Budget, stop_on_exceeded

class Assistant(Agent):
    model = Model.Almock()  # No API key needed
    budget = Budget(run=0.50, on_exceeded=stop_on_exceeded)

result = Assistant().response("Explain quantum computing simply")
print(result.content)
# Cost: $0.0012 | Budget used: $0.0012
```

You now have:
- ✅ Budget cap at $0.50 (stops automatically)
- ✅ Cost tracking per response
- ✅ Token usage breakdown
- ✅ Full observability built-in
Syrin is built to solve the hard parts of building production AI agents. Here's how it handles specific challenges:
**The Problem:** Agents run out of context window or feed irrelevant history into the LLM.
**Syrin's Solution:** Automatic token counting, window management, and dynamic context injection.
```python
from syrin import Agent, Context, Model
from syrin.threshold import ContextThreshold

agent = Agent(
    model=Model.Almock(),
    context=Context(
        max_tokens=80000,
        # Automatically compact when context is 75% full
        thresholds=[
            ContextThreshold(at=75, action=lambda ctx: ctx.compact()),
        ],
        # Or proactively compact at 60% to prevent rot
        auto_compact_at=0.6,
    ),
)
```

Features:
- Token counting with model-specific encodings
- Compaction strategies (middle-out truncation, summarization)
- Dynamic injection for RAG or runtime data
- Snapshot view to debug exactly what the LLM sees
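To make the compaction idea concrete, here is a minimal, library-agnostic sketch of "middle-out" truncation. It is not Syrin's implementation: the function names and the 4-characters-per-token estimate are illustrative assumptions, not the library's tokenizer.

```python
# Illustration only (not Syrin internals): keep the oldest and newest
# messages, and drop messages from the middle until a token budget fits.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, len(text) // 4)

def compact_middle_out(messages: list[str], max_tokens: int, keep_head: int = 1) -> list[str]:
    """Drop messages just after the protected head until the estimate fits."""
    msgs = list(messages)
    while sum(estimate_tokens(m) for m in msgs) > max_tokens and len(msgs) > keep_head + 1:
        msgs.pop(keep_head)  # remove the oldest non-protected ("middle") message
    return msgs

history = ["system prompt " * 10] + [f"turn {i} " * 20 for i in range(50)]
compacted = compact_middle_out(history, max_tokens=200)
print(len(history), "->", len(compacted))
```

The system prompt and the most recent turns survive; the stale middle is sacrificed first, which is typically the least relevant part of a long conversation.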
**The Problem:** Agents forget everything between sessions.
**Syrin's Solution:** First-class persistent memory with 4 specialized types and decay curves.
```python
from syrin import Agent, Model
from syrin.memory import Memory
from syrin.enums import MemoryType

agent = Agent(
    model=Model.Almock(),
    memory=Memory(
        types=[MemoryType.CORE, MemoryType.EPISODIC, MemoryType.SEMANTIC],
        top_k=10,  # Retrieve top 10 relevant memories
    ),
)

# Remember facts (persisted across sessions)
agent.remember("User prefers TypeScript", memory_type=MemoryType.CORE)

# Recall later (semantic search)
memories = agent.recall("user preferences")
```

Memory Types:
- Core – Long-term facts (user profile, preferences)
- Episodic – Conversation history and events
- Semantic – Knowledge chunks with embeddings (RAG)
- Procedural – Skills and instructions
Backends: SQLite (default), Qdrant (vector search), Redis (cache), PostgreSQL (production).
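How a "decay curve" can shape recall is easy to sketch. The snippet below is a generic illustration, not Syrin's scoring code: the half-life value and function names are assumptions for the example.

```python
# Illustration only: rank memories by relevance multiplied by an
# exponential age decay, so stale memories fall behind fresh ones.

def decayed_score(relevance: float, age_hours: float, half_life_hours: float = 72.0) -> float:
    decay = 0.5 ** (age_hours / half_life_hours)  # halves every half_life_hours
    return relevance * decay

memories = [
    {"text": "User prefers TypeScript", "relevance": 0.9, "age_hours": 1},
    {"text": "User asked about Rust once", "relevance": 0.9, "age_hours": 720},
]
ranked = sorted(memories, key=lambda m: decayed_score(m["relevance"], m["age_hours"]), reverse=True)
print([m["text"] for m in ranked])
```

With equal relevance, the month-old memory scores a tiny fraction of the fresh one, which is the behavior a decay curve is meant to produce.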
**The Problem:** "What happened?" – no visibility into agent decisions.
**Syrin's Solution:** Two ways to see everything: programmatic hooks and CLI tracing.
```python
from syrin import Agent, Model

agent = Agent(
    model=Model.Almock(),
    debug=True,  # Console output for every lifecycle event
)

# Or subscribe to specific events
agent.events.on("llm.request_start", lambda ctx: print(f"LLM call #{ctx.iteration}"))
agent.events.on("budget.threshold", lambda ctx: print(f"Budget at {ctx.percentage}%"))
```

Run your agent script with the `--trace` flag for full observability without code changes:

```bash
# Enable full tracing
python my_agent.py --trace
```

What you get:
- LLM request/response logs
- Tool execution traces
- Budget usage per call
- Memory operations (store/recall)
- Token counts and context utilization
**The Problem:** Agents run wild and you get surprise bills.
**Syrin's Solution:** Built-in budget control with automatic stops.
```python
from syrin import Agent, Budget, BudgetThreshold, Model, RateLimit, stop_on_exceeded

# Per-run budget cap
agent = Agent(
    model=Model.OpenAI("gpt-4o-mini", api_key="..."),
    budget=Budget(run=0.50, on_exceeded=stop_on_exceeded),
)

# Budget thresholds (warn at 70%, switch model at 90%)
agent = Agent(
    budget=Budget(
        run=1.00,
        thresholds=[
            BudgetThreshold(at=70, action=lambda ctx: print("⚠️ 70% budget")),
            BudgetThreshold(at=90, action=lambda ctx: ctx.parent.switch_model("gpt-4o-mini")),
        ],
    ),
)

# Rate limiting
agent = Agent(
    budget=Budget(rate_limit=RateLimit(requests=10, window=60)),  # 10 req/min
)
```

Result: No surprise bills. Ever.
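The mechanics behind a per-run cap with threshold callbacks are simple to sketch. This is a library-agnostic illustration, not Syrin's `Budget` class; the `RunBudget` and `BudgetExceeded` names are invented for the example.

```python
# Illustration only: accumulate cost per call, fire threshold callbacks
# as spend crosses percentages, and refuse any charge that would exceed
# the cap (so spend never goes over it).

class BudgetExceeded(RuntimeError):
    pass

class RunBudget:
    def __init__(self, cap: float, thresholds=None):
        self.cap = cap
        self.spent = 0.0
        self.thresholds = sorted(thresholds or [])  # percentages, e.g. [70, 90]
        self._fired = set()

    def charge(self, cost: float) -> None:
        if self.spent + cost > self.cap:
            raise BudgetExceeded(f"${self.spent + cost:.2f} would exceed ${self.cap:.2f} cap")
        self.spent += cost
        pct = 100 * self.spent / self.cap
        for t in self.thresholds:
            if pct >= t and t not in self._fired:
                self._fired.add(t)
                print(f"budget at {t}%")

budget = RunBudget(cap=1.00, thresholds=[70, 90])
budget.charge(0.75)   # crosses 70%, fires that threshold once
try:
    budget.charge(0.30)
except BudgetExceeded as e:
    print("stopped:", e)
```

Checking *before* mutating `spent` is the design point: the cap is a hard ceiling, never an after-the-fact alarm.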
**The Problem:** Building multi-agent systems is complex.
**Syrin's Solution:** Simple primitives for powerful orchestration.
```python
from syrin import Agent, Model, DynamicPipeline

class Researcher(Agent):
    model = Model.Almock()
    system_prompt = "You research topics."

class Writer(Agent):
    model = Model.Almock()
    system_prompt = "You write reports."

# LLM decides which agents to spawn
pipeline = DynamicPipeline(agents=[Researcher, Writer], model=Model.Almock())
result = pipeline.run("Research AI trends and write a summary")
print(result.content, f"${result.cost:.4f}")

# Or manually:
researcher = Researcher()
result = researcher.handoff(Writer, "Write article from research", transfer_context=True)
```

Multi-Agent Patterns:
- Handoff – Route to specialist agents
- Spawn – Create sub-agents for subtasks
- DynamicPipeline – LLM orchestrates agent selection
- Parallel execution – Run multiple agents simultaneously
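The handoff pattern above boils down to threading one agent's output into the next as transferred context. Here is a minimal, library-agnostic sketch where each "agent" is just a function; the `researcher`/`writer` names are invented for the example and are not Syrin classes.

```python
# Illustration only: a sequential handoff where each step receives the
# previous step's output (the transferred context).

from typing import Callable

Step = Callable[[str], str]

def researcher(task: str) -> str:
    return f"notes on: {task}"

def writer(context: str) -> str:
    return f"report based on ({context})"

def run_pipeline(steps: list[Step], task: str) -> str:
    result = task
    for step in steps:  # hand off result to the next specialist
        result = step(result)
    return result

print(run_pipeline([researcher, writer], "AI trends"))
```

A dynamic pipeline replaces the fixed `steps` list with an LLM's choice of which specialist to invoke next, but the context-threading core stays the same.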
**The Problem:** Agents produce harmful or incorrect output.
**Syrin's Solution:** Built-in guardrails with automatic blocking.
```python
from syrin import Agent, Model, GuardrailChain
from syrin.guardrails import LengthGuardrail, ContentFilter

class SafeAgent(Agent):
    model = Model.Almock()
    guardrails = GuardrailChain([
        LengthGuardrail(max_length=4000),
        ContentFilter(blocked_words=["spam", "malicious"]),
    ])

result = SafeAgent().response("User input")
print(result.report.guardrail.passed)   # True/False
print(result.report.guardrail.blocked)  # True if blocked
```

Guardrail Types:
- Length – Max input/output length
- ContentFilter – Block harmful words
- PII Detection – Detect personal information
- Custom – Your validation logic
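The chain semantics are worth spelling out: guardrails run in order and the first failure blocks the output. A minimal, library-agnostic sketch (not Syrin's `GuardrailChain`; all names here are invented for the illustration):

```python
# Illustration only: each guardrail returns None on pass, or a reason
# string on failure; the chain blocks on the first failure.

from typing import Callable, Optional

Guardrail = Callable[[str], Optional[str]]

def length_guardrail(max_length: int) -> Guardrail:
    return lambda text: None if len(text) <= max_length else f"too long ({len(text)} chars)"

def content_filter(blocked_words: list[str]) -> Guardrail:
    def check(text: str) -> Optional[str]:
        hits = [w for w in blocked_words if w in text.lower()]
        return f"blocked words: {hits}" if hits else None
    return check

def run_chain(guardrails: list[Guardrail], text: str) -> tuple[bool, Optional[str]]:
    for g in guardrails:
        reason = g(text)
        if reason:
            return False, reason  # short-circuit on first failure
    return True, None

chain = [length_guardrail(4000), content_filter(["spam", "malicious"])]
print(run_chain(chain, "a normal reply"))  # (True, None)
print(run_chain(chain, "this is spam content"))
```

Short-circuiting keeps the cheap checks (length) ahead of the expensive ones (PII detection, custom model calls), which is the usual ordering rationale for a chain.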
**The Problem:** "How do I serve this to users?"
**Syrin's Solution:** One-line HTTP API + built-in playground.
```python
agent = Assistant()
agent.serve(port=8000, enable_playground=True, debug=True)
# Visit http://localhost:8000/playground
```

Features:
- ✅ HTTP API (`POST /chat`, `POST /stream`)
- ✅ Web playground (chat UI with cost display)
- ✅ Real-time observability panel
- ✅ Multi-agent support (agent selector)
- ✅ MCP server integration
**The Problem:** Need to run custom logic at specific points.
**Syrin's Solution:** 72+ hooks for every lifecycle event.
| Event | When It Fires |
|---|---|
| `LLM_REQUEST_START` | Before LLM call |
| `TOOL_CALL_START` | Before tool execution |
| `BUDGET_THRESHOLD` | Budget threshold reached |
| `CHECKPOINT_SAVED` | State saved |
| `CIRCUIT_TRIP` | Circuit breaker opens |
| `HANDOFF_START` | Agent hands off work |
| `SPAWN_START` | Sub-agent created |
| ... | 60+ more events |
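Under the hood, a hook system like this is a small event emitter: handlers register per event name and fire in order when the event is emitted. A generic sketch (not Syrin's `events` object; the `Events` class is invented for the illustration):

```python
# Illustration only: a minimal event emitter with on()/emit(), the shape
# of API a lifecycle-hook system typically exposes.

from collections import defaultdict

class Events:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event: str, handler):
        """Register a handler; handlers fire in registration order."""
        self._handlers[event].append(handler)

    def emit(self, event: str, payload=None):
        for handler in self._handlers[event]:
            handler(payload)

events = Events()
events.on("budget.threshold", lambda pct: print(f"budget at {pct}%"))
events.on("llm.request_start", lambda i: print(f"LLM call #{i}"))
events.emit("budget.threshold", 70)
```

Because handlers are plain callables keyed by string, adding the 73rd hook costs nothing: it is just another key in the registry.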
**The Problem:** "I need to change agent config without redeploying."
**Syrin's Solution:** Built-in remote configuration server.
```python
from syrin import Agent, Model, configure

# Configure agent remotely
configure(
    agent_id="my-agent",
    endpoint="https://config.syrin.ai",
    polling_interval=60,  # Check for updates every 60 seconds
)

agent = Agent(model=Model.OpenAI("gpt-4o-mini"))
agent.serve(port=8000)
```

Features:
- ✅ Change config without redeploying
- ✅ A/B testing support
- ✅ Feature flags
- ✅ Dynamic model switching
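The polling pattern behind remote configuration can be sketched generically: cache the last fetched config and only re-fetch once the polling interval has elapsed. This is an illustration of the idea, not Syrin's protocol; `PolledConfig` and its parameters are invented for the example.

```python
import time

# Illustration only: a time-based config cache. fetch() stands in for an
# HTTP call to the config endpoint; the clock is injectable for testing.

class PolledConfig:
    def __init__(self, fetch, interval: float = 60.0, clock=time.monotonic):
        self._fetch = fetch
        self._interval = interval
        self._clock = clock
        self._config = None
        self._fetched_at = None

    def get(self) -> dict:
        now = self._clock()
        if self._config is None or now - self._fetched_at >= self._interval:
            self._config = self._fetch()  # refresh from the remote endpoint
            self._fetched_at = now
        return self._config

remote = {"model": "gpt-4o-mini", "flags": {"new_prompt": True}}
config = PolledConfig(lambda: dict(remote), interval=60)
print(config.get()["model"])
```

Reads between polls hit the cache, so hot paths never block on the network; config changes simply take effect within one polling interval.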
| Feature | Syrin | "Others" |
|---|---|---|
| Budget control | ✅ Built-in, declarative | ❌ DIY or missing |
| Cost tracking | ✅ Every response | ❌ Guesswork |
| Agent memory | ✅ 4 types, auto-managed | ❌ Manual setup |
| Observability | ✅ 72+ hooks, full traces | ❌ Add-on tools |
| Multi-agent | ✅ Handoff, spawn, pipeline | ❌ Complex orchestration |
| Type-safe | ✅ StrEnum, mypy strict | ❌ String hell |
| Production API | ✅ One-line serve | ❌ Build Flask wrapper |
| Remote config | ✅ Built-in | ❌ DIY |
| Circuit breaking | ✅ Built-in | ❌ External library |
| Checkpoints | ✅ State persistence | ❌ DIY |
Voice AI Recruiter (examples/resume_agent)
A voice agent that handles recruiter calls using Syrin + Pipecat.
Features:
- Per-call budget limits ($0.50/call)
- Memory across conversations
- Real-time observability
- Cost tracking per call
Try it:
```bash
cd examples/resume_agent
python voice_server.py
```

Processes financial reports with tool calling, memory, and budget constraints.
Multi-agent system that researches topics and writes reports with full cost control.
| Resource | Description |
|---|---|
| Getting Started | 5-minute guide to your first agent |
| Examples | Runnable code for every use case |
| API Reference | Complete API documentation |
| Architecture | How Syrin works under the hood |
| Budget Control | Deep dive into budget features |
| Memory | Memory systems and backends |
| Multi-Agent | Handoff, spawn, DynamicPipeline |
We're building the agent library we wish existed: production-ready, financially safe, and actually observable.
Every star tells us this matters. It helps us prioritize features and shows the community that agents don't have to be black boxes.
Star Syrin if you want:
- ✅ Agents that don't surprise you with bills
- ✅ One library instead of 10 glued together
- ✅ Built-in observability (no more log scraping)
- ✅ Memory that actually works
- ✅ Multi-agent orchestration that's simple
We welcome contributions! See CONTRIBUTING.md for guidelines.
MIT License β see LICENSE for details.
Agents that ship. No surprise bills.
