Release Notes
v1.6.0 — Initial Public Release
Released: 2025
Welcome to the first public release of tmam — an open-source, OpenTelemetry-native observability platform for LLM applications, AI agents, and GPU workloads. This release includes the complete platform: the Python SDK, the backend API server, the observability dashboard, and the Docker Compose self-hosting stack.
🚀 What's Included
Python SDK (pip install tmam)
The tmam Python SDK instruments your AI stack automatically. Drop two lines into your application and every LLM call, agent step, and vector database query is traced with zero code changes to your business logic.

    from tmam import init
    init(url="...", public_key="...", secrect_key="...", application_name="my-app")

The SDK is built on OpenTelemetry 1.31.1 and exports traces, metrics, and events via OTLP HTTP to the tmam server (or any OTLP-compatible backend).
Auto-Instrumentation — 40+ Providers
tmam auto-detects and instruments every supported library present in your environment:
LLM Providers (17)
| Provider | Async |
|---|---|
| OpenAI | ✅ |
| Anthropic | ✅ |
| Cohere | ✅ |
| Mistral | ✅ |
| AWS Bedrock | — |
| Google Vertex AI | ✅ |
| Google AI Studio | — |
| Azure AI Inference | ✅ |
| Groq | ✅ |
| Ollama | — |
| vLLM | — |
| Together AI | ✅ |
| GPT4All | — |
| LiteLLM | — |
| Reka | ✅ |
| PremAI | — |
| AI21 | — |
Agent Frameworks (14)
LangChain · LlamaIndex · CrewAI · AG2 (AutoGen) · Haystack · Phidata · Dynamiq · ControlFlow · Julep · Mem0 · EmbedChain · MultiOn · Letta · OpenAI Agents SDK
Vector Databases (5)
Chroma · Pinecone · Qdrant · Milvus · Astra DB
Specialized (5)
ElevenLabs · AssemblyAI · HuggingFace Transformers · Crawl4AI · FireCrawl
Every instrumented call automatically captures: model name, provider, input/output token counts, estimated cost, request duration, time-to-first-token (streaming), time-between-tokens (streaming), finish reason, response ID, prompt text, and completion text.
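To make the streaming fields concrete, here is a minimal sketch of how time-to-first-token, time-between-tokens, and estimated cost could be derived from raw measurements. The helper name, the field names, and the per-1K-token prices are assumptions for illustration; this is not the SDK's internal code.

```python
from dataclasses import dataclass


@dataclass
class StreamMetrics:
    time_to_first_token: float   # seconds from request start to first token
    time_between_tokens: float   # mean gap between consecutive tokens, seconds
    estimated_cost: float        # in the currency of the (assumed) price table


def summarize_stream(request_start, token_times, input_tokens, output_tokens,
                     price_in_per_1k, price_out_per_1k):
    """Derive latency and cost figures; prices are illustrative assumptions."""
    if not token_times:
        raise ValueError("token_times must contain at least one timestamp")
    ttft = token_times[0] - request_start
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    tbt = sum(gaps) / len(gaps) if gaps else 0.0
    cost = (input_tokens / 1000) * price_in_per_1k \
        + (output_tokens / 1000) * price_out_per_1k
    return StreamMetrics(ttft, tbt, cost)
```

A non-streaming call has no token timestamps, which is why those two fields are marked as streaming-only above.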
GPU Monitoring
Enable with collect_gpu_stats=True. Supports NVIDIA (via pynvml) and AMD (via amdsmi) GPUs.
Metrics collected per GPU: utilization, encoder/decoder utilization, temperature, fan speed, memory (available / used / free / total), power draw, and power limit — all tagged with GPU index, UUID, and name.
Manual Tracing
For code not covered by auto-instrumentation, two primitives are available:
- @trace decorator — wraps any sync or async function in a span, captures args and return value
- start_trace() context manager — creates a named span with full attribute control via the TracedSpan API
Both support role classification ("agent", "tool", "llm", "memory", "embedding", "vectordb", "framework") for proper display in the trace hierarchy.
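The idea of one decorator covering both sync and async functions can be sketched with the standard library alone. The `trace_sketch` name and the in-memory `RECORDED_SPANS` list are hypothetical stand-ins; the real decorator emits OpenTelemetry spans and its signature may differ.

```python
import asyncio
import functools
import inspect
import time

RECORDED_SPANS = []  # stand-in for a real span exporter


def trace_sketch(role="tool"):
    """Illustrative decorator; NOT the tmam @trace implementation."""
    def decorator(func):
        def record(start, args, kwargs, result):
            RECORDED_SPANS.append({
                "name": func.__name__,
                "role": role,
                "args": args,
                "kwargs": kwargs,
                "result": result,
                "duration_s": time.perf_counter() - start,
            })

        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = await func(*args, **kwargs)
                record(start, args, kwargs, result)
                return result
            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            record(start, args, kwargs, result)
            return result
        return sync_wrapper
    return decorator


@trace_sketch(role="tool")
def add(a, b):
    return a + b


@trace_sketch(role="agent")
async def greet(name):
    await asyncio.sleep(0)
    return f"hello {name}"
```

This sketch records only successful calls; a production decorator would also record exceptions on the span before re-raising them.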
Prompt Hub
Fetch versioned, published prompts from the tmam server at runtime:

    from tmam import get_prompt
    prompt = get_prompt(name="support-agent", version=3)

Supports fetching by name, prompt ID, version number, and label. Template variable compilation ({{variable}}) is handled server-side.
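Because compilation happens on the server, the following is only a local illustration of what `{{variable}}` substitution means. The function name, the tolerance for whitespace inside the braces, and the error raised for a missing variable are all assumptions.

```python
import re

_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def compile_template(template, variables):
    """Replace each {{name}} with variables[name]; raise if a name is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])
    return _VARIABLE.sub(substitute, template)
```

For example, `compile_template("Hello {{name}}", {"name": "Ada"})` yields `"Hello Ada"`.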
Vault — Encrypted Secret Retrieval
Fetch encrypted secrets stored in the tmam Vault and optionally inject them directly into environment variables:

    from tmam import get_secrets
    get_secrets(should_set_env=True)  # inject all vault secrets into os.environ

Built-In Evaluations
The tmam.evals module provides LLM-as-judge evaluators that run against any text:
- Hallucination — detects factual inaccuracies, contradictions, and nonsensical responses against provided context
- ToxicityDetector — identifies harmful, abusive, or offensive language
- BiasDetector — detects gender, racial, political, and other biases
- All — runs all three simultaneously
Supports any LLM provider (OpenAI, Anthropic, Cohere, Mistral, Gemini, Grok) as the judge. Configurable detection threshold (0.0–1.0).
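The role of the detection threshold can be shown with a small sketch. The default of 0.5, the inclusive comparison, and the yes/no verdict shape are assumptions made for illustration, not documented behavior of tmam.evals.

```python
def apply_threshold(score, threshold=0.5):
    """Map a judge score in [0.0, 1.0] to a yes/no verdict (assumed semantics)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    # Assumption: a score at or above the threshold counts as a detection.
    return {"score": score, "verdict": "yes" if score >= threshold else "no"}
```

Raising the threshold trades recall for precision: fewer borderline outputs are flagged, at the risk of missing some real problems.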
Guardrails — Runtime Input/Output Safety
The Detect class checks user inputs or model outputs against configured guardrail rules before or after LLM calls:

    from tmam import Detect
    result = Detect().input(text="user message", guardrail_id="...")
    # { "verdict": "yes"|"no", "score": 0.95, "guard": "Prompt Injection", ... }

Detection types: LLM-Based (contextual) and Regex-Based (pattern matching)
Guard types: All · Prompt Injection · Sensitive Topics · Topic Restriction
19 detection categories including impersonation, obfuscation, SQL injection, simple instruction override, few-shot manipulation, hypothetical scenario, and more.
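Regex-based detection is the simpler of the two detection types and can be sketched locally. The two patterns, the `category` field, and the fixed scores below are hypothetical; real rules are configured on the server and the actual result may carry different fields.

```python
import re

# Hypothetical rules: category name -> pattern.
RULES = {
    "simple instruction override": re.compile(
        r"ignore (all|any|previous) .*instructions", re.IGNORECASE),
    "SQL injection": re.compile(
        r"'\s*(or|and)\s+1\s*=\s*1|;\s*drop\s+table", re.IGNORECASE),
}


def detect_regex(text, rules=RULES):
    """Return a dict loosely shaped like the Detect() result shown above."""
    for category, pattern in rules.items():
        if pattern.search(text):
            return {"verdict": "yes", "score": 1.0,
                    "guard": "Prompt Injection", "category": category}
    return {"verdict": "no", "score": 0.0, "guard": "none", "category": "none"}
```

Pattern matching is fast and predictable but easy to evade with rephrasing, which is where the LLM-based (contextual) detection type is the better fit.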
Dataset & Experiment Framework
Create structured test datasets and run experiments with custom task functions and evaluator functions. Results are tracked in the dashboard for comparison across runs:

    result = dataset.run_experiment(
        name="gpt-4o-run",
        task=my_task_fn,
        evaluators=[my_scorer],
        max_concurrency=10,
    )

🖥️ Platform — API Server (v1.0.0)
The tmam server is a Node.js/TypeScript application built with Express 4.21, the MongoDB Node.js driver 6.14, and protobufjs 7.4 for OTLP ingestion.
Authentication
- Email & password signup and sign-in with bcrypt password hashing (configurable salt rounds)
- Google Sign-In via Google OAuth 2.0 (google-auth-library 9.15)
- Email confirmation — new accounts require email verification before access
- Forgot password — time-limited reset link via email (24-hour expiry)
- Session management — RSA-signed JWTs (RS256) cross-validated against the database on every request; logout immediately invalidates the token server-side
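Cross-validating a signed token against a server-side registry is what makes logout take effect immediately. The sketch below shows the pattern in Python; note that it substitutes an HMAC-SHA256 signature for RS256 (the standard library has no RSA) and an in-memory set for the database, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets

SIGNING_KEY = secrets.token_bytes(32)  # stand-in for the server's RSA key pair
ACTIVE_TOKENS = set()                  # stand-in for the database token registry


def _sign(payload: str) -> str:
    return hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str) -> str:
    """Create a signed token and register it as active."""
    payload = f"{user_id}.{secrets.token_hex(8)}"
    token = f"{payload}.{_sign(payload)}"
    ACTIVE_TOKENS.add(token)
    return token


def is_valid(token: str) -> bool:
    """A token must carry a valid signature AND still be registered."""
    payload, _, signature = token.rpartition(".")
    signature_ok = hmac.compare_digest(signature, _sign(payload))
    return signature_ok and token in ACTIVE_TOKENS


def logout(token: str) -> None:
    """Remove the token from the registry so later checks fail."""
    ACTIVE_TOKENS.discard(token)
```

Without the registry lookup, a correctly signed token would stay usable until it expired, even after the user logged out.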
Email Notifications (SendGrid)
All transactional emails are sent via SendGrid (@sendgrid/mail 8.1):
| Event | Email sent |
|---|---|
| New registration | Email confirmation link |
| Forgot password | Password reset link (24h expiry) |
| Team member invitation | Invite with role and organization name |
Email sending is gracefully skipped if SENDGRID_API_KEY is not configured.
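The graceful-skip behavior amounts to checking for the key before attempting delivery. The server itself is written in TypeScript; this Python sketch only mirrors the logic, and the function name and the `sender` callable are hypothetical.

```python
import os


def send_transactional_email(to, subject, body, sender=None):
    """Return True if the message was handed to the sender, False if skipped."""
    api_key = os.environ.get("SENDGRID_API_KEY")
    if not api_key:
        return False  # no key configured: skip quietly instead of raising
    if sender is not None:
        sender(api_key, to, subject, body)  # hypothetical transport callable
    return True
```

The practical consequence for self-hosters is that signup still works without SendGrid, but confirmation, reset, and invitation emails will not arrive.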
Security
- RSA-signed JWT tokens with server-side token registry (instant revocation on logout)
- bcrypt-hashed API secret keys — plaintext never stored; verified at runtime with bcrypt.compare
- AES-192-CBC encryption for vault secrets (IV-based, scrypt-derived key)
- AES-256-CBC PBE encryption for internal use (PBKDF2-derived key, 10,000 iterations)
- Role-based access control enforced via stacked middleware on every protected endpoint
- Helmet HTTP security headers on all routes
- CORS with configurable allow-list
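The two key-derivation schemes named above differ mainly in algorithm and key length. The sketch below derives keys of the right size with Python's standard library; the scrypt cost parameters, the SHA-256 digest for PBKDF2, and the function names are assumptions, and the AES encryption step itself is omitted because it requires a third-party library.

```python
import hashlib
import os


def derive_vault_key(passphrase: str, salt: bytes) -> bytes:
    """scrypt-derived 24-byte key (AES-192 uses 192-bit keys)."""
    return hashlib.scrypt(passphrase.encode(), salt=salt, n=2**14, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=24)


def derive_pbe_key(passphrase: str, salt: bytes) -> bytes:
    """PBKDF2-derived 32-byte key (AES-256 uses 256-bit keys), 10,000 iterations."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 10_000, dklen=32)


def new_iv() -> bytes:
    """AES-CBC uses a 16-byte IV regardless of key size."""
    return os.urandom(16)
```

The IV must be freshly generated for every encryption and stored alongside the ciphertext; only the passphrase needs to stay secret.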
Organizations & Teams
- Create and manage multiple organizations from a single account
- Create projects within organizations (the observability namespace for API keys and data)
- Invite members by email with role assignment; invitation email sent automatically
- Full membership lifecycle: invited → pending → approved/rejected → active/left
- Role management: admins can update member roles at any time
- Approve/reject pending join requests from the Notifications panel
- Permission enforcement: limited-access members cannot modify org-level settings or manage other members
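The membership lifecycle reads naturally as a small state machine. The transition table below is one possible interpretation of the arrows in the list above, not the server's actual rule set, and the function name is hypothetical.

```python
# One possible reading of the lifecycle; the server's real rules may differ.
ALLOWED_TRANSITIONS = {
    "invited": {"pending"},
    "pending": {"approved", "rejected"},
    "approved": {"active"},
    "active": {"left"},
    "rejected": set(),
    "left": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move membership from {current!r} to {target!r}")
    return target
```

Encoding the transitions as data keeps the rules in one place and makes invalid jumps, such as rejected to active, fail loudly.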
Observations & Tracing
- OTLP protobuf ingestion for traces, metrics, and events via /api/sdk/*
- Stores full OpenTelemetry span data: trace ID, span ID, parent span ID, span kind, service name, resource attributes, scope, span attributes, duration (nanoseconds), status code, and status message
- Separate views for Requests (flat list), Tracing (hierarchical waterfall), and Exceptions (error spans)
- Rich filtering: time range, operation type, status code, model, provider, environment, application name
- Supports both histogram and sum metric types from the OTel metrics pipeline
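The difference between the flat Requests list and the hierarchical Tracing view comes down to following parent span IDs. Here is a minimal sketch of that grouping; the dictionary keys (`span_id`, `parent_span_id`, `start_ns`, `name`) are assumed field names, not the server's storage schema.

```python
from collections import defaultdict


def build_waterfall(spans):
    """Group spans by parent and return (depth, name) rows in display order."""
    span_ids = {span["span_id"] for span in spans}
    children = defaultdict(list)
    for span in spans:
        parent = span.get("parent_span_id")
        # Spans whose parent is absent from the batch are treated as roots.
        children[parent if parent in span_ids else None].append(span)

    rows = []

    def visit(parent_id, depth):
        for span in sorted(children[parent_id], key=lambda s: s["start_ns"]):
            rows.append((depth, span["name"]))
            visit(span["span_id"], depth + 1)

    visit(None, 0)
    return rows
```

Sorting siblings by start time gives the familiar waterfall ordering, with the depth value driving indentation.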
Analytics
- LLM Analytics: token usage over time, cost by application, top models by usage, model usage trends, request volume
- GPU Analytics: per-GPU utilization, memory, temperature, power draw
- Vector DB Analytics: operations over time, latency, breakdown by collection and system
- Guardrail Analytics: detection rate, category breakdowns, per-application metrics
- Scores Analytics: evaluation score trends and distributions
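Views such as cost by application and top models by usage are straightforward aggregations over request records. The sketch below illustrates them over plain dictionaries; the record keys and function names are assumptions, and the server presumably computes these in the database rather than in application code.

```python
from collections import Counter, defaultdict


def cost_by_application(records):
    """Sum the estimated cost of all requests per application name."""
    totals = defaultdict(float)
    for record in records:
        totals[record["application"]] += record["cost"]
    return dict(totals)


def top_models(records, n=3):
    """Return the n most frequently used models as (model, count) pairs."""
    return Counter(record["model"] for record in records).most_common(n)
```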
Prompt Management
- Versioned prompt storage with draft and published states
- Supports text and chat prompt types (with roles: system, user, assistant, developer, placeholder)
- Version history with labels, tags, and custom meta properties
- Prompt Hub SDK endpoint for runtime fetching (/api/sdk/prompt/compiled)
- OpenGround — live multi-model comparison playground supporting: OpenAI, Anthropic, Cohere, Mistral, Gemini, Grok, Llama, Deepseek
Guardrails
- Create named guardrail rules with configurable detection type, guard type, threshold, valid/invalid topics, and custom regex rules
- Supports a default guardrail assignment per organization/project
- SDK endpoint for real-time input and output detection (/api/sdk/guardrail/detect)
- Evaluation templates with scoring and reasoning fields for AI Arbiter use
Vault
- Encrypted secret storage (AES-192-CBC at rest)
- Key names are normalized to UPPER_SNAKE_CASE automatically
- SDK endpoint for secret retrieval with optional environment variable injection
- Scoped to organization and project
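Key normalization and environment injection can be illustrated together. The exact normalization rules the server applies are not documented here, so the camelCase splitting and separator handling below are assumptions, as are both function names.

```python
import os
import re


def to_upper_snake_case(name: str) -> str:
    """Normalize names such as 'openaiApiKey' or 'openai-api key' to OPENAI_API_KEY."""
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())  # split camelCase
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)                   # separators -> _
    return name.strip("_").upper()


def inject_secrets(secrets: dict, environ=os.environ) -> list:
    """Write each secret into environ under its normalized key; return the keys set."""
    keys = []
    for raw_name, value in secrets.items():
        key = to_upper_snake_case(raw_name)
        environ[key] = value
        keys.append(key)
    return keys
```

Passing a plain dictionary as `environ` makes the behavior easy to test without touching the real process environment.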
API Keys
- Per-organization, per-project SDK key pairs (public key + bcrypt-hashed secret)
- Keys used for all SDK authentication (X-Public-Key / X-Secret-Key headers)
- Management UI in dashboard Settings
🌐 Dashboard (v0.1.0)
The tmam dashboard is a Next.js 15 application using the App Router, React 19, TailwindCSS, Radix UI, and Recharts.
Pages included:
| Section | Pages |
|---|---|
| Auth | Login, Register, Confirm Email, Forgot Password |
| Observations | Requests, Tracing, Exceptions (with detail views) |
| Analytics | LLM, GPU, Vector DB, Models, Guardrails, Scores |
| Prompt Management | Prompts (versioned), OpenGround (comparison playground) |
| Evaluation | Datasets, Guardrails, AI Arbiter |
| Settings | API Keys, Vault, Organizations, Models, Templates |
| Account | Profile, Notifications, Feedback, Support |
🐳 Docker Compose
Self-host the complete platform with a single command:

    docker compose up --build

Services: tmam-server (port 5050) · tmam-web (port 3001) · tmam-mongo (port 27018)
Nginx reverse proxy configuration included (commented out by default) for TLS termination in production.
📦 Dependencies
Python SDK
| Package | Version |
|---|---|
| opentelemetry-api | ^1.31.1 |
| opentelemetry-sdk | ^1.31.1 |
| opentelemetry-exporter-otlp | ^1.31.1 |
| opentelemetry-instrumentation | ^0.52b1 |
| openai | ^1.68.2 |
| anthropic | ^0.49.0 |
| requests | ^2.32.3 |
| pydantic | ^2.10.6 |
Server
| Package | Version |
|---|---|
| express | ^4.21.2 |
| mongodb | ^6.14.2 |
| jsonwebtoken | ^9.0.2 |
| bcryptjs | ^3.0.3 |
| protobufjs | ^7.4.0 |
| @sendgrid/mail | ^8.1.5 |
| helmet | ^8.0.0 |
| openai | ^4.89.0 |
| @anthropic-ai/sdk | ^0.40.0 |
| @google/genai | ^1.17.0 |
| @mistralai/mistralai | ^1.6.0 |
| google-auth-library | ^9.15.1 |
Dashboard
| Package | Version |
|---|---|
| next | ^15.5.6 |
| react | ^19.0.0 |
| next-auth | ^4.24.11 |
| @mui/material | ^7.0.2 |
| recharts | ^2.15.2 |
| @tanstack/react-table | ^8.21.2 |
| zod | ^3.24.2 |
| zustand | ^5.0.3 |
☁️ Cloud
Don't want to self-host? The managed cloud platform at [cloud.tmam.ai](https://cloud.tmam.ai) provides the same full feature set with zero infrastructure to manage.
🔗 Resources
- Repository: [github.com/tmam-dev/tmam](https://github.com/tmam-dev/tmam)
- Documentation: [docs.tmam.ai](https://docs.tmam.ai)
- Cloud: [cloud.tmam.ai](https://cloud.tmam.ai)
- Python SDK (PyPI): pip install tmam
- License: Apache 2.0