Initial Public Release

@niccanordhas niccanordhas released this 22 Feb 01:07
Release Notes


v1.6.0 — Initial Public Release

Released: 2025

Welcome to the first public release of tmam — an open-source, OpenTelemetry-native observability platform for LLM applications, AI agents, and GPU workloads. This release includes the complete platform: the Python SDK, the backend API server, the observability dashboard, and the Docker Compose self-hosting stack.


🚀 What's Included

Python SDK (pip install tmam)

The tmam Python SDK instruments your AI stack automatically. Drop two lines into your application and every LLM call, agent step, and vector database query is traced with zero code changes to your business logic.

from tmam import init
init(url="...", public_key="...", secrect_key="...", application_name="my-app")

The SDK is built on OpenTelemetry 1.31.1 and exports traces, metrics, and events via OTLP HTTP to the tmam server (or any OTLP-compatible backend).


Auto-Instrumentation — 40+ Providers

tmam auto-detects and instruments every supported library present in your environment:

LLM Providers (17)

| Provider | Async |
| --- | --- |
| OpenAI | ✅ |
| Anthropic | ✅ |
| Cohere | ✅ |
| Mistral | ✅ |
| AWS Bedrock | — |
| Google Vertex AI | ✅ |
| Google AI Studio | — |
| Azure AI Inference | ✅ |
| Groq | ✅ |
| Ollama | — |
| vLLM | — |
| Together AI | ✅ |
| GPT4All | — |
| LiteLLM | — |
| Reka | ✅ |
| PremAI | — |
| AI21 | — |



Agent Frameworks (14)

LangChain · LlamaIndex · CrewAI · AG2 (AutoGen) · Haystack · Phidata · Dynamiq · ControlFlow · Julep · Mem0 · EmbedChain · MultiOn · Letta · OpenAI Agents SDK

Vector Databases (5)

Chroma · Pinecone · Qdrant · Milvus · Astra DB

Specialized (5)

ElevenLabs · AssemblyAI · HuggingFace Transformers · Crawl4AI · FireCrawl

Every instrumented call automatically captures: model name, provider, input/output token counts, estimated cost, request duration, time-to-first-token (streaming), time-between-tokens (streaming), finish reason, response ID, prompt text, and completion text.
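
The two streaming metrics are easiest to see in a small sketch. The helper below is not tmam's implementation: `measure_stream` and its injectable `clock` parameter are hypothetical names, and time-between-tokens is assumed here to be the mean gap between consecutive chunks.

```python
import time
from typing import Callable, Iterable


def measure_stream(chunks: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and record streaming latency metrics."""
    start = clock()
    arrivals = []    # timestamp taken as each chunk is received
    text_parts = []
    for chunk in chunks:
        arrivals.append(clock())
        text_parts.append(chunk)

    # Time-to-first-token: delay between the request and the first chunk.
    ttft = arrivals[0] - start if arrivals else None
    # Time-between-tokens: mean gap between consecutive chunks (assumption).
    gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    tbt = sum(gaps) / len(gaps) if gaps else None

    return {
        "completion": "".join(text_parts),
        "time_to_first_token": ttft,
        "time_between_tokens": tbt,
        "duration": clock() - start,
    }
```

Passing the clock in as a parameter keeps the measurement logic testable without real delays.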


GPU Monitoring

Enable with collect_gpu_stats=True. Supports NVIDIA (via pynvml) and AMD (via amdsmi) GPUs.

Metrics collected per GPU: utilization, encoder/decoder utilization, temperature, fan speed, memory (available / used / free / total), power draw, and power limit — all tagged with GPU index, UUID, and name.


Manual Tracing

For code not covered by auto-instrumentation, two primitives are available:

  • @trace decorator — wraps any sync or async function in a span, captures args and return value
  • start_trace() context manager — creates a named span with full attribute control via the TracedSpan API

Both support role classification ("agent", "tool", "llm", "memory", "embedding", "vectordb", "framework") for proper display in the trace hierarchy.
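
To illustrate what wrapping "any sync or async function in a span" involves, here is a minimal pure-Python sketch. It is not the tmam `@trace` decorator: `trace_sketch` and `RECORDED_SPANS` are hypothetical names, and spans are appended to a list instead of being exported.

```python
import asyncio
import functools
import inspect
import time

RECORDED_SPANS = []  # stand-in for an exporter; hypothetical, for illustration only


def trace_sketch(role="tool"):
    """Hypothetical decorator: record name, role, args, result, and duration."""

    def decorator(fn):
        def record(args, kwargs, result, started):
            RECORDED_SPANS.append({
                "name": fn.__name__,
                "role": role,
                "args": args,
                "kwargs": kwargs,
                "result": result,
                "duration_s": time.perf_counter() - started,
            })

        if inspect.iscoroutinefunction(fn):
            # Async functions need an async wrapper so the span covers the await.
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                started = time.perf_counter()
                result = await fn(*args, **kwargs)
                record(args, kwargs, result, started)
                return result
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            started = time.perf_counter()
            result = fn(*args, **kwargs)
            record(args, kwargs, result, started)
            return result
        return sync_wrapper

    return decorator


@trace_sketch(role="tool")
def add(a, b):
    return a + b


@trace_sketch(role="agent")
async def plan(goal):
    await asyncio.sleep(0)  # placeholder for real async work
    return f"plan for {goal}"
```

Calling `add(1, 2)` or `asyncio.run(plan("demo"))` behaves exactly like the undecorated function and leaves one entry per call in `RECORDED_SPANS`.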


Prompt Hub

Fetch versioned, published prompts from the tmam server at runtime:

from tmam import get_prompt
prompt = get_prompt(name="support-agent", version=3)

Supports fetching by name, prompt ID, version number, and label. Template variable compilation ({{variable}}) is handled server-side.
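
Compilation happens on the server, so the sketch below only shows the general idea of `{{variable}}` substitution. `compile_template` is a hypothetical helper, and both the whitespace tolerance and the choice to leave unknown variables untouched are assumptions, not documented tmam behavior.

```python
import re

# Matches {{name}} and tolerates inner whitespace such as {{ name }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def compile_template(template: str, variables: dict) -> str:
    """Replace each {{name}} with variables[name]; unknown names are left as-is."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)
```

For example, `compile_template("Hello {{name}}", {"name": "Ada"})` produces `Hello Ada`.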


Vault — Encrypted Secret Retrieval

Fetch encrypted secrets stored in the tmam Vault and optionally inject them directly into environment variables:

from tmam import get_secrets
get_secrets(should_set_env=True)  # inject all vault secrets into os.environ
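
What environment injection amounts to can be sketched in a few lines. `inject_secrets` and its `overwrite` flag are hypothetical; the release notes do not say whether `get_secrets` overwrites variables that are already set.

```python
import os


def inject_secrets(secrets: dict, overwrite: bool = True) -> list:
    """Copy secret values into os.environ and return the names that were set."""
    applied = []
    for name, value in secrets.items():
        if not overwrite and name in os.environ:
            continue  # keep whatever the process already has
        os.environ[name] = str(value)  # environment values must be strings
        applied.append(name)
    return applied
```

Anything injected this way is visible to child processes started afterwards, which is worth keeping in mind when the secrets are sensitive.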

Built-In Evaluations

The tmam.evals module provides LLM-as-judge evaluators that run against any text:

  • Hallucination — detects factual inaccuracies, contradictions, and nonsensical responses against provided context
  • ToxicityDetector — identifies harmful, abusive, or offensive language
  • BiasDetector — detects gender, racial, political, and other biases
  • All — runs all three simultaneously

The judge can be any of the supported LLM providers (OpenAI, Anthropic, Cohere, Mistral, Gemini, Grok). The detection threshold is configurable from 0.0 to 1.0.


Guardrails — Runtime Input/Output Safety

The Detect class checks user inputs or model outputs against configured guardrail rules before or after LLM calls:

from tmam import Detect
result = Detect().input(text="user message", guardrail_id="...")
# { "verdict": "yes"|"no", "score": 0.95, "guard": "Prompt Injection", ... }

Detection types: LLM-Based (contextual) and Regex-Based (pattern matching)

Guard types: All · Prompt Injection · Sensitive Topics · Topic Restriction

There are 19 detection categories, including impersonation, obfuscation, SQL injection, simple instruction override, few-shot manipulation, and hypothetical scenario.
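
The regex-based detection type can be pictured with a small sketch. The patterns, the `detect` function, and the `category` field are illustrative assumptions and not tmam's built-in rules; only the `verdict`, `score`, and `guard` keys mirror the result shape shown above.

```python
import re

# Illustrative patterns only; real guardrail rules are configured on the server.
RULES = [
    ("simple instruction override",
     re.compile(r"ignore (all|any|the) (previous|prior) instructions", re.IGNORECASE)),
    ("SQL injection",
     re.compile(r"\b(union\s+select|drop\s+table)\b", re.IGNORECASE)),
]


def detect(text: str) -> dict:
    """Return a 'yes' verdict for the first matching pattern, else 'no'."""
    for category, pattern in RULES:
        if pattern.search(text):
            return {
                "verdict": "yes",
                "score": 1.0,  # a regex either matches or it does not
                "guard": "Prompt Injection",
                "category": category,  # hypothetical field
            }
    return {"verdict": "no", "score": 0.0, "guard": None, "category": None}
```

Pattern matching is cheap and deterministic but easy to evade with rephrasing, which is presumably why an LLM-based (contextual) detection type is offered alongside it.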


Dataset & Experiment Framework

Create structured test datasets and run experiments with custom task functions and evaluator functions. Results are tracked in the dashboard for comparison across runs:

result = dataset.run_experiment(
    name="gpt-4o-run",
    task=my_task_fn,
    evaluators=[my_scorer],
    max_concurrency=10,
)
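
The shape of such a run (a task applied to every dataset item under a concurrency cap, followed by scoring) can be sketched with the standard library. `run_experiment_sketch` is a hypothetical stand-in for `dataset.run_experiment`, and the `(item, output)` evaluator signature is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def run_experiment_sketch(items, task, evaluators, max_concurrency=10):
    """Run task on every item with at most max_concurrency in flight, then score."""

    def run_one(item):
        output = task(item)
        scores = {ev.__name__: ev(item, output) for ev in evaluators}
        return {"input": item, "output": output, "scores": scores}

    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map preserves input order even when tasks finish out of order.
        return list(pool.map(run_one, items))


def my_task_fn(item):
    return item["question"].upper()  # stand-in for an LLM call


def my_scorer(item, output):
    return 1.0 if output == item["expected"] else 0.0
```

Threads suit this workload because LLM calls spend most of their time waiting on the network.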

🖥️ Platform — API Server (v1.0.0)

The tmam server is a Node.js/TypeScript application built with Express 4.21, MongoDB 6.14, and protobufjs 7.4 for OTLP ingestion.

Authentication

  • Email & password signup and sign-in with bcrypt password hashing (configurable salt rounds)
  • Google Sign-In via Google OAuth 2.0 (google-auth-library 9.15)
  • Email confirmation — new accounts require email verification before access
  • Forgot password — time-limited reset link via email (24-hour expiry)
  • Session management — RSA-signed JWTs (RS256) cross-validated against the database on every request; logout immediately invalidates the token server-side

Email Notifications (SendGrid)

All transactional emails are sent via SendGrid (@sendgrid/mail 8.1):

| Event | Email |
| --- | --- |
| New registration | Email confirmation link |
| Forgot password | Password reset link (24h expiry) |
| Team member invitation | Invite with role and organization name |

Email sending is gracefully skipped if SENDGRID_API_KEY is not configured.

Security

  • RSA-signed JWT tokens with server-side token registry (instant revocation on logout)
  • bcrypt-hashed API secret keys — plaintext never stored; verified at runtime with bcrypt.compare
  • AES-192-CBC encryption for vault secrets (IV-based, scrypt-derived key)
  • AES-256-CBC PBE encryption for internal use (PBKDF2-derived key, 10,000 iterations)
  • Role-based access control enforced via stacked middleware on every protected endpoint
  • Helmet HTTP security headers on all routes
  • CORS with configurable allow-list

Organizations & Teams

  • Create and manage multiple organizations from a single account
  • Create projects within organizations (the observability namespace for API keys and data)
  • Invite members by email with role assignment; invitation email sent automatically
  • Full membership lifecycle: invited → pending → approved/rejected → active/left
  • Role management: admins can update member roles at any time
  • Approve/reject pending join requests from the Notifications panel
  • Permission enforcement: limited-access members cannot modify org-level settings or manage other members

Observations & Tracing

  • OTLP protobuf ingestion for traces, metrics, and events via /api/sdk/*
  • Stores full OpenTelemetry span data: trace ID, span ID, parent span ID, span kind, service name, resource attributes, scope, span attributes, duration (nanoseconds), status code, and status message
  • Separate views for Requests (flat list), Tracing (hierarchical waterfall), and Exceptions (error spans)
  • Rich filtering: time range, operation type, status code, model, provider, environment, application name
  • Supports both histogram and sum metric types from OTel metrics pipeline

Analytics

  • LLM Analytics: token usage over time, cost by application, top models by usage, model usage trends, request volume
  • GPU Analytics: per-GPU utilization, memory, temperature, power draw
  • Vector DB Analytics: operations over time, latency, breakdown by collection and system
  • Guardrail Analytics: detection rate, category breakdowns, per-application metrics
  • Scores Analytics: evaluation score trends and distributions

Prompt Management

  • Versioned prompt storage with draft and published states
  • Supports text and chat prompt types (with roles: system, user, assistant, developer, placeholder)
  • Version history with labels, tags, and custom meta properties
  • Prompt Hub SDK endpoint for runtime fetching (/api/sdk/prompt/compiled)
  • OpenGround — live multi-model comparison playground supporting: OpenAI, Anthropic, Cohere, Mistral, Gemini, Grok, Llama, Deepseek

Guardrails

  • Create named guardrail rules with configurable detection type, guard type, threshold, valid/invalid topics, and custom regex rules
  • Supports a default guardrail assignment per organization/project
  • SDK endpoint for real-time input and output detection (/api/sdk/guardrail/detect)
  • Evaluation templates with scoring and reasoning fields for AI Arbiter use

Vault

  • Encrypted secret storage (AES-192-CBC at rest)
  • Key names are normalized to UPPER_SNAKE_CASE automatically
  • SDK endpoint for secret retrieval with optional environment variable injection
  • Scoped to organization and project
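
As an illustration of the key-name normalization, here is one way to produce UPPER_SNAKE_CASE. `to_upper_snake` is a hypothetical helper, and the server's exact rules (for example, how it treats camelCase or punctuation) are not specified in these notes.

```python
import re


def to_upper_snake(name: str) -> str:
    """Normalize e.g. 'openaiApiKey' or 'openai-api key' to 'OPENAI_API_KEY'."""
    # Insert a separator at lowercase/digit -> uppercase boundaries (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse every run of non-alphanumeric characters into one underscore.
    collapsed = re.sub(r"[^A-Za-z0-9]+", "_", spaced)
    return collapsed.strip("_").upper()
```

Normalizing at write time means the stored name is already a valid environment variable name when secrets are later injected.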

API Keys

  • Per-organization, per-project SDK key pairs (public key + bcrypt-hashed secret)
  • Keys used for all SDK authentication (X-Public-Key / X-Secret-Key headers)
  • Management UI in dashboard Settings

🌐 Dashboard (v0.1.0)

The tmam dashboard is a Next.js 15 application using the App Router, React 19, TailwindCSS, Radix UI, and Recharts.

Pages included:

| Section | Pages |
| --- | --- |
| Auth | Login, Register, Confirm Email, Forgot Password |
| Observations | Requests, Tracing, Exceptions (with detail views) |
| Analytics | LLM, GPU, Vector DB, Models, Guardrails, Scores |
| Prompt Management | Prompts (versioned), OpenGround (comparison playground) |
| Evaluation | Datasets, Guardrails, AI Arbiter |
| Settings | API Keys, Vault, Organizations, Models, Templates |
| Account | Profile, Notifications, Feedback, Support |

🐳 Docker Compose

Self-host the complete platform with a single command:

docker compose up --build

Services: tmam-server (port 5050) · tmam-web (port 3001) · tmam-mongo (port 27018)

Nginx reverse proxy configuration included (commented out by default) for TLS termination in production.


📦 Dependencies

Python SDK

| Package | Version |
| --- | --- |
| opentelemetry-api | ^1.31.1 |
| opentelemetry-sdk | ^1.31.1 |
| opentelemetry-exporter-otlp | ^1.31.1 |
| opentelemetry-instrumentation | ^0.52b1 |
| openai | ^1.68.2 |
| anthropic | ^0.49.0 |
| requests | ^2.32.3 |
| pydantic | ^2.10.6 |

Server

| Package | Version |
| --- | --- |
| express | ^4.21.2 |
| mongodb | ^6.14.2 |
| jsonwebtoken | ^9.0.2 |
| bcryptjs | ^3.0.3 |
| protobufjs | ^7.4.0 |
| @sendgrid/mail | ^8.1.5 |
| helmet | ^8.0.0 |
| openai | ^4.89.0 |
| @anthropic-ai/sdk | ^0.40.0 |
| @google/genai | ^1.17.0 |
| @mistralai/mistralai | ^1.6.0 |
| google-auth-library | ^9.15.1 |

Dashboard

| Package | Version |
| --- | --- |
| next | ^15.5.6 |
| react | ^19.0.0 |
| next-auth | ^4.24.11 |
| @mui/material | ^7.0.2 |
| recharts | ^2.15.2 |
| @tanstack/react-table | ^8.21.2 |
| zod | ^3.24.2 |
| zustand | ^5.0.3 |

☁️ Cloud

Don't want to self-host? The managed cloud platform at [cloud.tmam.ai](https://cloud.tmam.ai) provides the same full feature set with zero infrastructure to manage.


🔗 Resources