langsmith-otel-redact

Demonstrates using LangChain with OpenTelemetry to:

  1. Redact LLM outputs before forwarding traces to LangSmith
  2. Derive span metrics (latency, throughput, error rate) from traces
  3. Visualize those metrics in Grafana via Prometheus

Architecture

┌─────────────────────┐
│   Python App        │
│  (LangChain + OTel) │
└────────┬────────────┘
         │ OTLP/HTTP :4318
         ▼
┌───────────────────────────────────────────────────────┐
│                  OTel Collector                        │
│                                                       │
│  Processors:                                          │
│    batch → transform/mask_llm_output                  │
│    (replaces gen_ai.completion with REDACTED)         │
│                                                       │
│  Connectors:                                          │
│    spanmetrics → derives metrics from trace spans     │
│    (latency histograms, call counts, by model/op)     │
│                                                       │
│  Exporters:                                           │
│  ┌───────────┬──────────────┬──────────┬───────────┐  │
│  │  debug    │  file        │ otlphttp/│prometheus │  │
│  │           │  (collector. │ langsmith│ :8889     │  │
│  │           │   log)       │          │           │  │
│  └───────────┴──────────────┴──────────┴───────────┘  │
└──────────────────────┬─────────────────┬──────────────┘
                       │ OTLP/HTTP       │ /metrics
                       ▼                 ▼
            ┌──────────────────┐  ┌──────────────┐
            │  LangSmith       │  │  Prometheus  │
            │  (redacted       │  │  :9090       │
            │   traces)        │  └──────┬───────┘
            └──────────────────┘         │
                                         ▼
                                  ┌──────────────┐
                                  │  Grafana     │
                                  │  :3000       │
                                  └──────────────┘
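The masking step in the diagram can be sketched in collector config roughly as follows (a minimal illustration using OTTL; the processor name matches the diagram, but the exact attribute key is an assumption about how this repo names the completion attribute):

```yaml
processors:
  batch: {}
  # Rewrite the LLM completion attribute before any exporter sees it.
  transform/mask_llm_output:
    trace_statements:
      - context: span
        statements:
          - set(attributes["gen_ai.completion"], "REDACTED") where attributes["gen_ai.completion"] != nil
```

Because the transform runs inside the collector pipeline, every exporter downstream (including otlphttp/langsmith) only ever sees the redacted value.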

Prerequisites

  • uv
  • Docker and Docker Compose
  • An OpenAI API key
  • A LangSmith API key

Setup

  1. Create a .env file and fill in your API keys:
LANGSMITH_OTEL_ENABLED="true"
LANGSMITH_TRACING="true"
LANGSMITH_OTEL_ONLY="true"
LANGSMITH_API_KEY="<your-langsmith-api-key>"
LANGSMITH_PROJECT="<your-langsmith-project>"
OPENAI_API_KEY="<your-openai-api-key>"

The LANGSMITH_API_KEY variable is passed through to the OTel collector container via docker-compose, where it is used for the x-api-key header when exporting to LangSmith.
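The LangSmith exporter section of the collector config typically looks something like this (a sketch; the endpoint URL and header wiring are assumptions about how this repo is configured, not copied from it):

```yaml
exporters:
  otlphttp/langsmith:
    # LangSmith's OTLP ingest endpoint; the API key arrives via docker-compose.
    endpoint: https://api.smith.langchain.com/otel
    headers:
      x-api-key: ${env:LANGSMITH_API_KEY}
```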

Running

Start the infrastructure

docker compose up

This starts:

  • OTel Collector — receives traces on :4318, exposes metrics on :8889
  • Prometheus — scrapes collector metrics, available at localhost:9090
  • Grafana — pre-configured dashboard, available at localhost:3000 (no login required)

Install dependencies

uv sync

Run the application

uv run main.py

This invokes a LangChain chain across multiple topics. Traces are sent via OTLP to the collector, which:

  1. Redacts the LLM output and forwards traces to LangSmith
  2. Derives span metrics (latency, call count) and exposes them to Prometheus
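Conceptually, the redaction step behaves like the following pure-Python sketch (an illustration only, not the collector's actual implementation; the attribute keys follow OpenTelemetry gen_ai conventions):

```python
def mask_llm_output(span_attributes: dict) -> dict:
    """Return a copy of the span attributes with any LLM completion
    text replaced by a redaction marker, mirroring what the collector's
    transform/mask_llm_output processor does to each span."""
    redacted = dict(span_attributes)
    for key in redacted:
        if key.startswith("gen_ai.completion"):
            redacted[key] = "REDACTED"
    return redacted

attrs = {
    "gen_ai.request.model": "gpt-4o-mini",
    "gen_ai.completion": "Otters hold hands while sleeping.",
}
print(mask_llm_output(attrs))
# The model label survives for metrics; only the completion text is masked.
```

This is why the Grafana breakdowns by model still work: only output content is redacted, while the dimensional attributes pass through untouched.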

View the dashboard

Open localhost:3000 and navigate to the LangChain Span Metrics dashboard.

LangChain Span Metrics in Grafana

You'll see:

  • Request rate — calls per second over time
  • Latency percentiles — p50/p95/p99 latency by operation
  • Total calls & error rate — summary stats
  • Average latency by operation — bar gauge
  • Latency heatmap — distribution over time
  • Breakdowns by model and operation — pie charts

Run uv run main.py multiple times to generate more data points.
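The dashboard panels are built from Prometheus queries along these lines (hedged examples; the exact label names depend on how the spanmetrics dimensions are sanitized by Prometheus):

```
# Request rate: calls per second across all operations
sum(rate(langchain_calls_total[1m]))

# p95 latency by operation
histogram_quantile(0.95,
  sum(rate(langchain_duration_milliseconds_bucket[5m])) by (le, gen_ai_operation_name))
```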

Span Metrics

The OTel collector's spanmetrics connector automatically derives the following metrics from trace spans:

Metric                              Description
langchain_calls_total               Total number of span calls, labeled by operation and model
langchain_duration_milliseconds_*   Latency histogram (bucket/sum/count) for each span

These are broken down by dimensions: gen_ai.system, gen_ai.request.model, and gen_ai.operation.name.
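A spanmetrics connector configuration producing these metric names would look roughly like this (a sketch; the langchain namespace and ms unit are inferred from the metric names above rather than copied from the repo's config):

```yaml
connectors:
  spanmetrics:
    namespace: langchain            # -> langchain_calls_total, langchain_duration_*
    histogram:
      unit: ms                      # -> *_milliseconds_* histogram series
    dimensions:
      - name: gen_ai.system
      - name: gen_ai.request.model
      - name: gen_ai.operation.name
```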
