This step-by-step guide walks you through adding OpenTelemetry (OTEL) distributed tracing to the Children's Story Studio backend and viewing traces directly inside VS Code using the AI Toolkit extension. No Docker, no external dashboards — everything stays in your editor.
Prefer a browser-based dashboard? The Aspire Dashboard variant of this guide uses the .NET Aspire Dashboard (Docker) as the trace viewer instead.
Note: Unlike the Activity Page Agents and TTS guides, this guide does not use GitHub Copilot to generate the implementation. Every code change is provided directly — you'll copy the code, understand what it does, and wire it in yourself.
- What You'll Build
- Why AI Toolkit?
- Before You Start
- Step 1: Configure AI Toolkit for Tracing
- Step 2: Add OTEL Python Packages
- Step 3: Add OTEL Settings
- Step 4: Create the Telemetry Module
- Step 5: Wire Telemetry into the App
- Step 6: Generate a Story and View Traces
- What to Look For
- Troubleshooting
After completing this guide, every story generation request will emit distributed traces that flow through the entire multi-agent workflow:
    POST /api/generate-story                 ← FastAPI auto-instrumented span
    │
    ├── Workflow: run                        ← Agent Framework workflow span
    │   ├── Executor: orchestrator           ← Agent Framework executor spans
    │   │   └── LLM: chat/completions
    │   ├── Executor: story_architect
    │   │   └── LLM: chat/completions
    │   ├── Executor: art_director
    │   │   ├── LLM: chat/completions
    │   │   ├── Image: generate (page 1)     ← Parallel image generation
    │   │   ├── Image: generate (page 2)
    │   │   └── ...
    │   ├── Executor: story_reviewer
    │   │   └── LLM: chat/completions
    │   └── Executor: decision
    │
    └── Response streamed via SSE
You'll view these traces in the AI Toolkit's built-in trace viewer — right inside VS Code.
| Benefit | Description |
|---|---|
| No Docker required | Unlike the Aspire Dashboard, AI Toolkit runs entirely within VS Code — no containers to manage |
| Stay in your editor | View traces, inspect LLM calls, and debug agents without context-switching to a browser |
| Agent-aware trace view | AI Toolkit understands agent framework traces and presents them with agent-specific context |
| Prompt inspection | Click into any LLM call span to see the full prompt and response — ideal for prompt engineering |
| Zero additional setup for tracing | AI Toolkit includes a built-in OTLP receiver — just point your app at it |
NOTE: If you do not want to implement OTEL observability in the app on your own, feel free to experiment with prompting GitHub Copilot to follow the documentation in this file to perform the implementation.
Ensure you've completed all steps in Prerequisites & Setup and that the base application is working (Running the Demo).
Confirm that the AI Toolkit for VS Code extension is installed and enabled:
- Open the Extensions panel (`Cmd+Shift+X` / `Ctrl+Shift+X`)
- Search for "AI Toolkit"
- Verify that AI Toolkit for Visual Studio Code (by Microsoft) shows as installed and enabled
If not installed, click Install now.
Then create a working branch from an up-to-date `main`:

    git checkout main
    git pull origin main
    git checkout -b my-otel-observability

AI Toolkit includes a built-in OTLP receiver that can accept OpenTelemetry traces from your application.
- Open the AI Toolkit panel in the VS Code sidebar (look for the AI Toolkit icon)
- Navigate to the Tracing section
- Start the OTLP trace receiver. AI Toolkit will display the endpoint URL and port it's listening on (typically `http://localhost:4317`)
Note: Record the endpoint URL displayed by AI Toolkit. You'll use it in Step 3 when configuring the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable. If AI Toolkit uses a different port than `4317`, adjust accordingly.
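Before moving on, it can help to confirm that something is actually listening on the receiver port. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical, and the host and port are the typical defaults mentioned above, so adjust them if AI Toolkit shows a different endpoint. A successful TCP connection only shows that a listener exists, not that it speaks OTLP.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumed defaults; replace with the endpoint AI Toolkit displays.
    host, port = "localhost", 4317
    state = "listening" if is_port_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the script reports "not reachable", start the trace receiver in AI Toolkit and run it again.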
Good news: `agent-framework-core` already bundles `opentelemetry-api`, `opentelemetry-sdk`, and `opentelemetry-semantic-conventions-ai` as transitive dependencies. You only need to add the OTLP exporter (to send data to AI Toolkit) and the FastAPI instrumentor (to auto-trace HTTP requests).
Open backend/requirements.txt and append the following lines at the end of the file:
    # OpenTelemetry (api + sdk are already included via agent-framework-core)
    opentelemetry-exporter-otlp-proto-grpc>=1.28.0
    opentelemetry-instrumentation-fastapi>=0.49b0

Then install the new dependencies:

    cd backend
    source .venv/bin/activate
    pip install -r requirements.txt

Or run the Backend: Install Python deps VS Code task from the Command Palette.
What these packages do:
| Package | Purpose |
|---|---|
| `opentelemetry-exporter-otlp-proto-grpc` | Exports spans to any OTLP-compatible receiver (AI Toolkit, Aspire, Jaeger, etc.) over gRPC |
| `opentelemetry-instrumentation-fastapi` | Auto-instruments FastAPI: creates a span for every incoming HTTP request |
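To confirm that the install succeeded, you can check whether the relevant modules are importable from the active virtual environment. This is a small sketch, not part of the application: `missing_modules` is a hypothetical helper, and the two dotted import names are assumed to be the ones these packages provide.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except ImportError:
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names assumed to correspond to the two packages added above.
    required = [
        "opentelemetry.exporter.otlp.proto.grpc.trace_exporter",
        "opentelemetry.instrumentation.fastapi",
    ]
    print("Missing:", missing_modules(required) or "none")
```

Run it with the backend virtual environment activated; anything listed as missing suggests the `pip install` step did not complete in that environment.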
Agent Framework uses a combination of standard OpenTelemetry environment variables and Agent Framework–specific environment variables to control observability. Add the following to your backend/.env file:
    # ── Agent Framework Observability ──────────────────────────────────────────
    # Activates Agent Framework's built-in instrumentation (spans for agent
    # invocations, LLM chat calls, and tool executions).
    ENABLE_INSTRUMENTATION=true
    # Includes prompts, completions, function arguments and results in span
    # attributes. See the WARNING below before enabling.
    ENABLE_SENSITIVE_DATA=true
    # ── Standard OpenTelemetry ─────────────────────────────────────────────────
    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
    OTEL_SERVICE_NAME=children-story-studio

Important: Verify that `OTEL_EXPORTER_OTLP_ENDPOINT` matches the endpoint shown by AI Toolkit's trace receiver (from Step 1). The default is `http://localhost:4317`.

⚠️ WARNING (Sensitive Data): When `ENABLE_SENSITIVE_DATA=true`, Agent Framework records full prompt text, LLM responses, function call arguments, and function results as span attributes. This is extremely useful for debugging during development, but may expose personally identifiable information (PII), API keys embedded in prompts, or other confidential data in your trace viewer. Only enable this in development or test environments. Set `ENABLE_SENSITIVE_DATA=false` (or remove the variable entirely) before deploying to any shared or production environment.
What these variables do:
| Variable | Default | Description |
|---|---|---|
| `ENABLE_INSTRUMENTATION` | `false` | Activates Agent Framework's OpenTelemetry instrumentation code paths. Without this, the framework will not emit `invoke_agent`, `chat`, or `execute_tool` spans |
| `ENABLE_SENSITIVE_DATA` | `false` | When `true`, includes prompt/response content and function arguments in span attributes. Development only; see the warning above |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | (none) | The OTLP gRPC endpoint. Points to AI Toolkit's built-in receiver on port 4317 |
| `OTEL_SERVICE_NAME` | `agent_framework` | The service name that appears in the trace viewer |
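A quick sanity check of these variables can catch typos before you start the backend. The sketch below is illustrative only: `check_otel_env` is a hypothetical helper, and it assumes that only the literal string `true` (in any letter case) counts as enabled, which may not match how Agent Framework parses the values.

```python
from urllib.parse import urlparse


def _looks_like_endpoint(url: str) -> bool:
    """True for URLs of the form http(s)://host:port."""
    try:
        parsed = urlparse(url)
        return (
            parsed.scheme in ("http", "https")
            and bool(parsed.hostname)
            and parsed.port is not None
        )
    except ValueError:
        # urlparse raises ValueError for a non-numeric or out-of-range port.
        return False


def check_otel_env(env: dict[str, str]) -> list[str]:
    """Return human-readable warnings for the observability variables."""
    warnings = []

    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    if not endpoint:
        warnings.append("OTEL_EXPORTER_OTLP_ENDPOINT is not set")
    elif not _looks_like_endpoint(endpoint):
        warnings.append(f"endpoint {endpoint!r} should look like http://host:port")

    # Assumption: only "true" (case-insensitive) enables a flag.
    if env.get("ENABLE_INSTRUMENTATION", "false").strip().lower() != "true":
        warnings.append("ENABLE_INSTRUMENTATION is not 'true'; agent spans will be missing")

    if env.get("ENABLE_SENSITIVE_DATA", "false").strip().lower() == "true":
        warnings.append("ENABLE_SENSITIVE_DATA is on; use in development only")

    return warnings


if __name__ == "__main__":
    import os

    for line in check_otel_env(dict(os.environ)) or ["no problems found"]:
        print(line)
```

Because the function takes a plain dictionary, it can be pointed at `os.environ` after the `.env` file has been loaded, or at a hand-built dictionary in a unit test.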
Next, open backend/app/config.py and add a single new field to the Settings class — a master switch that lets you disable OTEL entirely without removing environment variables:
    from pydantic_settings import BaseSettings, SettingsConfigDict


    class Settings(BaseSettings):
        model_config = SettingsConfigDict(
            env_file=".env",
            env_file_encoding="utf-8",
            case_sensitive=False,
            extra="ignore",
        )

        foundry_project_endpoint: str = ""
        foundry_model_deployment_name: str = "gpt-4o"
        foundry_image_model_deployment_name: str = "gpt-image-1"

        # CORS origin for the React dev server
        cors_origin: str = "http://localhost:5173"

        # OpenTelemetry master switch (set to False to disable without removing env vars)
        otel_enabled: bool = True


    settings = Settings()

The only new field is `otel_enabled`. All other OTEL configuration is handled by the standard environment variables above, which Agent Framework's `configure_otel_providers()` reads automatically.
Create a new file at backend/app/telemetry.py with the following contents:

    """
    telemetry.py: OpenTelemetry bootstrap for the story-generation backend.

    Uses Agent Framework's built-in ``configure_otel_providers()`` to set up
    the TracerProvider, exporters, and Agent Framework instrumentation from
    environment variables. Also auto-instruments FastAPI so every incoming
    HTTP request gets its own trace span automatically.

    See: https://learn.microsoft.com/en-us/agent-framework/agents/observability
    """

    import logging

    from agent_framework.observability import configure_otel_providers
    from fastapi import FastAPI
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

    from .config import settings

    logger = logging.getLogger(__name__)


    def configure_telemetry(app: FastAPI) -> None:
        """Set up OTEL providers via Agent Framework and instrument FastAPI.

        This must be called **before** any agent-framework imports that create
        workflows or executors, so the framework can pick up the active
        TracerProvider and emit its own spans.

        Does nothing if ``settings.otel_enabled`` is False.
        """
        if not settings.otel_enabled:
            logger.info("OpenTelemetry is disabled (OTEL_ENABLED=false)")
            return

        # 1. Configure OTEL providers. Reads OTEL_EXPORTER_OTLP_ENDPOINT,
        #    OTEL_SERVICE_NAME, ENABLE_INSTRUMENTATION, and ENABLE_SENSITIVE_DATA
        #    from environment variables automatically.
        configure_otel_providers()

        # 2. Auto-instrument FastAPI (creates a parent span for every HTTP request).
        #    This is NOT covered by configure_otel_providers(), so we add it here.
        FastAPIInstrumentor.instrument_app(app)

        logger.info("OpenTelemetry configured via Agent Framework")

What this does:
- `configure_otel_providers()`: Agent Framework's built-in bootstrap function. It:
  - Reads `OTEL_EXPORTER_OTLP_ENDPOINT` and creates an OTLP gRPC exporter targeting AI Toolkit's receiver
  - Creates a `TracerProvider` (plus log and metric providers) with `OTEL_SERVICE_NAME` as the service resource
  - Reads `ENABLE_INSTRUMENTATION`: when `true`, activates the framework's instrumentation code paths so it emits `invoke_agent`, `chat`, and `execute_tool` spans automatically
  - Reads `ENABLE_SENSITIVE_DATA`: when `true`, includes prompt text, LLM responses, and function arguments/results as span attributes
  - Registers everything as the global OTEL providers
- `FastAPIInstrumentor.instrument_app(app)`: Wraps every FastAPI route handler to create a parent span for each HTTP request. This is separate from Agent Framework's instrumentation and must be added explicitly.
Now modify backend/app/main.py to call configure_telemetry() at startup.
Important (Import Ordering): The `story_workflow` object in `workflow.py` is a module-level singleton, so it is created the moment `workflow.py` is imported. That import chain starts when `main.py` imports `StoryGenerator` (which imports `workflow.py`). For Agent Framework to emit its own spans, the global `TracerProvider` must be active before the workflow is built. This means we need to configure telemetry before importing `StoryGenerator`.
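The general pitfall with module-level singletons can be illustrated without any framework code. The sketch below uses entirely hypothetical names (`set_provider`, `Workflow`) and a plain string as a stand-in for a provider; it does not reflect how Agent Framework resolves its tracer internally, only why an object built too early can miss configuration applied later.

```python
# Stand-in for a global provider registry; all names here are hypothetical.
_active_provider = "noop"


def set_provider(name: str) -> None:
    """Register `name` as the active provider for objects created afterwards."""
    global _active_provider
    _active_provider = name


class Workflow:
    """Captures whichever provider is active at construction time."""

    def __init__(self) -> None:
        self.provider = _active_provider


# Wrong order: the object is built before telemetry is configured.
early_workflow = Workflow()

set_provider("otlp")

# Right order: configure first, then build.
late_workflow = Workflow()

print(early_workflow.provider)  # noop
print(late_workflow.provider)   # otlp
```

Moving the `StoryGenerator` import below the telemetry call plays the same role as constructing `late_workflow` after `set_provider(...)` in this sketch.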
Replace the contents of backend/app/main.py with:

    """
    main.py: FastAPI application entry point.

    Endpoints:
      GET  /api/health          - health check
      POST /api/generate-story  - runs the story workflow; streams SSE progress events
    """

    import logging

    from dotenv import load_dotenv
    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware
    from sse_starlette.sse import EventSourceResponse

    load_dotenv()  # Agent Framework reads env vars directly, so load .env early

    from .config import settings  # noqa: E402
    from .models import StoryRequest  # noqa: E402

    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)-8s %(name)s — %(message)s",
    )
    logger = logging.getLogger(__name__)

    # ─── App ──────────────────────────────────────────────────────────────────────
    app = FastAPI(
        title="Children's Story Multi-Agent API",
        description=(
            "Multi-agent orchestration for generating illustrated children's stories "
            "using Microsoft Agent Framework."
        ),
        version="1.0.0",
    )

    app.add_middleware(
        CORSMiddleware,
        allow_origins=[settings.cors_origin, "http://localhost:5173", "http://localhost:5174"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # ─── Telemetry (must be configured BEFORE importing StoryGenerator) ───────────
    from .telemetry import configure_telemetry  # noqa: E402

    configure_telemetry(app)

    # ─── Service instances ────────────────────────────────────────────────────────
    from .story_generator import StoryGenerator  # noqa: E402

    _story_generator = StoryGenerator()


    # ─── Health check ─────────────────────────────────────────────────────────────
    @app.get("/api/health")
    async def health() -> dict:
        return {"status": "ok", "service": "children-story-multi-agent"}


    # ─── Story generation (SSE) ───────────────────────────────────────────────────
    @app.post("/api/generate-story")
    async def generate_story(request: StoryRequest) -> EventSourceResponse:
        """Accepts story parameters and streams back SSE events as the multi-agent
        workflow progresses. The final event (type: 'complete') contains the
        full illustrated StoryResponse.
        """
        return _story_generator.event_source_response(request)

What changed (compared to the original main.py):
- `load_dotenv()` was added near the top of the file. Agent Framework's `configure_otel_providers()` reads environment variables directly (via `os.environ`), not through Pydantic settings, so we must ensure the `.env` file is loaded into the process environment before calling it.
- The `from .story_generator import StoryGenerator` import was moved down; it now appears after `configure_telemetry(app)` is called.
- Two new lines were added between the CORS middleware and the service instances:

      from .telemetry import configure_telemetry
      configure_telemetry(app)

- The `# noqa: E402` comments suppress the linter warning about imports not being at the top of the file. This is intentional: the import ordering is required for correct OTEL initialization.
Everything else — the endpoints, CORS config, logging — is unchanged.
Open the AI Toolkit panel in VS Code and ensure the trace receiver is running (from Step 1).
Start the backend:

    cd backend
    source .venv/bin/activate
    uvicorn app.main:app --reload --port 8000

You should see a log line confirming telemetry is active:

    OpenTelemetry configured via Agent Framework

Then start the frontend in a separate terminal:

    cd frontend
    npm run dev

Open the app at http://localhost:5173, fill in the story form, and click Generate Story. Wait for the story to complete.
Switch back to VS Code and open the AI Toolkit panel. Navigate to the Tracing section — you should see the trace for the POST /api/generate-story request. Click on it to expand the trace waterfall and explore each agent's spans.
When examining a trace in AI Toolkit, look for these patterns:
The top-level span is the FastAPI HTTP request (POST /api/generate-story). Beneath it, you should see spans emitted by Agent Framework for the workflow and each agent. The framework uses OpenTelemetry GenAI Semantic Conventions for span naming:
| Span Name Pattern | What It Represents |
|---|---|
| `POST /api/generate-story` | The full HTTP request lifecycle (auto-instrumented by FastAPI) |
| `invoke_agent <agent_name>` | Each agent invocation: the top-level span for an agent's work within an executor |
| `chat <model_name>` | An LLM chat completion call. When `ENABLE_SENSITIVE_DATA=true`, the prompt and response text appear as span attributes |
| `execute_tool <function_name>` | A function tool execution (if your agents use tools). Includes arguments and results when sensitive data is enabled |
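If you ever filter or group span names outside the trace viewer (for example in a test or a small script), the naming patterns in the table can be matched on the first token of the name. This is a rough sketch: `classify_span` is a hypothetical helper, and the sample names in the usage lines are made up to fit the patterns above.

```python
def classify_span(name: str) -> str:
    """Map a span name to a coarse category based on its leading token."""
    prefixes = {
        "invoke_agent": "agent",
        "chat": "llm",
        "execute_tool": "tool",
    }
    first = name.split(" ", 1)[0]
    if first in prefixes:
        return prefixes[first]
    # HTTP server spans start with the method, e.g. "POST /api/generate-story".
    if first in {"GET", "POST", "PUT", "PATCH", "DELETE"}:
        return "http"
    return "other"


if __name__ == "__main__":
    # Illustrative names only; real agent and model names depend on your setup.
    for sample in ["POST /api/generate-story", "invoke_agent story_architect", "chat gpt-4o"]:
        print(f"{sample!r} -> {classify_span(sample)}")
```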
The trace view shows each executor's duration. You'll typically see:
- Orchestrator — Fast (single LLM call to create an outline)
- StoryArchitect — Moderate (one LLM call to write the full narrative)
- ArtDirector — Longest (LLM call for image prompts + parallel image generation calls)
- StoryReviewer — Moderate (one LLM call to review the draft)
- Decision — Near-instant (routing logic only)
One of AI Toolkit's strengths is the ability to click into any chat span and see the full prompt and response directly in VS Code. With ENABLE_SENSITIVE_DATA=true, you can:
- Inspect the system prompt sent to each agent
- Review the LLM response for each agent invocation
- See token counts (`gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens`) for cost tracking
- Compare prompt/response pairs across revision loops to see how the story evolves
This makes AI Toolkit particularly useful for prompt engineering — you can iterate on prompts in code, regenerate a story, and immediately inspect the results without leaving your editor.
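For rough cost tracking, the two token attributes can be summed across the `chat` spans of a trace. The sketch below assumes you have the span attributes available as plain dictionaries; that shape, the `total_tokens` helper, and the sample numbers are all made up for illustration and do not describe an AI Toolkit export format.

```python
def total_tokens(spans: list[dict]) -> dict[str, int]:
    """Sum input and output token counts over spans that report them."""
    totals = {"input": 0, "output": 0}
    for span in spans:
        attrs = span.get("attributes", {})
        # Spans without usage attributes (e.g. invoke_agent) contribute zero.
        totals["input"] += int(attrs.get("gen_ai.usage.input_tokens", 0))
        totals["output"] += int(attrs.get("gen_ai.usage.output_tokens", 0))
    return totals


# Invented sample data shaped like the attributes named above.
sample_spans = [
    {"name": "chat gpt-4o", "attributes": {
        "gen_ai.usage.input_tokens": 812, "gen_ai.usage.output_tokens": 240}},
    {"name": "chat gpt-4o", "attributes": {
        "gen_ai.usage.input_tokens": 1350, "gen_ai.usage.output_tokens": 410}},
    {"name": "invoke_agent story_reviewer", "attributes": {}},
]

print(total_tokens(sample_spans))  # {'input': 2162, 'output': 650}
```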
If the StoryReviewer rejects a draft, you'll see the workflow loop back to Orchestrator. This appears as repeated executor spans — a second pass through Orchestrator → StoryArchitect → ArtDirector → StoryReviewer → Decision.
Inside the ArtDirector executor span, look for multiple image generation spans running concurrently (up to 5 in parallel, controlled by the semaphore in the existing code).
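The concurrency cap behind those parallel spans follows a common `asyncio.Semaphore` pattern. The sketch below is not the project's code: the function names are hypothetical, and a short sleep stands in for the real image call. It only shows why at most five image spans overlap at any moment.

```python
import asyncio

MAX_PARALLEL = 5  # the limit described above


async def generate_image(page: int, gate: asyncio.Semaphore, stats: dict) -> str:
    """Placeholder for an image call; sleeps instead of calling a model."""
    async with gate:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)
        stats["running"] -= 1
    return f"image-{page}"


async def generate_all(pages: int) -> tuple[list[str], int]:
    """Run one task per page and report the peak number running at once."""
    gate = asyncio.Semaphore(MAX_PARALLEL)
    stats = {"running": 0, "peak": 0}
    images = await asyncio.gather(
        *(generate_image(page, gate, stats) for page in range(1, pages + 1))
    )
    return list(images), stats["peak"]


if __name__ == "__main__":
    images, peak = asyncio.run(generate_all(12))
    print(len(images), "images generated; peak concurrency:", peak)
```

In a trace, this pattern shows up as image spans that start in batches: a new span begins only when one of the running ones ends.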
If no traces appear in AI Toolkit:

- Check that the trace receiver is running. Open the AI Toolkit panel and verify the OTLP receiver is active and listening.
- Check the backend logs. You should see `OpenTelemetry configured via Agent Framework`. If you see `OpenTelemetry is disabled`, check that `OTEL_ENABLED` is not set to `false` in your `.env`.
- Verify `load_dotenv()` is called. Agent Framework's `configure_otel_providers()` reads environment variables directly from `os.environ`, not from Pydantic settings. If `load_dotenv()` is missing from `main.py`, the `.env` values for `OTEL_EXPORTER_OTLP_ENDPOINT`, `ENABLE_INSTRUMENTATION`, etc. won't be available.
- Verify the OTLP endpoint matches. The `OTEL_EXPORTER_OTLP_ENDPOINT` in your `.env` must match the port AI Toolkit's receiver is listening on. The default is `http://localhost:4317`.
- Generate at least one story. Traces only appear after a request has been made. The health check endpoint (`GET /api/health`) will also generate a span if you want a quick test.
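For that quick test, any HTTP client works. The following standard-library sketch assumes the backend is running on port 8000 as started in Step 6; `ping_health` is a hypothetical helper name.

```python
import json
from urllib.request import urlopen


def ping_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict:
    """GET /api/health on the backend and return the decoded JSON body."""
    with urlopen(f"{base_url}/api/health", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `ping_health()` while the backend is running should return the payload defined by the health endpoint in `main.py`, and a matching `GET /api/health` span should appear in the trace viewer shortly afterwards. If the backend is not running, the call raises `urllib.error.URLError`.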
If the HTTP span appears but Agent Framework spans are missing:

- `ENABLE_INSTRUMENTATION` is not set to `true`. Without this environment variable, Agent Framework will not emit `invoke_agent`, `chat`, or `execute_tool` spans even if a `TracerProvider` is active. Check your `backend/.env` file.
- Import ordering. The `TracerProvider` must be registered as the global provider before the `story_workflow` singleton is created. Double-check that your `main.py` follows the import ordering from Step 5.
- Try the Aspire Dashboard variant to see how the same traces look in a browser-based waterfall view, which is useful for demos and screen sharing.
- Experiment with custom spans: add manual tracing to specific operations using `tracer.start_as_current_span()`.
- If you haven't already, try the Activity Page Agents guide or Text-to-Speech guide to extend the application with new capabilities.