feat(voice): realtime_langwatch_session context manager for live OpenAI Realtime apps (#673) #676
Walkthrough
Changes: Live Tracing for OpenAI Realtime
Sequence Diagram
sequenceDiagram
participant App as Live App
participant RLS as RealtimeLangWatchSession
participant OTel as TracerProvider
participant LW as LangWatch OTLP
App->>RLS: async with realtime_langwatch_session(name, model, api_key)
RLS->>OTel: detect existing concrete provider
alt no concrete provider and API key present
RLS->>OTel: install new TracerProvider + BatchSpanProcessor → LangWatch OTLP
else concrete provider already exists
RLS->>OTel: attach BatchSpanProcessor to existing provider
else no API key
RLS-->>App: enter no-op mode (zero spans, no raises)
end
RLS->>OTel: start root span, attach to context
RLS-->>App: yield self
loop per turn
App->>RLS: await log_turn(user_transcript, agent_transcript, model, latency_ms)
RLS->>OTel: create + end child "realtime_turn" span with attributes
OTel->>LW: BatchSpanProcessor.on_end (export, failures swallowed)
end
App->>RLS: exit async with
RLS->>OTel: end root span, detach context token
RLS->>OTel: force_flush() if session owns provider
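The lifecycle in the diagram can be sketched with a stdlib-only stand-in. This is not the code in `python/scenario/_tracing/live.py`: `ToySpan`, `ToySession`, `demo`, the `exported` list, and the `"example-model"` string are hypothetical names chosen for illustration, and the two provider branches are collapsed into a single "tracing on" path. The sketch only shows no-op mode without a key, one child `realtime_turn` span per `log_turn()` call, and the root span ending on exit.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToySpan:
    """Hypothetical stand-in for an OTel span (not the SDK type)."""

    name: str
    parent: Optional["ToySpan"] = None
    attributes: dict[str, Any] = field(default_factory=dict)
    ended: bool = False


class ToySession:
    """Stand-in async context manager that mirrors the diagram's lifecycle."""

    def __init__(self, name: str, model: Optional[str] = None,
                 api_key: Optional[str] = None) -> None:
        self.name = name
        self.model = model
        self.api_key = api_key
        self.root: Optional[ToySpan] = None
        self.exported: list[ToySpan] = []  # stands in for the OTLP export path

    async def __aenter__(self) -> "ToySession":
        if not self.api_key:
            return self  # no-op mode: no root span, later calls record nothing
        self.root = ToySpan(self.name)
        if self.model is not None:
            self.root.attributes["model"] = self.model
        return self

    async def log_turn(self, user_transcript: str, agent_transcript: str,
                       model: Optional[str] = None,
                       latency_ms: Optional[float] = None) -> None:
        if self.root is None:
            return  # no-op mode
        turn = ToySpan(
            "realtime_turn",
            parent=self.root,
            attributes={
                "type": "llm",
                "input": user_transcript,
                "output": agent_transcript,
                "model": model or self.model,  # per-turn override wins
                "latency_ms": latency_ms,
            },
        )
        turn.ended = True  # created and ended in one step, as in the diagram
        self.exported.append(turn)

    async def __aexit__(self, exc_type: object, exc: object, tb: object) -> None:
        if self.root is not None:
            self.root.ended = True
            self.exported.append(self.root)


async def demo(api_key: Optional[str]) -> list[str]:
    """Run one turn and return the names of the recorded spans, in end order."""
    async with ToySession("support-call", model="example-model",
                          api_key=api_key) as session:
        await session.log_turn("Hi", "Hello, how can I help?", latency_ms=120.0)
    return [span.name for span in session.exported]
```

Calling `asyncio.run(demo(None))` exercises the keyless branch, where the same application code runs but nothing is recorded.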
🚥 Pre-merge checks | ✅ 5 passed
Review verdict: READY
All CI checks pass. Both Fix concerns resolved. All 14 ACs covered. PR is assigned and human reviewer requested. Fixes applied since
Live behavior demonstration
Confirmed:
Actionable comments posted: 2
🧹 Nitpick comments (1)
specs/realtime-live-tracing.feature (1)
67-70: ⚡ Quick win: AC6 is over-coupled to a non-production failure hook.
Line 69 hardcodes `on_end` failure, but the live path uses `BatchSpanProcessor` + OTLP exporter; this AC can pass while the real export path regresses. Consider rewording AC6 to validate exporter/BatchSpanProcessor failure containment, then align the test to that path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@specs/realtime-live-tracing.feature` around lines 67 - 70, The test for AC6 (Scenario: OTLP export failure does not propagate to the live app) is mocking the span processor's on_end method to raise RuntimeError, but this does not reflect the real production failure path which uses BatchSpanProcessor with OTLP exporter. Reword the scenario description to explicitly state it validates exporter/BatchSpanProcessor failure containment, then update the test setup to mock the actual OTLP exporter failure or BatchSpanProcessor failure instead of the on_end hook, ensuring the test validates the real export path rather than a non-production failure point.
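A toy illustration of the distinction this comment draws, assuming nothing about the real test: the failure is injected at the exporter, and the processor is what contains it. `FailingExporter`, `ContainingProcessor`, and `run_turn` are hypothetical stand-ins, not OTel classes, and the sketch is synchronous, whereas the real `BatchSpanProcessor` exports from a background thread.

```python
class FailingExporter:
    """Hypothetical exporter whose export call always fails."""

    def export(self, spans: list[str]) -> None:
        raise RuntimeError("collector unreachable")


class ContainingProcessor:
    """Toy processor that swallows exporter errors and counts them."""

    def __init__(self, exporter: FailingExporter) -> None:
        self.exporter = exporter
        self.failures = 0

    def on_end(self, span: str) -> None:
        try:
            self.exporter.export([span])
        except Exception:
            # Containment happens here, on the export path itself.
            self.failures += 1


def run_turn(processor: ContainingProcessor) -> str:
    """Simulate the app ending one turn span while the exporter is down."""
    processor.on_end("realtime_turn")
    return "app still running"
```

A test shaped this way fails if the containment around the export call is removed, which patching `on_end` directly would not detect.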
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/scenario/_tracing/live.py`:
- Around line 57-86: The _add_langwatch_exporter function registers a new
BatchSpanProcessor unconditionally each time it's called, causing duplicate
processors and resource leaks when the same provider is reused across multiple
sessions. Add an idempotent guard before the provider.add_span_processor call to
check whether a BatchSpanProcessor has already been registered on the provider
instance. If one already exists, return early without adding another. This
ensures that repeated invocations (such as from multiple __aenter__ calls in
realtime_langwatch_session) don't accumulate duplicate processors.
In `@python/tests/test_live_tracing.py`:
- Around line 33-35: The module-level import of scenario at lines 33-35 prevents
the test_ac9_importing_scenario_does_not_create_tracer_provider test from
validating fresh-import behavior because scenario is already loaded before the
reset_otel fixture runs. Remove the module-level imports of scenario and
RealtimeLangWatchSession from lines 33-35, then modify the
test_ac9_importing_scenario_does_not_create_tracer_provider test (currently at
lines 93-99) to import scenario fresh within the test body itself, either by
clearing sys.modules['scenario'] and reimporting or by using an isolated
subprocess approach similar to how test_ac1_importable_from_scenario_package
correctly tests fresh import at line 86. This ensures the test can detect if
scenario creates a TracerProvider at import time.
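The "clear `sys.modules` and reimport" option mentioned for the AC9 test can be sketched as below. `fresh_import` is a hypothetical helper name, not something in the repository, and the stdlib module `colorsys` stands in for `scenario` so the snippet runs anywhere. A real test would presumably follow the reimport with a check on the global OTel provider, and the subprocess approach remains the stricter form of isolation.

```python
import importlib
import sys
from types import ModuleType


def fresh_import(name: str) -> ModuleType:
    """Drop `name` and its submodules from sys.modules, then import it again.

    Module-level code runs a second time, so import-time side effects
    become observable to the caller.
    """
    stale = [key for key in sys.modules
             if key == name or key.startswith(name + ".")]
    for key in stale:
        del sys.modules[key]
    return importlib.import_module(name)
```

One caveat with the in-process approach: objects that other modules already imported from the old module instance keep pointing at the old instance, which is one reason a subprocess can be the safer choice.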
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: faa58f40-8c42-4824-bc1f-6a6595deddfb
📒 Files selected for processing (5)
- docs/docs/pages/voice/happy-path-openai-realtime.mdx
- python/scenario/__init__.py
- python/scenario/_tracing/live.py
- python/tests/test_live_tracing.py
- specs/realtime-live-tracing.feature
- docs: add "Getting LangWatch traces from your live app" section
- specs: add realtime-live-tracing.feature covering AC1-AC14
- new scenario/_tracing/live.py with RealtimeLangWatchSession
- export realtime_langwatch_session from scenario package
- unit tests covering AC1-AC14 in python/tests/test_live_tracing.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eturn type
- Remove self._entered flag (set but never read)
- Wire self._model as root span attribute when provided (was stored but never used; log_turn still accepts per-turn model override)
- Change __aexit__ return annotation from -> bool to -> None and drop explicit `return False` (None is falsy, same semantics, matches codebase convention)
- Remove redundant span.name assertion in test_ac2 (span already filtered by name)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8899f9e to 603c28a
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Consolidate to single import style for scenario._tracing.setup
- Move module-level `import scenario` to file scope (was local in test_ac1)
- Guard _add_langwatch_exporter against duplicate BatchSpanProcessor registration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
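One way such a guard can work is a marker attribute on the provider instance. This is a sketch under assumptions: the source does not show how the actual guard in `_add_langwatch_exporter` is written, and `FakeProvider`, `add_exporter_once`, and the `_MARKER` attribute name are hypothetical. Only the method name `add_span_processor` comes from the review text.

```python
from typing import Any, Callable

# Hypothetical marker attribute name; the real guard may differ.
_MARKER = "_langwatch_exporter_attached"


class FakeProvider:
    """Stand-in exposing the same add_span_processor method name as OTel."""

    def __init__(self) -> None:
        self.processors: list[Any] = []

    def add_span_processor(self, processor: Any) -> None:
        self.processors.append(processor)


def add_exporter_once(provider: Any, make_processor: Callable[[], Any]) -> bool:
    """Attach a processor only on the first call for a given provider.

    Returns True if a processor was added, False if the guard skipped it.
    """
    if getattr(provider, _MARKER, False):
        return False  # already attached by an earlier session
    provider.add_span_processor(make_processor())
    setattr(provider, _MARKER, True)
    return True
```

Taking a factory rather than a ready-made processor means nothing is constructed when the guard skips, so a skipped call leaves no unused processor behind.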
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
python/scenario/_tracing/live.py (2)
33-33: ⚠️ Potential issue | 🟠 Major
Add explicit parameter types to satisfy strict typing requirements.
Line 33: `provider` parameter in `_get_concrete_provider()` is untyped.
Line 235: `exc_type`, `exc`, and `tb` parameters in `__aexit__()` are untyped.
These violations conflict with coding guidelines requiring explicit type annotations for all function parameters and strict pyright compatibility.
Suggested patch
+from types import TracebackType
 from typing import Optional

 from opentelemetry import context, trace
 from opentelemetry.sdk.trace import TracerProvider
 from opentelemetry.sdk.trace.export import SpanExporter

 logger = logging.getLogger("scenario.tracing")


-def _get_concrete_provider(provider) -> Optional[TracerProvider]:
+def _get_concrete_provider(provider: object) -> Optional[TracerProvider]:
     """Return the concrete ``TracerProvider`` if one exists, else ``None``.

     Checks the provider itself and one level of delegation (the
     ``ProxyTracerProvider`` pattern OTel uses before any provider is set).

     NOTE: intentionally duplicated from ``scenario._tracing.setup`` — see
     the module docstring for why this file must not import from setup.
     """

-    async def __aexit__(self, exc_type, exc, tb) -> None:
+    async def __aexit__(
+        self,
+        exc_type: type[BaseException] | None,
+        exc: BaseException | None,
+        tb: TracebackType | None,
+    ) -> None:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/scenario/_tracing/live.py` at line 33, Add explicit type annotations to the untyped parameters in two functions to satisfy strict typing requirements. In the `_get_concrete_provider()` function, add a type annotation to the `provider` parameter (specify the appropriate type based on how it's used in the function). In the `__aexit__()` method, add type annotations to the three parameters `exc_type`, `exc`, and `tb` to properly document the exception handling context (these typically follow the standard exception handler signature with Optional types for exception values).
Source: Coding guidelines
137-150: ⚠️ Potential issue | 🟠 Major
Annotate instance attributes explicitly in `__init__` for pyright strict compliance.
`self._tracer`, `self._root_span`, and `self._ctx_token` lack explicit type annotations. Per coding guidelines, class attributes must be explicitly annotated and code must pass pyright strict mode without errors.
Suggested patch
+from opentelemetry.context.context import Context, Token
+from opentelemetry.trace import Span, Tracer
@@
         self._provider: Optional[TracerProvider] = None
         self._own_provider = False  # did WE create/install the provider?
-        self._tracer = None
-        self._root_span = None
-        self._ctx_token = None
+        self._tracer: Tracer | None = None
+        self._root_span: Span | None = None
+        self._ctx_token: Token[Context] | None = None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/scenario/_tracing/live.py` around lines 137 - 150, Add explicit type annotations to the three instance attributes that are currently initialized to None without type annotations in the __init__ method: self._tracer, self._root_span, and self._ctx_token. Each should be annotated as Optional[AppropriateType] where AppropriateType corresponds to the actual type that will be assigned to each attribute later in the lifecycle methods. This ensures the code complies with pyright strict mode, consistent with how self._provider is already annotated as Optional[TracerProvider].
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 65a9be2e-c9b9-43af-8980-7facee03bfae
📒 Files selected for processing (2)
- python/scenario/_tracing/live.py
- python/tests/test_live_tracing.py
🚧 Files skipped from review as they are similar to previous changes (1)
- python/tests/test_live_tracing.py
…_session throughout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…an assertion
exc_info=True in __aenter__ could expose the Bearer token if the OTLP exporter constructor raises an exception whose repr includes the headers dict. Switch to logging type(exc).__name__ only.
AC2 test now verifies the turn span's parent.span_id equals the root span's span_id, closing the gap where context-attachment could silently break without the test catching it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
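The logging change described in this commit can be illustrated with a stdlib-only sketch. The logger name `scenario.tracing` appears in the review's suggested patch; the function names and the message text here are hypothetical, not copied from `live.py`.

```python
import logging

logger = logging.getLogger("scenario.tracing")


def describe_failure(exc: BaseException) -> str:
    """Return a log-safe description: the exception type name only."""
    return type(exc).__name__


def log_exporter_failure(exc: BaseException) -> str:
    """Log an exporter setup failure without exposing exception arguments."""
    message = "LangWatch exporter setup failed: " + describe_failure(exc)
    # No exc_info=True here: the traceback and the exception's arguments
    # (which may carry request headers) never reach the log record.
    logger.warning(message)
    return message
```

The trade-off is less diagnostic detail in the log in exchange for never formatting the exception's arguments.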
parent_ctx and root_ctx are both Optional[SpanContext] per OTel type stubs; extract and assert-narrow each before accessing .span_id so pyright's reportOptionalMemberAccess check passes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
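The assert-narrowing pattern named in this commit looks roughly like the following. `ToySpanContext` and `same_span` are hypothetical stand-ins so the snippet runs without OTel installed; the actual test works with the SDK's span context objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToySpanContext:
    """Hypothetical stand-in carrying only the field the test compares."""

    span_id: int


def same_span(parent_ctx: Optional[ToySpanContext],
              root_ctx: Optional[ToySpanContext]) -> bool:
    """Compare span ids after narrowing both Optional values."""
    # Each assert narrows Optional[ToySpanContext] to ToySpanContext for the
    # type checker, and fails loudly at runtime if a context is missing.
    assert parent_ctx is not None
    assert root_ctx is not None
    return parent_ctx.span_id == root_ctx.span_id
```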
Automated low-risk assessment
This PR was evaluated against the repository's Low-Risk Pull Requests procedure and does not qualify as low risk.
This PR requires a manual review before merging.
Why
Production OpenAI Realtime apps have no path into LangWatch today: `scenario.run()` handles test-time tracing, but live calls fly blind. Closes #673.
What changed
- `RealtimeLangWatchSession` (exported as `realtime_langwatch_session`): async context manager that wraps a live Realtime session and records one child LLM span per `log_turn()` call, matching the per-turn shape scenario tests already produce.
- When `LANGWATCH_API_KEY` is absent, the context manager is a safe no-op (no raise, no spans), so it can ship in keyless environments without a conditional.
- If `langwatch.setup()` or `scenario.run()` already installed a `TracerProvider`, the helper attaches to it; an idempotent guard prevents duplicate `BatchSpanProcessor` registration across sequential sessions.
- 14 ACs in `specs/realtime-live-tracing.feature`, covered by 20 unit tests in `tests/test_live_tracing.py`; no live API calls needed.
How it works
`__aenter__` checks the global OTel provider: if none is installed, it builds and installs its own `TracerProvider`; if one already exists, it attaches an OTLP exporter to it. A root span is opened for the session; `log_turn()` creates a child `realtime_turn` span with `type=llm`, `input`, `output`, `model`, and `latency_ms`. `__aexit__` ends the root span and force-flushes only a provider it owns.
Test plan
How I can prove I was successful
No playable artifact — see Test plan. 20 unit tests pass covering all 14 ACs; pyright clean; all CI checks green.
Anything surprising?
The live module intentionally duplicates two small helpers from `scenario._tracing.setup` (`_get_concrete_provider`, `_add_langwatch_exporter`). The design constraint is that `scenario._tracing.setup` is the test-runner's lazy-init path, so importing it from a production context would be wrong. The duplication is documented in both files.