track user bot turn metrics #817

Open
sharifajahanshaik wants to merge 1 commit into juspay:release from sharifajahanshaik:show-metrics-for-user-bot-turn

Conversation

@sharifajahanshaik

@sharifajahanshaik sharifajahanshaik commented Jun 11, 2026

Contributor

Summary by CodeRabbit

Release Notes

  • New Features
    • Enhanced voice agent with performance metrics collection to track processing times and conversation performance data across all stages.

Copilot AI review requested due to automatic review settings June 11, 2026 16:45

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Walkthrough

A new MetricsCollectorProcessor frame processor is added to the Breeze Buddy voice agent pipeline. It aggregates per-turn Pipecat timing metrics (TTFB, processing, text aggregation) across conversational turns, is wired into both realtime and non-realtime pipeline paths, and its collected data is persisted to context.lead.metaData["pipecat_metrics"] at conversation end.

Changes

Pipecat Metrics Collection and Persistence

| Layer / File(s) | Summary |
| --- | --- |
| MetricsCollectorProcessor implementation (app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py) | New FrameProcessor subclass that deduplicates MetricsFrame events by id, extracts TTFBMetricsData, ProcessingMetricsData, and TextAggregationMetricsData into per-processor millisecond fields, commits accumulated data as a turn record on BotStoppedSpeakingFrame, and exposes the full turn list via get_metrics(). |
| Pipeline wiring (app/ai/voice/agents/breeze_buddy/agent/pipeline.py) | Imports and instantiates MetricsCollectorProcessor, inserts it into the realtime path (after the LLM) and both non-realtime paths (agent and stream modes), expands build_pipeline's return type annotation and docstring from 6 to 7 elements, and reworks get_observers to always include MetricsLogObserver while keeping LLM/transcription/turn-tracking observers gated to dev. |
| Agent unpacking and end-conversation persistence (app/ai/voice/agents/breeze_buddy/agent/__init__.py, app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py) | Agent.run unpacks the new metrics_collector from build_pipeline's expanded tuple. end_conversation conditionally calls metrics_collector.get_metrics() and stores the result in context.lead.metaData["pipecat_metrics"] before the database update. |
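To make the walkthrough concrete, here is a minimal sketch of the collect-dedupe-commit flow it describes. It does not import Pipecat: `MetricsEvent`, `TTFBData`, `ProcessingData`, `BotStoppedSpeaking`, and `TurnMetricsCollector` are hypothetical stand-ins, and the assumption that raw values arrive in seconds and are stored as milliseconds is mine, not something confirmed by the PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Hypothetical stand-ins for the Pipecat metric payloads (not the real classes).
@dataclass
class TTFBData:
    processor: str
    value: float  # assumed to be seconds


@dataclass
class ProcessingData:
    processor: str
    value: float  # assumed to be seconds


@dataclass
class MetricsEvent:
    id: int
    data: List[Any] = field(default_factory=list)


class BotStoppedSpeaking:
    """Placeholder for the event that marks the end of a bot turn."""


class TurnMetricsCollector:
    """Accumulates per-processor timings and commits them once per turn."""

    def __init__(self) -> None:
        self._turns: List[Dict[str, Any]] = []
        self._current: Dict[str, Dict[str, float]] = defaultdict(dict)
        self._seen_ids: set = set()
        self._turn_count = 1

    def handle(self, event: Any) -> None:
        if isinstance(event, BotStoppedSpeaking):
            self._commit_turn()
        elif isinstance(event, MetricsEvent):
            if event.id in self._seen_ids:
                return  # duplicate delivery of the same event; ignore it
            self._seen_ids.add(event.id)
            for item in event.data:
                if isinstance(item, TTFBData):
                    self._current[item.processor]["ttfb_ms"] = round(item.value * 1000, 2)
                elif isinstance(item, ProcessingData):
                    self._current[item.processor]["processing_ms"] = round(item.value * 1000, 2)

    def _commit_turn(self) -> None:
        if not self._current:
            return  # nothing was measured in this turn
        self._turns.append({"turn": self._turn_count, "metrics": dict(self._current)})
        self._current = defaultdict(dict)
        self._turn_count += 1

    def get_metrics(self) -> List[Dict[str, Any]]:
        return list(self._turns)
```

In this sketch the list returned by `get_metrics()` is what would be stored under `metaData["pipecat_metrics"]` at the end of the call. The real processor also has to forward every frame downstream, which is omitted here.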

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 Hop, hop, through the pipeline I go,
Collecting each tick and each timing below,
TTFB, processing, text in a row,
Committed per turn as the conversation flows,
At call's end I stash what the metrics bestow! 📊

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'track user bot turn metrics' refers to the core change of collecting per-turn timing metrics, which is accurately reflected in the file changes that add metrics collection and storage. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py (1)

76-110: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Flush the speech gate before snapshotting history.

TranscriptionGateProcessor only copies buffered user_* / tts_* values into context.messages when a later frame triggers _flush_timestamps(). If the call ends right after the last UserStoppedSpeakingFrame or BotStoppedSpeakingFrame, end_conversation() serializes history before that buffered state is materialized, so the last-turn metrics are missing from metaData["transcription"]. The EndFrame queued in finally is too late to rescue it. TemplateContext already exposes speech_gate, so flush it immediately before iterating context.context.messages.

🧩 Minimal fix shape
         if context.context:
+            if context.speech_gate is not None:
+                context.speech_gate.flush_timestamps()
             history = context.context.messages

Add a small public flush_timestamps() wrapper on TranscriptionGateProcessor rather than calling the private helper directly.
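A rough sketch of that wrapper and the flush-before-snapshot ordering, under stated assumptions: `SpeechGateSketch`, `note()`, and `snapshot_history()` are hypothetical names that model only the buffering behaviour described above, not the real TranscriptionGateProcessor or end_conversation code.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class SpeechGateSketch:
    """Hypothetical stand-in for the gate; only the timestamp buffer is modeled."""

    def __init__(self, messages: List[Dict[str, Any]]) -> None:
        self._messages = messages
        self._pending: Dict[str, str] = {}

    def note(self, key: str, when: Optional[str] = None) -> None:
        # Buffer a timestamp (e.g. "tts_end") until the next flush.
        self._pending[key] = when or datetime.now(timezone.utc).isoformat()

    def _flush_timestamps(self) -> None:
        # Expected no-op cases: nothing buffered, or no message to attach to.
        if not self._pending or not self._messages:
            return
        self._messages[-1].update(self._pending)
        self._pending = {}

    def flush_timestamps(self) -> None:
        """Public wrapper so callers need not reach into the private helper."""
        self._flush_timestamps()


def snapshot_history(
    messages: List[Dict[str, Any]], gate: Optional[SpeechGateSketch]
) -> List[Dict[str, Any]]:
    # Flush first so the last turn's buffered timestamps are in the snapshot.
    if gate is not None:
        gate.flush_timestamps()
    return [dict(m) for m in messages]
```

The point of the ordering is that the snapshot is taken only after the buffer has been written into the last message, so a call that ends immediately after the final speech event still carries its timestamps.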

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py`
around lines 76 - 110, The history snapshot misses the last-turn user/tts
timestamps because the TranscriptionGateProcessor only materializes buffered
timestamps when its private flush runs; add a public flush_timestamps() method
on TranscriptionGateProcessor that invokes the existing private
_flush_timestamps(), then call that new method on the TemplateContext's
speech_gate (e.g., context.speech_gate.flush_timestamps()) immediately
before reading context.context.messages in end_conversation (the loop that
builds transcription/filtered_transcript) so buffered
user_start/user_end/tts_start/tts_end values are materialized into
context.messages before serialization into
context.lead.metaData["transcription"].
🧹 Nitpick comments (1)
app/ai/voice/agents/breeze_buddy/processors/transcription_gate.py (1)

184-216: ⚡ Quick win

Don't swallow timestamp flush failures.

If context.messages[-1] ever stops being a mutable dict or a malformed message slips through, this except Exception: pass turns the new turn-metrics path into a silent no-op. Catch only the expected "nothing to flush" cases and log unexpected failures so missing metrics are diagnosable.

♻️ Suggested tightening
-        except Exception:
-            # If context is empty or uninitialized, do nothing safely
-            pass
+        except (AttributeError, IndexError, TypeError):
+            logger.debug(
+                "TranscriptionGate: skipping timestamp flush; no writable message"
+            )
+        except Exception:
+            logger.exception(
+                "TranscriptionGate: unexpected failure while flushing timestamps"
+            )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/processors/transcription_gate.py` around
lines 184 - 216, The _flush_timestamps method currently swallows all exceptions,
hiding real errors; change the try/except to only guard the expected "nothing to
flush" conditions (e.g., check self.context and getattr(self.context,
"messages", None) before accessing messages and bail early) and remove the broad
except Exception: pass; instead catch and handle specific exceptions (like
IndexError or AttributeError when self.context.messages is empty or not a
list/dict) and log unexpected failures via the module logger (include details of
self.context, msg and exception) so failures in manipulating msg (the last
message) or missing attributes (user_start/user_end/tts_start/tts_end) are
visible for debugging.

Source: Linters/SAST tools

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2fbc1d30-a9f5-45e2-895a-fc6f11b485d7

📥 Commits

Reviewing files that changed from the base of the PR and between d602baf and 702ddfa.

📒 Files selected for processing (3)
  • app/ai/voice/agents/breeze_buddy/agent/pipeline.py
  • app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py
  • app/ai/voice/agents/breeze_buddy/processors/transcription_gate.py

@narsimhaReddyJuspay

Contributor

PR #817 — track user/bot turn metrics

Verdict: no blocking issues — looks good. Gates all green: black/isort/autoflake/pyrefly ✅, pytest 424 passed.

What it actually does

Despite the title, this PR does not count turns or write to a metrics table — it injects four ISO timestamps (user_start, user_end, tts_start, tts_end) into the LLMContext message dicts inside TranscriptionGateProcessor, which end_conversation later copies into lead.metaData["transcription"] entries. So it's per-turn timing data for analytics, not turn counters. (If turn counts are intended, that work isn't in this PR.)
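For illustration, the copy step described above might look roughly like the following. `TIMING_KEYS` and `build_transcription()` are hypothetical names, and filtering to user/assistant roles is an assumption made for the sketch rather than something taken from end_conversation.

```python
from typing import Any, Dict, List

# The four timestamp keys named in the review comment.
TIMING_KEYS = ("user_start", "user_end", "tts_start", "tts_end")


def build_transcription(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Copy role/content plus any timing keys present on each message."""
    entries: List[Dict[str, Any]] = []
    for msg in messages:
        # Assumption: only user and assistant messages are kept.
        if msg.get("role") not in ("user", "assistant"):
            continue
        entry: Dict[str, Any] = {"role": msg["role"], "content": msg.get("content", "")}
        for key in TIMING_KEYS:
            if key in msg:  # messages without timestamps are left untouched
                entry[key] = msg[key]
        entries.append(entry)
    return entries
```

Because the keys are copied only when present, consumers that read just role and content see the same entries as before.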

Checked and cleared

  • No double/under-counting risk — there is no counter.
  • No teardown/SQL risk at end_conversation — the extra keys are just copied via key in msg; drain_voice_into_chat_session, summarize_transcription, etc. read only role/content and ignore them.
  • The mute/keyword suppression logic is preserved (separate top-level branch, same fall-through) — no turn-detection regression.
  • 032_create_chat_turn_metrics.sql is unrelated (chat-mode UI latency, keyed by session_id/idx) and is not written by this PR.

🟨 Minor (not blocking)

  • _seen_frame_ids.clear() wipes all ids at >1000 (coarse reset, not LRU) — low impact, but a bounded FIFO would be more robust on the hot frame path.
  • now = self._now_iso() is computed on every frame but only consumed in a few branches — compute it lazily where needed.
  • except Exception: pass in _flush_timestamps silently hides real bugs — log at debug.
  • Timestamps attach to context.messages[-1] only, so the final user turn (no bot reply) may never get user_start/user_end flushed — consider also flushing on EndFrame.
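As a sketch of the bounded-FIFO idea from the first bullet (not the code in this PR), an `OrderedDict` can evict the oldest id one at a time instead of clearing everything at a threshold. `BoundedSeenSet` and `add_if_new()` are hypothetical names, and the default of 1000 simply mirrors the threshold mentioned above.

```python
from collections import OrderedDict
from typing import Hashable


class BoundedSeenSet:
    """Remembers at most max_size ids, evicting the oldest first (FIFO)."""

    def __init__(self, max_size: int = 1000) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._items: "OrderedDict[Hashable, None]" = OrderedDict()

    def add_if_new(self, item: Hashable) -> bool:
        """Return True if the id was unseen and is now recorded, else False."""
        if item in self._items:
            return False
        self._items[item] = None
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)  # drop the oldest id only
        return True

    def __len__(self) -> int:
        return len(self._items)
```

Compared with a wholesale `clear()`, recently seen ids stay deduplicated across the size boundary, at the cost of one extra dict operation per new id.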

If the intent really is turn counts (the title suggests so), flag that this PR delivers timestamps only.

@sharifajahanshaik sharifajahanshaik force-pushed the show-metrics-for-user-bot-turn branch from 702ddfa to 4db51e6 Compare June 15, 2026 17:20
@sharifajahanshaik

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ffa358b0-363c-4211-a5a9-d261a68eec42

📥 Commits

Reviewing files that changed from the base of the PR and between 702ddfa and 4db51e6.

📒 Files selected for processing (4)
  • app/ai/voice/agents/breeze_buddy/agent/__init__.py
  • app/ai/voice/agents/breeze_buddy/agent/pipeline.py
  • app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py
  • app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py

Comment on lines +221 to +225
        # Store Pipecat metrics collected during the call
        if hasattr(context.bot, "metrics_collector") and context.bot.metrics_collector:
            context.lead.metaData["pipecat_metrics"] = (
                context.bot.metrics_collector.get_metrics()
            )

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Isolate metrics write failures from call finalization path.

If get_metrics() throws here, the outer try aborts DB update and callback execution for end-of-call finalization. This new persistence step should be best-effort.

Proposed fix
         # Store Pipecat metrics collected during the call
         if hasattr(context.bot, "metrics_collector") and context.bot.metrics_collector:
-            context.lead.metaData["pipecat_metrics"] = (
-                context.bot.metrics_collector.get_metrics()
-            )
+            try:
+                context.lead.metaData["pipecat_metrics"] = (
+                    context.bot.metrics_collector.get_metrics()
+                )
+            except Exception as metrics_error:
+                logger.warning(
+                    f"Failed to collect pipecat metrics for call {context.call_sid}: {metrics_error}"
+                )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py`
around lines 221 - 225, The metrics collection block starting with the hasattr
check on context.bot.metrics_collector is not isolated from the outer try block,
so if get_metrics() throws an exception, it will abort the entire DB update and
callback execution in the finalization path. Wrap the metrics collection logic
(the entire if block containing the hasattr check and the assignment to
context.lead.metaData) in its own try-except handler that catches any exceptions
from get_metrics(), logs the error, and continues execution. This makes the
metrics persistence best-effort and prevents it from disrupting the critical
call finalization steps.

Comment on lines +14 to +21
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._turns: list[Dict[str, Any]] = []
        self._current_turn_metrics = defaultdict(dict)
        self._frames_seen = set()
        self._turn_count = 1

    async def process_frame(self, frame: Frame, direction: FrameDirection):

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path

p = Path("app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py")
tree = ast.parse(p.read_text())

for node in ast.walk(tree):
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
        if node.returns is None:
            print(f"Missing return annotation: Line {node.lineno} -> {node.name}")
        if node.name == "__init__":
            for arg in node.args.kwarg,:
                if arg is not None and arg.annotation is None:
                    print(f"Missing annotation for **{arg.arg}: Line {arg.lineno}")
PY

Repository: juspay/clairvoyance

Length of output: 254


🏁 Script executed:

cat -n app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py

Repository: juspay/clairvoyance

Length of output: 2907


Add missing type annotations to method signatures.

Methods in this processor are missing return type annotations. Add -> None to __init__ (line 14), process_frame (line 21), and _commit_turn (line 46). Also annotate **kwargs on line 14 as **kwargs: Any.

Per coding guidelines: "Add type hints on all function signatures."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py`
around lines 14 - 21, Add missing type annotations to three methods in the
metrics_collector_processor class. In the `__init__` method, annotate the
`**kwargs` parameter as `**kwargs: Any` and add `-> None` return type
annotation. In the `process_frame` method, add the `-> None` return type
annotation. In the `_commit_turn` method, add the `-> None` return type
annotation. These changes align with the coding guidelines requiring type hints
on all function signatures.

Source: Coding guidelines

Comment on lines +18 to +19
        self._frames_seen = set()
        self._turn_count = 1

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bound dedupe state to avoid unbounded memory growth.

Line 18 + Lines 27-29 keep every processed frame id forever. On long-running calls this grows without limit and can degrade worker memory.

Proposed fix
-from collections import defaultdict
+from collections import defaultdict, deque
@@
 class MetricsCollectorProcessor(FrameProcessor):
     def __init__(self, **kwargs):
         super().__init__(**kwargs)
         self._turns: list[Dict[str, Any]] = []
         self._current_turn_metrics = defaultdict(dict)
-        self._frames_seen = set()
+        self._frames_seen: set[Any] = set()
+        self._frames_seen_order: deque[Any] = deque()
+        self._max_seen_frames = 5000
         self._turn_count = 1
@@
         elif isinstance(frame, MetricsFrame):
-            if frame.id not in self._frames_seen:
-                self._frames_seen.add(frame.id)
+            if frame.id not in self._frames_seen:
+                self._frames_seen.add(frame.id)
+                self._frames_seen_order.append(frame.id)
+                if len(self._frames_seen_order) > self._max_seen_frames:
+                    oldest = self._frames_seen_order.popleft()
+                    self._frames_seen.discard(oldest)
                 for data in frame.data:

Also applies to: 27-29

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/processors/metrics_collector_processor.py`
around lines 18 - 19, The self._frames_seen set in the
metrics_collector_processor initialization is storing frame IDs indefinitely
without any bounds, causing unbounded memory growth on long-running calls.
Replace the unbounded set with a bounded data structure that enforces a maximum
size limit (such as an LRU cache, a deque with maxlen, or a fixed-size circular
buffer). Ensure this bounded structure is used consistently wherever frame
deduplication occurs in the processor to prevent memory degradation over time
during extended voice agent calls.

3 participants