feat: add real-time observer system for voicemail/hallucination detec…#828

Open
Dev-Bhumika03 wants to merge 1 commit into
juspay:release from
Dev-Bhumika03:BZ-3716-side-llm-observers-support
Conversation

@Dev-Bhumika03 Dev-Bhumika03 commented Jun 14, 2026

…tion

  • Add ObserverConfig type to template configurations (reuses LLMConfiguration, FlowAction)
  • Add observers package: RealtimeObserver, ObserverManager, factory
  • Wire observer lifecycle in agent/__init__.py (on_user_turn_started, on_function_calls_started)
  • Observers read from LLMContext, run in parallel via asyncio.gather, first-writer-wins
  • Uses existing get_llm_service(pooled=True) and Pipecat run_inference() — no custom HTTP clients
  • Template-configurable: add any detection by writing a system_prompt, zero code changes
  • Tested with real voicemail calls — observer detects and sets outcome=VOICEMAIL
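
To make the "template-configurable" point concrete, here is a minimal sketch of what one observer entry might look like. The field names follow those listed in the CodeRabbit summary (name, system_prompt, start_after_turn, outcome, llm, action). Everything else is an assumption for illustration: the dataclass stand-in (the PR uses a Pydantic model), the default values, the prompt wording, the action shape, and the `is_eligible` helper with its `>=` comparison.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ObserverConfigSketch:
    """Hypothetical stand-in for the PR's Pydantic ObserverConfig."""

    name: str
    system_prompt: str
    start_after_turn: int = 1  # assumed default, not taken from the PR
    outcome: Optional[str] = None
    llm: Optional[Dict[str, Any]] = None  # optional per-observer LLM override
    # Assumed action shape; the PR reuses its own FlowAction type here.
    action: Dict[str, Any] = field(
        default_factory=lambda: {"type": "end_conversation"}
    )


# A voicemail detector defined purely by configuration (illustrative prompt).
voicemail = ObserverConfigSketch(
    name="voicemail_detector",
    system_prompt=(
        "You monitor a phone call transcript. Reply with JSON "
        '{"detected": bool, "reason": str}. Set detected=true only if the '
        "callee is a voicemail greeting or an answering machine."
    ),
    start_after_turn=1,
    outcome="VOICEMAIL",
)


def is_eligible(config: ObserverConfigSketch, turn_count: int) -> bool:
    """Assumed gating rule: run only once enough turns have been counted."""
    return turn_count >= config.start_after_turn
```

Under this sketch, adding a hallucination detector would mean adding a second entry with a different `system_prompt` and `outcome`, with no new code.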
Screenshot 2026-06-14 at 9 02 59 PM - Screenshot 2026-06-15 at 12 37 00 PM

Summary by CodeRabbit

  • New Features
    • Added real-time observers that monitor conversation transcripts during calls
    • Observers can automatically detect issues and trigger actions (e.g., ending conversations)
    • Support for multiple concurrent observers with customizable detection rules
    • Configurable observer activation timing, detection instructions, and outcome handling

Copilot AI review requested due to automatic review settings June 14, 2026 15:32
coderabbitai Bot commented Jun 14, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7829b7ee-44ce-46c3-93b0-d2dec37e089a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Introduces a real-time observer subsystem for the breeze_buddy agent. A new ObserverConfig Pydantic model and observers field on ConfigurationModel define per-observer settings. RealtimeObserver runs a side-LLM inference call to detect conversation problems and execute actions. ObserverManager coordinates multiple observers across turns. The factory constructs observers with merged LLM configs, and the Agent lifecycle is extended to initialize, notify, and stop the observer manager.

Changes

Real-time observer subsystem

| Layer / File(s) | Summary |
|---|---|
| ObserverConfig schema and ConfigurationModel field (app/ai/voice/agents/breeze_buddy/template/types.py) | Adds the ObserverConfig Pydantic model (name, system_prompt, start_after_turn, outcome, llm override, action) and an optional observers list field on ConfigurationModel. |
| RealtimeObserver detection and action execution (app/ai/voice/agents/breeze_buddy/observers/observer.py) | Implements RealtimeObserver with check() calling run_inference(), parsing a JSON {detected, reason} response, and execute_action() routing end_conversation or generic function handlers through handler_map. |
| Observer factory and package exports (app/ai/voice/agents/breeze_buddy/observers/factory.py, app/ai/voice/agents/breeze_buddy/observers/__init__.py) | Adds merge_llm_config (defaults: gpt-4o-mini, temperature 0.1, max_tokens=100) and async build_observers to create pooled LLM services and instantiate observers; re-exports all three public symbols. |
| ObserverManager coordination and transcript building (app/ai/voice/agents/breeze_buddy/observers/manager.py) | Implements ObserverManager with turn counting, function-call recording, asyncio.Lock-serialized _run_checks(), concurrent asyncio.gather observer checks, first-trigger action dispatch, transcript assembly from LLMContext messages, and stop() cleanup. |
| Agent lifecycle wiring (app/ai/voice/agents/breeze_buddy/agent/__init__.py) | Adds _observer_manager field, wires on_turn_completed and on_function_call into LLM event handlers, constructs observers in run() from self.configurations.observers (non-stream only), and stops/clears the manager in the finally block. |
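
The summary above says check() parses a JSON {detected, reason} reply from the side LLM. As a rough sketch of how such parsing could be made defensive, the helper below treats anything malformed as "not detected". The function name `parse_detection`, the fence-stripping step, and the strict boolean check are assumptions for illustration, not the PR's code.

```python
import json
from typing import Tuple

# Built at runtime so this example contains no literal fence marker.
FENCE = "`" * 3


def parse_detection(raw: str) -> Tuple[bool, str]:
    """Parse a reply shaped like {"detected": bool, "reason": str}.

    Malformed replies return (False, "") so that a bad model output can
    never trigger an action on a live call.
    """
    text = raw.strip()
    # Some models wrap JSON in a Markdown fence; strip it if present.
    if text.startswith(FENCE):
        text = text.strip("`")
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False, ""
    if not isinstance(data, dict):
        return False, ""
    # Require a real boolean: the string "false" must not count as truthy.
    detected = data.get("detected") is True
    reason = str(data.get("reason", ""))
    return detected, reason
```

Failing closed here matches the review's observation that the worst case for a broken observer should be silent non-detection rather than a wrongly ended call.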

Sequence Diagram

sequenceDiagram
  participant Agent
  participant build_observers
  participant ObserverManager
  participant RealtimeObserver
  participant SideLLM

  rect rgba(100, 149, 237, 0.5)
    note over Agent,build_observers: Initialization (run())
    Agent->>build_observers: configs, template, agent_context, handler_map
    build_observers->>build_observers: merge_llm_config per observer
    build_observers->>SideLLM: get_llm_service(pooled=True)
    build_observers-->>Agent: List[RealtimeObserver]
    Agent->>ObserverManager: __init__(observers, llm_context)
  end

  rect rgba(144, 238, 144, 0.5)
    note over Agent,RealtimeObserver: Per-turn and function-call evaluation
    Agent->>ObserverManager: on_turn_completed()
    ObserverManager->>ObserverManager: _run_checks() [asyncio task]
    ObserverManager->>RealtimeObserver: check(transcript) [gather]
    RealtimeObserver->>SideLLM: run_inference(system_prompt, transcript)
    SideLLM-->>RealtimeObserver: {detected: true, reason: "..."}
    RealtimeObserver-->>ObserverManager: True
    ObserverManager->>RealtimeObserver: execute_action()
    RealtimeObserver->>Agent: handler_map["end_conversation"]()
  end

  rect rgba(255, 160, 122, 0.5)
    note over Agent,ObserverManager: Shutdown
    Agent->>ObserverManager: stop()
    Agent->>Agent: _observer_manager = None
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A watchful ear upon each call,
A side-LLM listening through it all,
Turn by turn the transcript grows,
The observer checks — detection glows,
"End conversation!" the rabbit knows.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: introducing a real-time observer system for detecting voicemail and hallucination events, which aligns with the core additions and objectives of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Dev-Bhumika03 force-pushed the BZ-3716-side-llm-observers-support branch from e32b969 to b907b75 on June 14, 2026 15:35

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (2)
app/ai/voice/agents/breeze_buddy/observers/observer.py (1)

30-36: ⚡ Quick win

Add explicit return type hints on new method signatures.

__init__ and execute_action are missing return annotations.

Proposed fix
     def __init__(
         self,
         config: ObserverConfig,
         llm_service: Any,
         agent_context: Any,
         handler_map: Dict[str, Any],
-    ):
+    ) -> None:
@@
-    async def execute_action(self):
+    async def execute_action(self) -> None:

As per coding guidelines, "Add type hints on all function signatures."

Also applies to: 89-90

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/observers/observer.py` around lines 30 - 36,
Add explicit return type hints to the method signatures in the Observer class.
The `__init__` method should include a `-> None` return type annotation.
Additionally, the `execute_action` method (which the comment also references)
needs an appropriate return type hint added based on what it returns. These
return type annotations are required by the coding guidelines to ensure all
function signatures have complete type hints for better code clarity and type
safety.

Source: Coding guidelines

app/ai/voice/agents/breeze_buddy/observers/manager.py (1)

45-52: ⚡ Quick win

Add return type annotations on new manager methods.

New methods are missing explicit return types (-> None).

Proposed fix
-    def on_turn_completed(self):
+    def on_turn_completed(self) -> None:
@@
-    def on_function_call(self, function_name: str, arguments: Any):
+    def on_function_call(self, function_name: str, arguments: Any) -> None:
@@
-    async def _run_checks(self):
+    async def _run_checks(self) -> None:
@@
-    async def stop(self):
+    async def stop(self) -> None:

As per coding guidelines, "Add type hints on all function signatures."

Also applies to: 65-67, 130-131

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/observers/manager.py` around lines 45 - 52,
Add explicit return type annotations to new methods that are missing them. The
method on_turn_completed and other methods (at lines 65-67 and 130-131 in the
manager.py file) do not include return type hints. Add `-> None` return type
annotation to each method signature that lacks one to comply with the coding
guideline requiring type hints on all function signatures. Ensure all methods in
the manager.py file have explicit return types declared.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/ai/voice/agents/breeze_buddy/observers/factory.py`:
- Around line 25-34: In the merge_llm_config() function's no-override branch
(when override is None), add the missing region parameter to the
LLMConfiguration constructor call by including region=base.region alongside the
other parameters like provider, sdk, model, endpoint, api_key_name, temperature,
and max_tokens to ensure region information is preserved and consistent with the
override branch behavior.

In `@app/ai/voice/agents/breeze_buddy/observers/manager.py`:
- Around line 85-99: The current implementation using asyncio.gather() waits for
all observer checks to complete, then selects the first True result based on the
eligible list's config order, not the actual completion order. To implement true
first-writer-wins behavior where the observer that completes its check first
wins, replace the asyncio.gather() call with asyncio.as_completed() or
asyncio.wait() with return_when=asyncio.FIRST_COMPLETED. This will allow you to
process results as they complete rather than after all checks finish, ensuring
the observer that actually detects first (not the one earliest in the config)
gets to execute its action.

In `@app/ai/voice/agents/breeze_buddy/observers/observer.py`:
- Around line 72-75: The logger.info call in the observer.py file is logging raw
LLM-generated reason text from the model output, which can expose sensitive user
details and PII in application logs. Remove or replace the reason field logging
with a generic, non-sensitive message that does not include the actual
LLM-generated reason text. Ensure the log entry still conveys that the observer
was triggered but without exposing any sensitive details from the model output.

In `@app/ai/voice/agents/breeze_buddy/template/types.py`:
- Around line 1535-1563: Black code formatting has not been applied to several
files, causing CI build failures. Apply Black formatter with line-length=88
configuration to the following files at their specified line ranges:
app/ai/voice/agents/breeze_buddy/template/types.py (lines 1535-1563),
app/ai/voice/agents/breeze_buddy/observers/factory.py (lines 36-50),
app/ai/voice/agents/breeze_buddy/observers/manager.py (lines 52-59), and
app/ai/voice/agents/breeze_buddy/agent/__init__.py (lines 1127-1132). Run the
Black formatter on these files to reformat the code according to the
repository's coding guidelines, then commit the reformatted changes.
- Around line 1527-1563: The ObserverConfig.action field currently accepts any
FlowAction type, but RealtimeObserver.execute_action() only supports
end_conversation and function actions, causing invalid combinations to pass
validation and silently no-op at runtime. Constrain the action field in
ObserverConfig to only accept the action variants that are actually supported by
the runtime (end_conversation and function) by either creating a narrower type
definition or using a Union of only the supported action types, so that invalid
action types are rejected during validation rather than failing silently at
runtime.
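
The manager.py comment above contrasts config-order selection after asyncio.gather() with true completion-order selection. A minimal sketch of the latter using asyncio.as_completed follows. The function name `first_detection` and the `(name, check)` pair shape are hypothetical; the PR's observers expose check(transcript) rather than zero-argument callables.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple

# Hypothetical shape: a zero-argument coroutine function returning a verdict.
Check = Callable[[], Awaitable[bool]]


async def first_detection(checks: List[Tuple[str, Check]]) -> Optional[str]:
    """Return the name of the first check to finish with True, else None."""

    async def run(name: str, check: Check) -> Tuple[str, bool]:
        try:
            return name, await check()
        except Exception:
            # A failing observer counts as "not detected".
            return name, False

    tasks = [asyncio.create_task(run(name, check)) for name, check in checks]
    try:
        # as_completed yields awaitables in completion order, not list order.
        for finished in asyncio.as_completed(tasks):
            name, detected = await finished
            if detected:
                return name
        return None
    finally:
        # Once a winner is known, cancel whatever is still running.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

One trade-off to weigh: completion order makes the winner depend on LLM latency, so two observers that both detect may alternate between calls, whereas config order is deterministic.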

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d2b4ccbb-2dac-4320-9314-4a8378557e07

📥 Commits

Reviewing files that changed from the base of the PR and between e003517 and e32b969.

📒 Files selected for processing (6)
  • app/ai/voice/agents/breeze_buddy/agent/__init__.py
  • app/ai/voice/agents/breeze_buddy/observers/__init__.py
  • app/ai/voice/agents/breeze_buddy/observers/factory.py
  • app/ai/voice/agents/breeze_buddy/observers/manager.py
  • app/ai/voice/agents/breeze_buddy/observers/observer.py
  • app/ai/voice/agents/breeze_buddy/template/types.py

Comment thread app/ai/voice/agents/breeze_buddy/observers/factory.py
Comment thread app/ai/voice/agents/breeze_buddy/observers/manager.py
Comment thread app/ai/voice/agents/breeze_buddy/observers/observer.py Outdated
Comment thread app/ai/voice/agents/breeze_buddy/template/types.py
Comment thread app/ai/voice/agents/breeze_buddy/template/types.py
Dev-Bhumika03 force-pushed the BZ-3716-side-llm-observers-support branch from b907b75 to 1b87a60 on June 14, 2026 17:50
murdore commented Jun 15, 2026

@coderabbitai full review

coderabbitai Bot commented Jun 15, 2026

✅ Action performed

Full review finished.

Comment thread app/ai/voice/agents/breeze_buddy/agent/__init__.py Outdated
Comment thread app/ai/voice/agents/breeze_buddy/observers/manager.py
Comment thread app/ai/voice/agents/breeze_buddy/observers/observer.py Outdated
Dev-Bhumika03 force-pushed the BZ-3716-side-llm-observers-support branch 7 times, most recently from c468bb7 to 812ae66 on June 15, 2026 12:19
Comment thread app/ai/voice/agents/breeze_buddy/observers/manager.py Outdated
Comment thread app/ai/voice/agents/breeze_buddy/observers/observer.py Outdated
Comment thread app/ai/voice/agents/breeze_buddy/observers/observer.py
@narsimhaReddyJuspay

Contributor

PR #828 — real-time observer system (voicemail / hallucination detection)

Verdict: request changes — 3 majors. Gates all green: black/isort/autoflake/pyrefly ✅, pytest 424 passed. The package is well-structured and the design (side-LLM per observer, run off the sync frame path) is sound, but the three issues below will make the feature silently unreliable or risky in production.

🟥 Major (inline comments posted)

  1. create_task with no reference → silent GC of in-flight observer checks (worse: PipelineRunner uses force_gc=True). (observers/manager.py:50)
  2. Cross-provider silent no-op: check() uses _client.chat.completions.create, which only exists on OpenAI/Azure; merge_llm_config inherits provider from the template base, so Gemini/Vertex/Claude templates raise → swallowed → observer never detects. (observers/observer.py:86)
  3. Single-shot action on a live call: the first detected=True (no consecutive-threshold/cooldown) runs execute_action → end_conversation / warm_transfer. One noisy LLM judgement can hang up on or transfer a real customer. (observers/observer.py:146)
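
One way to address the single-shot concern in item 3 is to require several consecutive positive checks before acting. The sketch below is a hypothetical helper, not code from the PR: the class name `DetectionDebouncer` and the default threshold of 2 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DetectionDebouncer:
    """Fire only after `threshold` consecutive positive checks (hypothetical)."""

    threshold: int = 2  # assumed default; tune per observer
    _streak: int = 0
    _fired: bool = False

    def update(self, detected: bool) -> bool:
        """Record one check result; return True exactly once, when to act."""
        if self._fired:
            return False
        # A negative check resets the streak, so isolated noise never fires.
        self._streak = self._streak + 1 if detected else 0
        if self._streak >= self.threshold:
            self._fired = True
            return True
        return False
```

With a threshold of 2, the sequence True, False, True, True fires only on the fourth check. The cost is added latency: a real voicemail is caught one check later than with single-shot detection.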

🟨 Minor (not blocking)

  • _turn_count += 1 / _function_calls.append(...) mutate shared state outside _check_lock (race with the locked read in _build_transcript).
  • _build_transcript stringifies content that may be a list of parts (multimodal/tool) and drops falsy [].
  • _function_calls grows unbounded for the call duration; stop() doesn't cancel pending _run_checks tasks.
  • on_turn_completed() is wired to on_user_turn_started — fires on user speech-onset, so start_after_turn is effectively "after N user-turn-starts" (off-by-one vs the intuitive "after N completed turns"); worth documenting or renaming.
  • Fail-open (all observers fail to build → [], manager skipped) has no metric distinguishing "none configured" from "all failed to build" — a broken rollout is invisible.
  • Azure path passes the deployment name as model=, so it only works if a deployment literally named gpt-4o-mini exists.
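
Major item 1 and the stop() minor above both concern the lifetime of background check tasks. A minimal sketch of one common pattern follows: hold a strong reference to each task until it finishes, and cancel the leftovers on stop. The class name `CheckScheduler` and its constructor argument are hypothetical; only asyncio.create_task, add_done_callback, cancel, and gather are real standard-library calls.

```python
import asyncio
from typing import Awaitable, Callable, Set


class CheckScheduler:
    """Hypothetical helper that owns background check tasks."""

    def __init__(self, run_checks: Callable[[], Awaitable[None]]) -> None:
        self._run_checks = run_checks
        self._tasks: Set[asyncio.Task] = set()

    def schedule(self) -> None:
        """Start a check without blocking; must be called inside the loop."""
        task = asyncio.create_task(self._run_checks())
        # The event loop keeps only weak references to tasks, so hold a
        # strong one until the task finishes.
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def stop(self) -> None:
        """Cancel checks still in flight and wait for them to unwind."""
        pending = list(self._tasks)
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
```

Because finished tasks remove themselves from the set, the set stays bounded by the number of checks in flight rather than growing for the duration of the call.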

✅ What's good

Observers are dispatched via create_task from event callbacks — they do not run on the synchronous STT/TTS/LLM frame path, so a failing observer can't stall audio or drop frames (worst case: silent non-detection). Stream/chat mode is correctly excluded (if observers_config and not is_stream). _action_taken/_detected correctly prevent double-execution. execute_action uses the existing handler_map (so with_context injection resolves the single-positional call).

Dev-Bhumika03 force-pushed the BZ-3716-side-llm-observers-support branch from 812ae66 to 441321b on June 15, 2026 16:23
Dev-Bhumika03 force-pushed the BZ-3716-side-llm-observers-support branch from 441321b to f15f800 on June 16, 2026 09:20
5 participants