@vipyne vipyne commented Dec 18, 2025

  • better handling of actual TTS output
    (so turn output better reflects what was "heard", and TTS output shows the entire TTS response)
  • try using TOOL spans for STT and TTS
    (not sure how I feel about this, but I do like that llm.token_count.total: 0 is no longer present)

Note

Switches STT/TTS to TOOL spans, adds BaseOutputTransport TTS handling, refines text accumulation/lifecycle, and records STT timing/minutes.

  • Observer (_observer.py):
    • Handle TTSTextFrame from BaseOutputTransport to populate turn output; keep TTS span accumulation only when source is TTSService.
    • Start/track spans by detected service_type and defer finishing STT/TTS spans to turn end (LLM still ends on LLMFullResponseEndFrame).
    • For STT: set stt.time_to_last_transcription_seconds on final transcription and record stt.minutes for the whole turn.
    • Maintain inter-frame spacing flags and accumulate output/input accordingly.
  • Attributes (_attributes.py):
    • Map BaseOutputTransport to "tts" in SERVICE_TYPE_MAP.
    • Change STT/TTS OPENINFERENCE_SPAN_KIND to TOOL and use service.model (remove LLM-specific model/provider attrs for these services).
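
As a rough illustration of the two STT attributes named in the summary, the sketch below derives them from three timestamps. The helper `stt_timing_attributes` and its parameters are hypothetical, and measuring both values from the start of the turn is an assumption; the observer in this PR may use different reference points.

```python
def stt_timing_attributes(turn_start: float, last_transcription_at: float, turn_end: float) -> dict:
    """Build STT span attributes from timestamps given in seconds.

    Hypothetical helper: only the attribute names come from the PR summary.
    """
    return {
        # Seconds from the start of the turn to the final transcription (assumed reference point).
        "stt.time_to_last_transcription_seconds": last_transcription_at - turn_start,
        # Duration of the whole turn, expressed in minutes.
        "stt.minutes": (turn_end - turn_start) / 60.0,
    }


attrs = stt_timing_attributes(turn_start=10.0, last_transcription_at=12.5, turn_end=40.0)
```

Recording `stt.minutes` once per turn, rather than per transcription frame, matches the summary's note that STT spans are finished at turn end.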

Written by Cursor Bugbot for commit ed37ee3. This will update automatically on new commits.

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Dec 18, 2025
@vipyne vipyne force-pushed the vp-incorporate-pipecat-improvements branch from 4786575 to ed37ee3 on December 18, 2025 20:31
vipyne commented Dec 18, 2025

cc @duncankmckinnon

service_id = id(service)
frame = data.frame
service_type = detect_service_type(service)
service_id = id(service_type)
Bug: Service ID incorrectly uses type string instead of instance

The service_id computation changed from id(service) to id(service_type). Because service_type is a short string ("stt", "tts", or "llm") and CPython interns such strings, id(service_type) is the same for every service of a given type, so all of them now share one service_id. This causes span-tracking collisions in _active_spans. In addition, because SERVICE_TYPE_MAP maps BaseOutputTransport to "tts", TTSService and BaseOutputTransport instances get identical service IDs, and their spans interfere with each other.
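
The collision can be reproduced outside the observer. The sketch below uses hypothetical stand-in classes (`FakeTTSService`, `FakeOutputTransport`) and a simplified `detect_service_type` and `SERVICE_TYPE_MAP`; none of these are the real pipecat or instrumentation code. It only illustrates why keying on `id(service_type)` merges distinct instances, while keying on `id(service)` keeps them apart.

```python
# Hypothetical stand-ins; not the real pipecat classes.
class FakeTTSService:
    pass


class FakeOutputTransport:
    pass


# Simplified assumption of the type map: both classes resolve to "tts".
SERVICE_TYPE_MAP = {
    FakeTTSService: "tts",
    FakeOutputTransport: "tts",
}


def detect_service_type(service):
    """Return the type string for a service, or "unknown" if unmapped."""
    return SERVICE_TYPE_MAP.get(type(service), "unknown")


tts = FakeTTSService()
transport = FakeOutputTransport()

# Keyed on the type string: both services yield the same string object,
# so id() returns the same value for both.
key_by_type_a = id(detect_service_type(tts))
key_by_type_b = id(detect_service_type(transport))

# Keyed on the instance: two live objects never share an id.
key_by_instance_a = id(tts)
key_by_instance_b = id(transport)

# With type-based keys, the second span overwrites the first.
active_spans_by_type = {key_by_type_a: "tts span"}
active_spans_by_type[key_by_type_b] = "transport span"

# With instance-based keys, both spans are tracked separately.
active_spans_by_instance = {
    key_by_instance_a: "tts span",
    key_by_instance_b: "transport span",
}
```

Restoring `id(service)` as the key, and keeping `service_type` only for choosing span kind and attributes, would avoid the overwrite shown here.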
