fix: make speaker and confidence optional in TranscriptUtterance#12
fix: make speaker and confidence optional in TranscriptUtterance#12Odrec wants to merge 2 commits intonamastexlabs:mainfrom
Conversation
- server.py: use result.get('confidence') instead of result['confidence']
to prevent KeyError when word-level data is unavailable
- models.py: make TranscriptUtterance.speaker and .confidence optional
since format_result() only includes them conditionally (speaker requires
diarization, confidence requires word-level data)
- test_models.py: add tests for utterances without speaker/confidence
Summary of ChangesHello @Odrec, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses runtime errors caused by a mismatch between the Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly addresses two runtime errors by making the speaker and confidence fields optional in the TranscriptUtterance model and safely accessing the confidence value. The changes are logical and include corresponding test cases. I've suggested a minor improvement to one of the new tests to make the test suite more robust and clear by ensuring test cases are orthogonal.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
fix: make speaker and confidence optional in TranscriptUtterance
Problem
Two runtime errors occur when transcribing without speaker diarization or when word-level timestamps are unavailable:
KeyError: 'confidence'inserver.py:190—result["confidence"]crashes becauseformat_result()only includes theconfidencekey when word-level data is available (line 534 intranscriber.py)ResponseValidationErrorfrom Pydantic —TranscriptUtterancedeclaresspeaker: strandconfidence: floatas required fields, butformat_result()only adds them conditionally:speakeris only included when diarization is enabled and a speaker is detected (lines 503-504)confidenceis only included when word-level data exists (lines 507-510)Root Cause
The Pydantic model schema doesn't match the actual data produced by the transcription pipeline. When diarization is disabled, segments have no
speakerfield. Whenword_timestamps=false, segments have no word-level confidence scores.Fix
server.py: Useresult.get("confidence")instead ofresult["confidence"]to safely handle missing confidence data.models.py: Make both fields optional inTranscriptUtterance:speaker: str→speaker: str | None = Noneconfidence: float→confidence: float | None = Nonetest_models.py: Added 3 test cases:test_utterance_without_speaker— no diarizationtest_utterance_without_confidence— no word-level datatest_utterance_without_speaker_and_confidence— minimal fields onlyTesting
These fixes have been validated end-to-end on a GPU server with both diarization-enabled and diarization-disabled transcription requests.