feat(#666): per-role voice modality negotiation — declaration-first, two-phase validation, OTEL stamps #670
Merged
22 commits
e275957
feat(#666): resolver core — per-role modality resolution with declara…
8ba9995
Merge Bundle 1: resolver core (AC4a, AC4b)
a29822f
feat(#666): conditional audio strip in simulator — AC1, AC2
8e4976b
feat(#666): replace judge substring audio detection with modality res…
60a350e
feat(#666): two-phase modality validation — AC6, AC7, AC8a
0ae04f5
Merge Bundle 2: simulator wiring (AC1, AC2)
b641a3f
Merge Bundle 3: judge wiring (AC3a, AC3b, AC3c, AC9)
51d5994
Merge Bundle 4: two-phase validation (AC6, AC7, AC8a, AC8b)
2f93128
feat(#666): public modality= parameter on simulator and judge — AC0
ca3eca3
feat(#666): stamp resolved modality/tier per role as OTEL span attrib…
d7489b4
Merge Bundle 5: OTEL modality stamps (AC5, AC5b)
9634b69
Merge Bundle 6: public modality= parameter (AC0)
b243d7d
test(#666): verify capability matrix byte-identical when no capabilit…
e82baf8
fix(#666): emit resolve_modality warnings in executor — sweep must-fix
ae52191
chore(#666): remove unused mock imports flagged by code-quality bot
dd7bacc
fix(#666): add @unit tag to untagged feature scenario — fix pre-exist…
02eca46
fix(#666): suppress pre-existing pyright type errors in simulator tests
30aeb1f
fix(#666): narrow AC7 exception catch to PendingTransportError only
1447c52
fix(#666): address review — remove duplicate .resolved span attr, str…
a3a9f34
fix(#666): stamp scenario.modality.<role>.resolved + transcribe_segme…
0c84c38
fix(#666): fix pyright error in spy tests — cast audio message + remo…
f85c8be
fix(#666): rename unused warnings → _warnings to satisfy Ruff RUF059
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| """Per-role voice modality resolution. | ||
| | ||
| Declaration-first: explicit per-role modality beats litellm advisory. | ||
| Advisory is used as a hint only; mismatch emits a WARNING. | ||
| """ | ||
| from __future__ import annotations | ||
| import logging | ||
| from enum import Enum | ||
| from typing import Optional | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| | ||
| class ModalityTier(str, Enum): | ||
| AUDIO_IN = "audio-in" # LLM receives raw audio parts | ||
| STT_BRIDGE = "stt-bridge" # audio -> STT -> text before LLM | ||
| TEXT = "text" # no audio in the stack | ||
| | ||
| | ||
| class ModalityNegotiationError(Exception): | ||
| """Raised when the declared modality is incompatible with adapter capabilities. | ||
| | ||
| Message always contains both the declared modality string and the conflicting | ||
| capability value (e.g. 'realtime' and 'mulaw/8000'). | ||
| """ | ||
| | ||
| | ||
| def _litellm_advisory(model_id: str) -> bool: | ||
| """Return True if litellm believes model_id can ingest audio input.""" | ||
| try: | ||
| import litellm.utils | ||
| return bool(litellm.utils.supports_audio_input(model=model_id)) | ||
| except Exception: | ||
| return False | ||
| | ||
| | ||
| def resolve_modality( | ||
| *, | ||
| declaration: Optional[str], # None = no explicit declaration | ||
| model_id: str, | ||
| ) -> tuple[ModalityTier, list[str]]: | ||
| """Resolve the modality tier for a single role. | ||
| | ||
| Returns (tier, warnings). Warnings are human-readable strings the caller | ||
| should emit via logger.warning(). | ||
| | ||
| Resolution rules: | ||
| - If declaration is given AND litellm agrees -> use declared tier, no warning. | ||
| - If declaration is given AND litellm disagrees -> use declared tier, emit WARNING. | ||
| - If no declaration -> use litellm advisory as truth, no warning. | ||
| """ | ||
| advisory_audio = _litellm_advisory(model_id) | ||
| | ||
| if declaration is None: | ||
| tier = ModalityTier.AUDIO_IN if advisory_audio else ModalityTier.TEXT | ||
| return tier, [] | ||
| | ||
| # Normalize declaration string to ModalityTier | ||
| try: | ||
| declared_tier = ModalityTier(declaration) | ||
| except ValueError as exc: | ||
| raise ModalityNegotiationError( | ||
| f"Unknown modality declaration {declaration!r}; valid values: " | ||
| + ", ".join(t.value for t in ModalityTier) | ||
| ) from exc | ||
| | ||
| warnings: list[str] = [] | ||
| declared_audio = declared_tier == ModalityTier.AUDIO_IN | ||
| | ||
| if declared_audio and not advisory_audio: | ||
| warnings.append( | ||
| f"Model {model_id!r} declared modality 'audio-in' but litellm " | ||
| f"reports it does NOT support audio input. " | ||
| f"The declared modality 'audio-in' will be used. " | ||
| f"If this is wrong, remove the declaration or file a litellm issue." | ||
| ) | ||
| elif not declared_audio and advisory_audio: | ||
| warnings.append( | ||
| f"Model {model_id!r} declared modality {declaration!r} but litellm " | ||
| f"reports it DOES support audio input. " | ||
| f"The declared modality {declaration!r} will be used." | ||
| ) | ||
| | ||
| return declared_tier, warnings | ||
| | ||
| | ||
| def validate_modality_setup( | ||
| *, | ||
| tier: ModalityTier, | ||
| adapter_input_formats: list[str], | ||
| adapter_name: str, | ||
| ) -> None: | ||
| """Raise ModalityNegotiationError if tier is statically incompatible with adapter. | ||
| | ||
| 'audio-in' requires a pcm16-family input format. Adapters that only offer | ||
| mulaw/* (telephony) cannot pass audio directly to the LLM. | ||
| """ | ||
| if tier == ModalityTier.AUDIO_IN: | ||
| pcm_formats = [f for f in adapter_input_formats if f.startswith("pcm16")] | ||
| if adapter_input_formats and not pcm_formats: | ||
| # Has formats, none are pcm16-compatible — static impossible | ||
| raise ModalityNegotiationError( | ||
| f"Declared modality 'audio-in' is incompatible with adapter " | ||
| f"{adapter_name!r}: input formats {adapter_input_formats!r} " | ||
| f"contain no pcm16 path (conflicting capability: " | ||
| f"{adapter_input_formats[0]!r}). No resample path exists." | ||
| ) |
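
The resolution rules in the `resolve_modality` docstring and the static check in `validate_modality_setup` can be read as a small truth table. The sketch below is not the merged code: `Tier`, `resolve`, and `has_pcm16_path` are hypothetical stand-ins, and the litellm advisory is passed in as a plain boolean (an assumption made so the example runs without litellm installed). It only illustrates which combinations yield a warning and which format lists pass the pcm16 check.

```python
from __future__ import annotations

from enum import Enum
from typing import Optional


class Tier(str, Enum):
    # Mirrors the three values of ModalityTier in the diff above.
    AUDIO_IN = "audio-in"
    STT_BRIDGE = "stt-bridge"
    TEXT = "text"


def resolve(declaration: Optional[str], advisory_audio: bool) -> tuple[Tier, bool]:
    """Hypothetical stand-in for resolve_modality; returns (tier, warned)."""
    if declaration is None:
        # No declaration: the advisory decides and nothing is warned.
        return (Tier.AUDIO_IN if advisory_audio else Tier.TEXT), False
    declared = Tier(declaration)  # raises ValueError for unknown strings
    # The declaration always wins; a warning only flags disagreement
    # between "declared audio-in" and "advisory says audio input works".
    warned = (declared == Tier.AUDIO_IN) != advisory_audio
    return declared, warned


def has_pcm16_path(input_formats: list[str]) -> bool:
    """Static check as in validate_modality_setup; an empty list passes."""
    return not input_formats or any(f.startswith("pcm16") for f in input_formats)


if __name__ == "__main__":
    # Print every declaration/advisory combination as one table row.
    for decl in (None, "audio-in", "stt-bridge", "text"):
        for adv in (False, True):
            tier, warned = resolve(decl, adv)
            print(f"{decl!s:<11} advisory={adv!s:<5} -> {tier.value:<10} warn={warned}")
```

Note that under these rules a `stt-bridge` or `text` declaration warns whenever the advisory reports audio support, and an adapter that reports no input formats at all is never rejected by the static check.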