TranslationRecognizer silently drops target languages when using overlapping or prefix-colliding language codes in add_target_language
IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:
- Speech SDK log: Not yet captured. Can provide on request if needed. The issue is consistently reproducible with the combinations listed below.
- Simplified source code: Minimal reproduction script included below.
- WAV file: Not relevant. The issue occurs regardless of audio content; any input audio that produces a successful TranslatedSpeech result will reproduce the problem.
Describe the bug
When calling add_target_language() with multiple target languages on SpeechTranslationConfig, certain combinations of language codes cause one or more translations to be silently missing from result.translations. The affected language key simply does not appear in the translations dictionary.
This happens in two scenarios:
- Same base language with different specificity, e.g., en combined with en-US. The service appears to internally resolve en to en-US, causing a collision. ['en', 'en-GB'] does not collide, which supports this theory.
- Prefix collision between different languages, e.g., fil (Filipino) followed by fi (Finnish). When fil is added first, fi is dropped. The reversed order works fine.
Crucially, no error is raised. The Canceled event does not fire. result.reason is still ResultReason.TranslatedSpeech. The failing language is simply absent from the result dictionary, causing silent data loss.
To Reproduce
- Create a SpeechTranslationConfig using the v2 universal endpoint.
- Call add_target_language() with one of the failing combinations listed below.
- Perform speech recognition with any valid audio input.
- Inspect result.translations: the dropped language key will be missing entirely.
Minimal reproduction script
import os

import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech import translation, languageconfig

speech_key = os.environ.get("AZURE_SPEECH_KEY")
speech_region = os.environ.get("AZURE_SPEECH_REGION")
endpoint = f"wss://{speech_region}.stt.speech.microsoft.com/speech/universal/v2"

translation_config = translation.SpeechTranslationConfig(
    subscription=speech_key, endpoint=endpoint
)

# ❌ BUG: "en-US" will be silently dropped from results
target_langs = ["en", "en-US"]
# ✅ WORKS: both translations appear
# target_langs = ["en", "en-GB"]

for lang in target_langs:
    translation_config.add_target_language(lang)

auto_detect = languageconfig.AutoDetectSourceLanguageConfig(
    languages=["ja-JP"]
)
audio_config = speechsdk.audio.AudioConfig(filename="test_audio.wav")

recognizer = translation.TranslationRecognizer(
    translation_config=translation_config,
    audio_config=audio_config,
    auto_detect_source_language_config=auto_detect,
)

result = recognizer.recognize_once()

if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print(f"Recognized: {result.text}")
    print(f"Translation keys: {list(result.translations.keys())}")
    for lang, text in result.translations.items():
        print(f" [{lang}] {text}")
    # ⚠️ "en-US" key will be missing here; no error is raised
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation = result.cancellation_details
    print(f"Canceled: {cancellation.reason}, {cancellation.error_details}")
Test results — English locale combinations
| add_target_language() order | Result |
| --- | --- |
| ['en', 'en-GB', 'en-US'] | ❌ en-US missing from translations |
| ['en', 'en-US', 'en-GB'] | ❌ en-US missing from translations |
| ['en', 'en-GB'] | ✅ Pass |
| ['en', 'en-US'] | ❌ en-US missing from translations |
| ['en-GB', 'en', 'en-US'] | ❌ en missing from translations |
| ['en-GB', 'en-US', 'en'] | ❌ en missing from translations |
| ['en-GB', 'en-US'] | ✅ Pass |
| ['en-GB', 'en'] | ❌ en missing from translations |
| ['en-US', 'en', 'en-GB'] | ❌ en missing from translations |
| ['en-US', 'en-GB', 'en'] | ❌ en missing from translations |
| ['en-US', 'en-GB'] | ✅ Pass |
| ['en-US', 'en'] | ❌ en missing from translations |
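The twelve rows above correspond to every ordering of two or three of the codes en, en-GB, and en-US. For anyone who wants to re-run or extend the matrix, a minimal sketch for generating those orderings follows. The helper name `ordered_combinations` is hypothetical and not part of the Speech SDK; each generated list would be passed to add_target_language() in the reproduction script.

```python
from itertools import permutations


def ordered_combinations(codes, min_size=2):
    """Return every ordering of every subset of `codes` with at least `min_size` items."""
    combos = []
    for size in range(min_size, len(codes) + 1):
        # permutations() yields tuples; convert to lists to match target_langs.
        combos.extend(list(p) for p in permutations(codes, size))
    return combos


matrix = ordered_combinations(["en", "en-GB", "en-US"])
for target_langs in matrix:
    print(target_langs)
```

Adding a fourth code (for example another English locale) to the input list would extend the matrix without any other change.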
Test results — Prefix collision (fi vs fil)
| add_target_language() order | Result |
| --- | --- |
| ['fi', 'fil'] | ✅ Pass |
| ['fil', 'fi'] | ❌ fi missing from translations |
Observed patterns
- en and en-US collide in either order: whichever is added later is dropped. en-US + en-GB passes in both orders, and ['en', 'en-GB'] passes, which suggests the service resolves en → en-US internally. Note that ['en-GB', 'en'] drops en, so the en/en-GB pair is order-dependent rather than collision-free.
- When 3 English variants are combined, the dropped code is always en or en-US, and en-GB is never affected. In five of the six orderings the dropped code is whichever of en and en-US was added later; in ['en-GB', 'en', 'en-US'], en is dropped even though it precedes en-US. That en-GB is never dropped is consistent with the theory that en is internally resolved to en-US.
- fil before fi drops fi, but the reversed order works. This points to a prefix-matching issue in internal language routing.
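Every failing row in the tables contains at least one pair of codes where one is a string prefix of the other (en/en-US, en/en-GB, fi/fil). As a possible pre-flight check, the sketch below flags such pairs before they are passed to add_target_language(). This is a heuristic inferred only from the rows above, not from documented service behavior, and it is deliberately conservative: it also flags ['en', 'en-GB'] and ['fi', 'fil'], which passed. The function name `prefix_colliding_pairs` is hypothetical.

```python
from itertools import combinations


def prefix_colliding_pairs(target_langs):
    """Return pairs of codes where one is a case-insensitive string prefix of the other."""
    pairs = []
    for a, b in combinations(target_langs, 2):
        la, lb = a.lower(), b.lower()
        # Identical codes are skipped; only strict prefix relationships are flagged.
        if la != lb and (la.startswith(lb) or lb.startswith(la)):
            pairs.append((a, b))
    return pairs


print(prefix_colliding_pairs(["fil", "fi"]))       # [('fil', 'fi')]
print(prefix_colliding_pairs(["en-US", "en-GB"]))  # []
```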
Expected behavior
Either:
- All languages passed to add_target_language() should produce a translation result in result.translations, or
- The SDK should raise an explicit error (e.g., a Canceled event with error details) when an unsupported language combination is configured.
Silent omission of translation results with no error is the worst possible failure mode for a translation service.
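Until the SDK reports this itself, a caller can turn the silent omission into an explicit failure by comparing the requested codes against the keys of result.translations. The following is a minimal sketch, assuming the dictionary keys match the codes passed to add_target_language() verbatim (as the results above suggest). `MissingTranslationError` and `assert_all_translations_present` are hypothetical names, and a plain dict stands in for result.translations so the example runs without the service.

```python
class MissingTranslationError(RuntimeError):
    """Raised when a requested target language is absent from the result."""


def assert_all_translations_present(requested, translations):
    """Raise if any requested target code has no key in `translations`."""
    missing = [lang for lang in requested if lang not in translations]
    if missing:
        raise MissingTranslationError(
            f"No translation returned for: {', '.join(missing)}"
        )


# Stand-in for result.translations from the failing ["en", "en-US"] case.
translations = {"en": "Hello"}
try:
    assert_all_translations_present(["en", "en-US"], translations)
except MissingTranslationError as exc:
    print(exc)
```

In the reproduction script, the call would go inside the ResultReason.TranslatedSpeech branch, with target_langs and result.translations as arguments.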
Version of the Cognitive Services Speech SDK
1.49.0 (azure-cognitiveservices-speech)
Platform, Operating System, and Programming Language
- OS: Linux (Ubuntu-based)
- Hardware: x64
- Programming language: Python 3.8
Additional context
- The documentation for Language Identification states: "Don't include multiple locales of the same language, for example, en-US and en-GB" — but this restriction is documented only for Language Identification candidate languages, not for translation target languages. The Speech Translation how-to guide has no equivalent warning.
- The language support page recommends using language codes (e.g., es instead of es-ES) for translation targets, but does not document the collision behavior.
- If this behavior is by design, it should be explicitly documented, and the SDK should emit a warning or error at configuration time rather than silently dropping results.