fix: STT providers hardcode audio/mpeg multipart content-type regardless of file extension #147

@lfnovo

Problem

Multiple STT providers hardcode audio/mpeg as the multipart content-type when uploading the audio file, regardless of the file's actual format.

Example pattern:

files = {"file": (filename, audio_file, "audio/mpeg")}  # ← always audio/mpeg

If the user passes a .wav, .flac, .m4a, .ogg, or .webm file, the multipart upload will declare it as MP3.

Why it matters

Most servers (OpenAI, Mistral, Groq) inspect the audio content rather than trusting the multipart Content-Type hint, so this hasn't visibly broken anything. But:

  1. It's a lying header — bad form.
  2. Some servers (Azure in some configurations, less-mature OpenAI-compatible servers like local Whisper deployments) DO trust the hint and reject mismatched content.
  3. New STT providers added by following these as templates inherit the bug.

Proposed fix

A small helper in src/esperanto/providers/stt/base.py (or a shared util):

import mimetypes

_DEFAULT_AUDIO_CONTENT_TYPE = "audio/mpeg"

def _guess_audio_content_type(filename: str) -> str:
    """Guess multipart Content-Type from filename extension."""
    guessed, _ = mimetypes.guess_type(filename)
    if guessed and guessed.startswith("audio/"):
        return guessed
    return _DEFAULT_AUDIO_CONTENT_TYPE

Then each STT provider's transcribe/atranscribe path becomes:

files = {"file": (filename, audio_file, _guess_audio_content_type(filename))}

Files with unknown extensions or non-audio guesses fall back to audio/mpeg, preserving today's behavior.

Acceptance criteria

  • All STT providers in src/esperanto/providers/stt/ use the helper instead of hardcoding audio/mpeg.
  • A .wav file passed to transcribe() is uploaded with audio/wav as the multipart Content-Type of the file part.
  • An unknown-extension or stream-without-name input still uses audio/mpeg as a safe default.
  • Tests assert the correct Content-Type for at least 2-3 file extensions per affected provider.
  • Validator + ruff + mypy all clean.
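
The stream-without-name criterion needs a filename to come from somewhere before the helper can run. A minimal sketch of one way to handle it, assuming a hypothetical build_upload_part function and a hypothetical placeholder name audio.mp3 (neither is taken from the codebase): prefer an explicit filename, then the stream's own name attribute, then the placeholder.

```python
import io
import mimetypes
import os
from typing import BinaryIO, Optional, Tuple

_DEFAULT_AUDIO_CONTENT_TYPE = "audio/mpeg"
_DEFAULT_FILENAME = "audio.mp3"  # hypothetical placeholder, not from the issue


def _guess_audio_content_type(filename: str) -> str:
    """Helper as proposed in the issue."""
    guessed, _ = mimetypes.guess_type(filename)
    if guessed and guessed.startswith("audio/"):
        return guessed
    return _DEFAULT_AUDIO_CONTENT_TYPE


def build_upload_part(
    audio_file: BinaryIO, filename: Optional[str] = None
) -> Tuple[str, BinaryIO, str]:
    """Build the (filename, fileobj, content_type) tuple for a multipart upload."""
    # Explicit filename first, then the stream's own name if it has one.
    name = filename or getattr(audio_file, "name", None)
    # Streams opened from a file descriptor expose an int name; ignore those.
    if not isinstance(name, str) or not name:
        name = _DEFAULT_FILENAME
    # Send only the base name, never a full local path.
    name = os.path.basename(name)
    return (name, audio_file, _guess_audio_content_type(name))


if __name__ == "__main__":
    # An in-memory stream has no name, so the default applies.
    part = build_upload_part(io.BytesIO(b"fake audio bytes"))
    print(part[0], part[2])
```

A provider test could then assert on the third element of the tuple for each extension it cares about, plus one unnamed io.BytesIO case for the fallback.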

Origin

Follow-up from PR #143 (issue #142) — called out in the 'Follow-ups' section of that PR's description. Same pattern was identified in the existing OpenAI STT provider during review.

Labels: ready (issue is fully specified and ready for the development team to pick up)