
feat: add per-request model selection#11

Open
Odrec wants to merge 3 commits into namastexlabs:main from Odrec:feat/per-request-model-selection

Conversation

Contributor

@Odrec Odrec commented Feb 10, 2026

Summary

Allow API clients to specify which Whisper model to use per request via the new model form parameter on POST /v1/transcript.

Problem

Currently, the model is set server-wide via MURMURAI_MODEL (defaults to large-v3-turbo). Clients have no way to request a different model per transcription job. This is limiting for applications that want to offer model selection to their users (e.g., faster/lighter models for quick previews, larger models for accuracy).

Solution

Add an optional model parameter to the transcript submission endpoint:

  • When omitted or empty string: uses the server default from settings (fully backward compatible)
  • When specified (e.g., base, small, large-v2): ModelManager loads and caches the requested model separately

The existing cache system (up to 3 custom model configs) naturally handles model variants. The cache key now includes the model name, so base with default options and large-v3-turbo with default options are cached independently.
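A minimal sketch of how a cache key can fold the model name into the options hash so that identical options under different models map to distinct cache entries. The function name `make_cache_key` and the flat-dict options shape are illustrative, not the actual ModelManager code:

```python
import hashlib
import json

def make_cache_key(model_name: str, options: dict) -> str:
    """Hash the model name together with the options so the same
    options under different models produce distinct cache entries."""
    payload = json.dumps({"model": model_name, **options}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

key_default = make_cache_key("large-v3-turbo", {"beam_size": 5})
key_base = make_cache_key("base", {"beam_size": 5})
assert key_default != key_base  # cached independently
```

Sorting the keys before hashing keeps the key deterministic regardless of dict insertion order.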

Changes

File Change
server.py Add model Form parameter to submit_transcript(), sanitize empty string to None, pass to TranscribeOptions
transcriber.py Add `model` field to TranscribeOptions dataclass, pass options.model to ModelManager.get_model()
model_manager.py Accept model_name param in get_model() and _get_custom_model(), include model in cache key hash, use requested model name in load_model() call
test_server.py Add 2 tests: model param accepted, empty string falls back to default
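The empty-string-to-None sanitization in server.py can be sketched as a small helper. In the actual endpoint this sits behind a FastAPI `Form` parameter (roughly `model: Optional[str] = Form(None)`); the helper name and default constant here are illustrative:

```python
from typing import Optional

DEFAULT_MODEL = "large-v3-turbo"  # stand-in for the MURMURAI_MODEL setting

def resolve_model(model: Optional[str]) -> str:
    # Both None and "" fall back to the server default, so `-F model=`
    # behaves the same as omitting the parameter (backward compatible).
    return model if model else DEFAULT_MODEL

assert resolve_model(None) == "large-v3-turbo"
assert resolve_model("") == "large-v3-turbo"
assert resolve_model("base") == "base"
```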

API Usage

# Use server default model
curl -X POST /v1/transcript -F file=@audio.mp3

# Use specific model
curl -X POST /v1/transcript -F file=@audio.mp3 -F model=base
curl -X POST /v1/transcript -F file=@audio.mp3 -F model=large-v2

Allow API clients to specify which Whisper model to use per request
via the new 'model' form parameter on POST /v1/transcript.

When omitted (or empty string), the server default from settings is
used (backward compatible). When specified, ModelManager loads and
caches the requested model separately from the default.

Changes:
- server.py: Add 'model' Form parameter to submit_transcript()
- transcriber.py: Add 'model' field to TranscribeOptions dataclass
- model_manager.py: Accept model_name in get_model() and
  _get_custom_model(), include model in cache key hash
- test_server.py: Add tests for model param and empty-string fallback

Supported model names: tiny, base, small, medium, large-v1, large-v2,
large-v3, large-v3-turbo (any name accepted by faster-whisper).
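The TranscribeOptions change described above amounts to one optional field whose `None` value means "use the server default". A sketch, assuming a dataclass; the `language` field is illustrative, standing in for the existing options:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TranscribeOptions:
    language: Optional[str] = None  # illustrative existing field
    model: Optional[str] = None     # new: per-request model override

opts = TranscribeOptions(model="base")
assert opts.model == "base"
assert TranscribeOptions().model is None  # default: server model
```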
@gemini-code-assist

Summary of Changes

Hello @Odrec, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the flexibility of the transcription service by enabling per-request Whisper model selection. Previously, the model was a server-wide setting, limiting client applications that needed to offer diverse model choices (e.g., faster models for previews, more accurate ones for final output). The changes introduce a new API parameter and update the model management logic to dynamically load and cache different models based on client requests, while maintaining full backward compatibility.

Highlights

  • Per-Request Model Selection: Introduced a new 'model' form parameter for 'POST /v1/transcript', allowing API clients to specify the Whisper model to use for each transcription request.
  • Enhanced Model Caching: Enhanced the 'ModelManager' to incorporate the 'model_name' into the cache key, enabling independent caching of different model variants with custom options.
  • Backward Compatibility: Ensured full backward compatibility: if the 'model' parameter is omitted or an empty string, the server's default Whisper model is used.
Changelog
  • src/murmurai_server/model_manager.py
    • Modified '_hash_options' to include 'model_name' in the cache key.
    • Updated 'get_model' to accept an optional 'model_name' parameter.
    • Adjusted the fast-path logic in 'get_model' to consider the 'effective_model'.
    • Modified '_get_custom_model' to accept and use the 'model_name' for loading.
    • Updated logging messages to reflect the specific model being loaded or cached.
  • src/murmurai_server/server.py
    • Added an optional 'model' form parameter to the 'submit_transcript' endpoint.
    • Implemented logic to sanitize an empty 'model' string to 'None'.
    • Passed the resolved 'model' parameter to the 'TranscribeOptions' object.
  • src/murmurai_server/transcriber.py
    • Added a 'model' field to the 'TranscribeOptions' class.
    • Updated the 'transcribe' method to pass the 'options.model' to 'ModelManager.get_model'.
  • tests/test_server.py
    • Added 'test_submit_transcript_with_model' to verify per-request model selection.
    • Added 'test_submit_transcript_with_empty_model_uses_default' to confirm fallback to server default.
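Taken together, the `get_model` changes above boil down to resolving the effective model name before consulting the cache. A rough sketch, not the real ModelManager (the string "loaded:…" stands in for a `faster_whisper.WhisperModel` instance, and the real code hashes options into the key as well):

```python
from typing import Optional

class ModelManager:
    def __init__(self, default_model: str = "large-v3-turbo"):
        self.default_model = default_model
        self._cache: dict[str, object] = {}

    def get_model(self, model_name: Optional[str] = None):
        # Resolve the effective model first, so the fast path only
        # applies when the request matches an already-loaded model.
        effective = model_name or self.default_model
        if effective not in self._cache:
            # stand-in for faster_whisper.WhisperModel(effective, ...)
            self._cache[effective] = f"loaded:{effective}"
        return self._cache[effective]

mm = ModelManager()
assert mm.get_model() == "loaded:large-v3-turbo"
assert mm.get_model("base") == "loaded:base"
```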


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a valuable feature for per-request model selection and correctly propagates the new model parameter from the API endpoint down to the model loading logic. However, a critical path traversal vulnerability was identified: the model parameter on the /v1/transcript endpoint is not validated, so an attacker could supply arbitrary file paths, potentially leading to unauthorized file access and even remote code execution. Remediation details for implementing an allow-list of model names are provided in a code comment. Additionally, while the cache-key changes are correctly implemented, the new tests should also verify that the background task actually uses the requested model, not just that the API endpoint accepts the parameter.

Comment thread src/murmurai_server/model_manager.py
Comment thread tests/test_server.py
Odrec added 2 commits February 10, 2026 15:12
- Add ALLOWED_MODELS allow-list to prevent path traversal attacks
  via the model parameter (security-critical fix)
- Validate model name against allow-list before processing, return
  400 with clear error message for invalid model names
- Improve tests: mock process_transcription to verify model value
  is correctly propagated to TranscribeOptions
- Add test for path traversal rejection (../../etc/passwd)
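The allow-list fix described in these commits can be sketched as follows. The set contents follow the "Supported model names" list above; the helper name `validate_model` is illustrative, and in the endpoint the rejection would surface as an HTTP 400 rather than a ValueError:

```python
from typing import Optional

ALLOWED_MODELS = {
    "tiny", "base", "small", "medium",
    "large-v1", "large-v2", "large-v3", "large-v3-turbo",
}

def validate_model(model: Optional[str]) -> Optional[str]:
    if not model:
        return None  # omitted or empty: fall back to server default
    if model not in ALLOWED_MODELS:
        # in the endpoint this becomes a 400 response with a clear message
        raise ValueError(f"invalid model name: {model!r}")
    return model

assert validate_model("base") == "base"
assert validate_model("") is None
# path traversal attempts are rejected outright
try:
    validate_model("../../etc/passwd")
    raised = False
except ValueError:
    raised = True
assert raised
```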
