v0.2.5

@jundot jundot released this 07 Mar 18:31
· 445 commits to main since this release

What's New

Features

  • Presence penalty & min_p sampling: added presence_penalty and min_p as new sampling parameters for finer control over generation behavior. Both are configurable per model from the admin panel's model settings. (#94)
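
For readers unfamiliar with the two parameters, here is a minimal sketch of how they are commonly defined. It is not oMLX's implementation: the function names `apply_presence_penalty` and `min_p_filter` are hypothetical, and it uses plain Python lists instead of MLX arrays. A presence penalty subtracts a flat amount from the logit of every token that has already appeared; min_p drops every token whose probability falls below `min_p` times the probability of the most likely token.

```python
import math


def apply_presence_penalty(logits, seen_token_ids, presence_penalty):
    """Subtract a flat penalty once from each token that already appeared."""
    penalized = list(logits)
    for token_id in set(seen_token_ids):  # set(): repeats are not penalized twice
        penalized[token_id] -= presence_penalty
    return penalized


def min_p_filter(logits, min_p):
    """Return renormalized probabilities after min_p filtering.

    Tokens with probability below min_p * (top probability) are set to 0.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    kept_total = sum(kept)  # the top token always survives, so this is > 0
    return [p / kept_total for p in kept]
```

With `min_p = 0.1`, for example, a token survives only if it is at least one tenth as likely as the top candidate, so the cutoff adapts to how confident the model is at each step.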

Bug Fixes

  • Metal crash on concurrent add_request: serialized add_request calls through the MLX executor to prevent Metal GPU crashes under concurrent request submission. (#95)
  • HuggingFace model search broken: removed the deprecated direction parameter from the huggingface_hub.list_models() call, which was silently breaking model search results.
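
As an illustration of the serialization approach behind the Metal crash fix, the sketch below funnels every submission through a single-worker executor so that no two `add_request` calls overlap. This is an assumption-laden sketch, not the project's code: `SerializedEngine` and `_add_request_sync` are hypothetical names, and a plain `ThreadPoolExecutor` stands in for the MLX executor.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class SerializedEngine:
    """Hypothetical engine whose add_request calls never run concurrently."""

    def __init__(self):
        # A single worker thread means submitted calls run one at a time.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self.active = 0       # calls currently inside the critical section
        self.max_active = 0   # highest concurrency ever observed
        self.requests = []

    def _add_request_sync(self, request_id):
        # Stand-in for the GPU-touching work that must not overlap.
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        self.requests.append(request_id)
        self.active -= 1

    async def add_request(self, request_id):
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(
            self._executor, self._add_request_sync, request_id
        )


async def submit_all(engine, count):
    # Fire all submissions at once; the executor still runs them serially.
    await asyncio.gather(*(engine.add_request(i) for i in range(count)))


engine = SerializedEngine()
asyncio.run(submit_all(engine, 8))
```

Callers can still submit requests concurrently from the event loop; only the section that touches the device is forced into a single lane.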

Dependencies

  • mlx-vlm updated to 348466f: adds support for new VLM model types (MiniCPM-O, Phi-4-reasoning-vision, Phi-4-Multimodal) and includes various bug fixes. oMLX's model discovery and vision input pipeline were updated accordingly.

Thanks to @rsnow for reporting the Metal crash issue!