v0.5.7

Andyyyy64 released this 19 May 19:52
· 46 commits to main since this release

What's Changed

  • Detect DGX Spark / NVIDIA GB10 as a shared-memory NVIDIA GPU when NVIDIA reports memory.total as unavailable.
  • Fix crashes in whichllm run for large Transformers models by providing an offload_folder.
  • Respect XDG_CACHE_HOME for cache paths, while ignoring relative values per the XDG spec.
  • Treat Apple Silicon as shared memory in fit detection.
  • Inline LiveBench fallback data and speed up benchmark score fetching.
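
The XDG_CACHE_HOME change above can be illustrated with a minimal sketch. It is not the project's actual code: the function name resolve_cache_dir, the app_name default, and the env parameter are hypothetical and exist only for illustration. It assumes the usual XDG behaviour, where an unset, empty, or relative XDG_CACHE_HOME falls back to ~/.cache.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_cache_dir(
    app_name: str = "whichllm",  # hypothetical default, for illustration only
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the cache directory, honouring XDG_CACHE_HOME when it is absolute."""
    env = os.environ if env is None else env
    raw = env.get("XDG_CACHE_HOME", "")

    # The XDG Base Directory spec says relative paths in these variables
    # are invalid and should be ignored, so fall back to the default.
    if raw and os.path.isabs(raw):
        base = Path(raw)
    else:
        base = Path.home() / ".cache"

    return base / app_name
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment, for example resolve_cache_dir(env={"XDG_CACHE_HOME": "relative/dir"}) falls back to the ~/.cache location.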

Validation

  • ruff format --check .
  • ruff check .
  • pytest -q -s
  • python -m build
  • twine check dist/*