
v0.5.0 (Latest)

Released by @alvarobartt on 13 Mar 14:00 · ad02b07

💥 GGUF support!

Thanks to both @diegovelilla and @vm7608, hf-mem now supports GGUF!

uvx hf-mem --model-id TheBloke/deepseek-llm-7B-chat-GGUF --gguf-file deepseek-llm-7b-chat.Q2_K.gguf --experimental

🐍 Use via Python!

You can now use hf-mem programmatically from Python:

from hf_mem import run

run(model_id="MiniMaxAI/MiniMax-M2", experimental=True)
# Result(model_id='MiniMaxAI/MiniMax-M2', revision='main', filename=None, memory=230121630720, kv_cache=24964497408, total_memory=255086128128, details=False)
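The memory fields on the returned Result are raw byte counts. A small helper (hypothetical, not part of hf-mem) can render them in the GiB units the CLI table uses:

```python
def to_gib(num_bytes: int) -> str:
    # 1 GiB = 1024**3 bytes; two decimal places, matching the GiB
    # units hf-mem reports in its table output.
    return f"{num_bytes / 1024**3:.2f} GiB"

# total_memory from the MiniMax-M2 result above
print(to_gib(255086128128))  # → 237.57 GiB
```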

🐛 Fixes

uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B --experimental --kv-cache-dtype fp8


🚨 Deprecated!

  • --ignore-table-width is deprecated and no longer has any effect; the table is now always resized to fit its content, regardless of width.

What's Changed

  • Fix display from GB to GiB for consistency by @vrdn-23 in #34
  • Add version display on table and JSON output by @alvarobartt in #36
  • Add support for GGUF files + KV cache from GGUF metadata by @diegovelilla in #25
  • (fix) Fetch referenced config on models that require it by @Napuh in #38
  • Release version v0.5.0 + add --hf-token, use hf-mem as lib, etc. by @alvarobartt in #39

Full Changelog: 0.4.4...0.5.0