## 💥 GGUF support!

Thanks to both @diegovelilla and @vm7608, hf-mem now supports GGUF!

```shell
uvx hf-mem --model-id TheBloke/deepseek-llm-7B-chat-GGUF --gguf-file deepseek-llm-7b-chat.Q2_K.gguf --experimental
```
## 🐍 Use via Python!

Now you can use hf-mem programmatically with Python as:

```python
from hf_mem import run

run(model_id="MiniMaxAI/MiniMax-M2", experimental=True)
# Result(model_id='MiniMaxAI/MiniMax-M2', revision='main', filename=None, memory=230121630720, kv_cache=24964497408, total_memory=255086128128, details=False)
```

## 🐛 Fixes
- The table now displays the memory requirements in GiB instead of GB to be more accurate, thanks to @vrdn-23!
- The KV cache estimations for Safetensors now use a more accurate formula that properly handles both full and sliding attention, rather than always assuming full attention, and use the `head_dim` if specified instead of calculating it, thanks to https://huggingface.co/YouJiacheng in https://huggingface.co/Qwen/Qwen3.5-397B-A17B/discussions/20#69a5bf82a2b3b0f27e8eacef
```shell
uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B --experimental --kv-cache-dtype fp8
```
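To illustrate why the full-vs-sliding distinction matters for the estimate, here is a minimal sketch of the standard KV cache sizing formula. This is not hf-mem's actual implementation; the function name and parameters are hypothetical:

```python
# Hypothetical sketch of KV cache sizing, not hf-mem's actual code.
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, dtype_bytes=2,
                   sliding_window=None, full_attention_layers=None):
    """Estimate KV cache size in bytes for a given sequence length."""
    total = 0
    for layer in range(num_layers):
        is_full = full_attention_layers is None or layer in full_attention_layers
        # Sliding-attention layers only ever cache up to `sliding_window` tokens,
        # so assuming full attention everywhere overestimates their share.
        tokens = seq_len if is_full else min(seq_len, sliding_window)
        # Keys and values each store (tokens, num_kv_heads, head_dim) entries.
        total += 2 * tokens * num_kv_heads * head_dim * dtype_bytes
    return total
```

For a model where half the layers use a sliding window, this can noticeably shrink the estimate at long sequence lengths compared with the all-full-attention assumption.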
## 🚨 Deprecated!

`--ignore-table-width` is deprecated and won't have any effect, in favour of always resizing the table to fit the content regardless of how wide it is.
## What's Changed

- Fix display from GB to GiB for consistency by @vrdn-23 in #34
- Add `version` display on table and JSON output by @alvarobartt in #36
- Add support for GGUF files + KV cache from GGUF metadata by @diegovelilla in #25
- (fix) Fetch referenced config on models that require it by @Napuh in #38
- Release version `v0.5.0` + add `--hf-token`, use `hf-mem` as lib, etc. by @alvarobartt in #39
## New Contributors

- @vrdn-23 made their first contribution in #34
- @diegovelilla made their first contribution in #25

**Full Changelog**: 0.4.4...0.5.0

