Skip to content

Conversation

@antimora
Copy link
Contributor

@antimora antimora commented Oct 2, 2025

Problem

The serialize() function accepted Option<HashMap<String, String>> in its public API, which exposed hashbrown as a dependency and locked users to a
specific version (0.15.5). This caused type mismatch errors in downstream crates like burn-store that use different hashbrown versions:

error[E0308]: mismatched types
  --> crates/burn-store/src/safetensors/store.rs:539:65
   |
   | let data = safetensors::serialize(tensors, Some(metadata))?;
   |                                            ^^^^ expected HashMap<String, String>, 
        found a different HashMap<String, String>
   |
note: two different versions of crate hashbrown are being used

Solution

Changed the public API to use BTreeMap<String, String> instead of HashMap:

pub fn serialize<S, V, I>(
    data: I,
    data_info: Option<BTreeMap<String, String>>,  //  Now BTreeMap
) -> Result<Vec<u8>, SafeTensorError>

Why BTreeMap?

  1. No third-party dependency - BTreeMap is part of Rust's standard library (alloc::collections), available in both std and no_std
  2. No version conflicts - Not tied to any external crate version
  3. Still supports no_std - Available in alloc::collections::BTreeMap
  4. Maintains performance - Internal index_map still uses HashMap for O(1) lookups
  5. Added bonus the order is preserved

Changes

  • Public API: All metadata parameters now use BTreeMap
  • Internal implementation: HashMap still used for index_map (performance)
  • Imports: Unified approach using alloc for both std and no_std
  • Removed facade: Simplified from mod lib to direct imports

Compatibility

  • Backward compatible: serialize(tensors, None) still works without type annotations
  • All existing tests pass
  • No_std builds work (cargo build --lib --no-default-features)
  • No clippy warnings

This change decouples safetensors from hashbrown versions while maintaining full functionality and no_std support.

Related to: #651 and tracel-ai/burn#3822

Replaces HashMap with BTreeMap for no-std compatibility instead of hashbrown lib which can break if dep versions are different.
@Narsil
Copy link
Contributor

Narsil commented Oct 7, 2025

No we cannot change the public interface for that.
I'm all up for removing hashbrown whenever HashMap makes it into alloc.

But forcing users to use BTreeMap is not OK.
I'm also relatively against using a trait in the signature.

If the issue is downstream hashbrown version, then we can upgrade our hashbrown version instead, should be much simpler no ? 0.16 was released a month ago....

@antimora
Copy link
Contributor Author

antimora commented Oct 7, 2025

@Narsil

No we cannot change the public interface for that. I'm all up for removing hashbrown whenever HashMap makes it into alloc.

But forcing users to use BTreeMap is not OK. I'm also relatively against using a trait in the signature.

If the issue is downstream hashbrown version, then we can upgrade our hashbrown version instead, should be much simpler no ? 0.16 was released a month ago....

I am afraid we will have the same issue each time hashbrown version is upgraded.

How about we replace hashbrown with BTreeMap conditionally for no_std just like what you have now? I think the user impact is minimal and I can assume there weren't that many consumers of no_std interface because I fixed no_std flag combo recently in #651.

I tried changing the metadata type to Optional<IntoIterator<Item = (S, V)>> which worked flawlessly for any type but it didn't for None because it had to have inner type like this: None::HashMap

@danieldk
Copy link
Member

danieldk commented Oct 7, 2025

How about we replace hashbrown with BTreeMap conditionally for no_std just like what you have now?

But then we'd have to switch back to HashMap when it's added to alloc (to have the two interfaces aligned). Seems easier to track hashbrown releases for the time being?

@antimora
Copy link
Contributor Author

antimora commented Oct 7, 2025

How about we replace hashbrown with BTreeMap conditionally for no_std just like what you have now?

But then we'd have to switch back to HashMap when it's added to alloc (to have the two interfaces aligned). Seems easier to track hashbrown releases for the time being?

I am not aware of any near future plan of HashMap to be part of the core. The standard library's HashMap is based on hashbrown internally but the reason std HashMap is not is because it relies on std random generator.

So I would say hashbrown should be switched before #651 is released. Otherwise, it will keep breaking people's build every time there is a new hashbrown lib release. With this, we are making safetensors interface more stable.

@Narsil
Copy link
Contributor

Narsil commented Oct 7, 2025

Wouldn't vendoring safetensors' hashbrown solve this issue ?

#[no_std]
pub use hashbrown;

Normally, rust doesn't break if multiple versions of a given crate exist in tree...

@antimora
Copy link
Contributor Author

Wouldn't vendoring safetensors' hashbrown solve this issue ?

#[no_std]
pub use hashbrown;

Normally, rust doesn't break if multiple versions of a given crate exist in tree...

I will check if your suggested solution works and will close this PR if does. However, it would have a been a good opportunity to decouple from 3rd party dependency for the API, but it's your call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants