Binary export/import#467

Open
ntjohnson1 wants to merge 9 commits into sandialabs:main from ntjohnson1:nick/binary_export_import

Conversation

@ntjohnson1
Collaborator

@ntjohnson1 ntjohnson1 commented Nov 14, 2025

This isn't fully ready to merge since it needs cross-repo collaboration, but I think it is close enough that concrete feedback is useful. There are also a few minor TODOs (adding a version to our headers, etc.). There are a few open questions around how to test this and whether people like the overall API.

Here is my branch for the matching half in MATLAB: https://gitlab.com/ntjohnson1/tensor_toolbox/-/commits/nick/import_export_compatibility (I figure we should make sure we like it here before opening discussion there)

General usage looks like:
MATLAB export

ktensor_instance = <value>;
export_data(ktensor_instance, 'ktensor.mat', 'output_type', 'binary');

Python import

import pyttb as ttb
ktensor_instance = ttb.import_data_mat('ktensor.mat')

OR
Python export

import pyttb as ttb
ktensor_instance = <value>
ttb.export_data_mat(ktensor_instance, 'ktensor.mat')

MATLAB import

ktensor_instance = import_data('ktensor.mat', 'input_type', 'binary');

NOTE: I left import_data_bin and export_data_bin on our side since it's not much overhead, and if you aren't using MATLAB it seems a little strange to write to .mat.

Closes #466

Testing options:

  • Check in an export file from the other library in each repo, and make a note to re-export and upload it if the format changes
    • On the pyttb side we can add a lint in CI to enforce this
  • Set up MATLAB in CI (on either repo) to actually run the compatibility test
    • If the MATLAB Tensor Toolbox already had this running, adding Python would be much easier, but it seems like a lot of effort to set up just for this
  • Other ideas?
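
One possible shape for the CI lint mentioned in the first option, as a minimal sketch only: record a SHA-256 digest for each checked-in fixture and fail when the bytes and the recorded digest drift apart. The manifest file, the fixture name, and the `check_fixtures` helper are all hypothetical; nothing here is part of pyttb or the Tensor Toolbox, and a real lint would probably also need to key on the format version.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_fixtures(manifest_path: Path) -> list:
    """Return names of fixtures whose bytes differ from the recorded digest.

    The manifest is a hypothetical JSON file mapping fixture file names
    (relative to the manifest) to the digest recorded at export time.
    """
    manifest = json.loads(manifest_path.read_text())
    root = manifest_path.parent
    return [
        name
        for name, expected in manifest.items()
        if sha256_of(root / name) != expected
    ]


# Demo with a throwaway directory standing in for the repo's test data.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fixture = root / "ktensor_from_matlab.mat"  # hypothetical fixture name
    fixture.write_bytes(b"stand-in bytes for a MATLAB export")
    manifest = root / "fixtures.json"
    manifest.write_text(json.dumps({fixture.name: sha256_of(fixture)}))

    unchanged = check_fixtures(manifest)  # digest matches the fixture
    fixture.write_bytes(b"re-exported with a different format")
    stale = check_fixtures(manifest)  # digest no longer matches
```

In CI, a non-empty result from `check_fixtures` would fail the job and prompt whoever changed the fixture to update the manifest (and the matching fixture in the other repo).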

Remaining steps:

  • Get initial review on the Python side
  • Get initial review on the MATLAB side
  • Merge one then the other

📚 Documentation preview 📚: https://pyttb--467.org.readthedocs.build/en/467/

@ntjohnson1
Collaborator Author

@gqcollins I would be curious to see if this helps your use case. AFAICT neither of these binary approaches uses pickle, so they should be more generally compatible across systems/versions.
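
To illustrate what a pickle-free binary layout with a version in the header could look like, here is a minimal sketch using only the standard library. It is not the format used in this PR: the magic bytes, the version number, the header layout, and the `write_dense`/`read_dense` helpers are all assumptions made for illustration.

```python
import io
import math
import struct

MAGIC = b"TTBX"      # hypothetical magic bytes, not the PR's format
FORMAT_VERSION = 1   # hypothetical header version
HEADER = struct.Struct("<4sHH")  # magic, version, number of dimensions


def write_dense(stream, shape, values):
    """Write a little-endian header, the shape, then float64 values."""
    if len(values) != math.prod(shape):
        raise ValueError("values do not match shape")
    stream.write(HEADER.pack(MAGIC, FORMAT_VERSION, len(shape)))
    stream.write(struct.pack(f"<{len(shape)}Q", *shape))
    stream.write(struct.pack(f"<{len(values)}d", *values))


def read_dense(stream):
    """Read back (shape, values), rejecting unknown magic or versions."""
    magic, version, ndims = HEADER.unpack(stream.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("unrecognised file type")
    if version != FORMAT_VERSION:
        raise ValueError(f"unsupported format version {version}")
    shape = struct.unpack(f"<{ndims}Q", stream.read(8 * ndims))
    count = math.prod(shape)
    values = struct.unpack(f"<{count}d", stream.read(8 * count))
    return shape, list(values)


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
write_dense(buffer, (2, 3), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
buffer.seek(0)
shape, values = read_dense(buffer)
```

Because the byte order and field widths are fixed explicitly, a reader in another language only needs the same layout description, which is the portability property that pickle does not offer.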

@ntjohnson1 ntjohnson1 changed the title from "Binary export import" to "Binary export/import" Nov 14, 2025
@gqcollins
Contributor

I'll try it next week and get back to you. Thanks!

@gqcollins
Contributor

The binary code does help my use case quite a bit. As I mentioned before, the exports I'm doing take ~80s with the .txt format (using the improved numpy version) and imports take ~20s. The numpy binary version takes ~1.9s for exports and ~3.3s for imports. The .mat version takes ~2.5s for exports and ~1.6s for imports. And I can easily port these files to other systems.
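
For reference, the speedups implied by those rough timings can be worked out with a few lines of arithmetic. The numbers are the approximate ("~") values quoted above, so the ratios are equally approximate.

```python
# Approximate timings in seconds, as quoted in the comment above.
timings = {
    "txt": {"export": 80.0, "import": 20.0},
    "numpy binary": {"export": 1.9, "import": 3.3},
    "mat": {"export": 2.5, "import": 1.6},
}

baseline = timings["txt"]

# Speedup of each binary format relative to the .txt format.
speedups = {
    fmt: {op: round(baseline[op] / secs, 1) for op, secs in ops.items()}
    for fmt, ops in timings.items()
    if fmt != "txt"
}

for fmt, ops in speedups.items():
    print(f"{fmt}: export {ops['export']}x, import {ops['import']}x")
```

So exports are roughly 30 to 40 times faster and imports roughly 6 to 12 times faster than the .txt path for this workload.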

You'll have to decide whether it's worth adding these (or at least one of them) to the repo or not. In any case, I appreciate you offering me the code!

@ntjohnson1
Collaborator Author

Great! I'm glad it helped. I think the primary difference you're seeing in time is that mat defaults to compression where numpy doesn't.
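
As a generic illustration of that trade-off (not a measurement of either code path in this PR), the sketch below compresses a raw float64 buffer with the standard-library `zlib` module. Compression costs extra work on write and on read in exchange for a smaller file; whether the .mat path actually compresses depends on how the export is configured. The data here is synthetic and deliberately repetitive.

```python
import struct
import zlib

# Synthetic, highly repetitive data: 10,000 float64 values.
values = [float(i % 7) for i in range(10_000)]
raw = struct.pack(f"<{len(values)}d", *values)

# Compress at zlib level 6, then decompress to confirm losslessness.
compressed = zlib.compress(raw, 6)
restored = zlib.decompress(compressed)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

Real tensor data is usually far less compressible than this example, so the size and time differences in practice can look quite different.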

I'm pretty confident we'll add one, both, or an alternative, but it might not land for a little while; I probably need to touch base with Danny. I still need to check how easy it is to interop with the MATLAB ttb.

@ntjohnson1 ntjohnson1 requested a review from dmdunla February 15, 2026 17:40
@ntjohnson1 ntjohnson1 marked this pull request as ready for review February 15, 2026 17:41

Development

Successfully merging this pull request may close these issues.

Should we support a binary serialization format?

3 participants