
Integrate safetensors for model serialization #2532

Open
@strickvl

Description

Open Source Contributors Welcomed!

Please comment below if you would like to work on this issue!

Contact Details [Optional]

[email protected]

What happened?

ZenML currently uses Python's pickle module (via the cloudpickle library) for model serialization and materialization. Pickle can execute arbitrary code when a file is loaded, which makes artifacts from untrusted sources risky to open. The safetensors library is fast becoming a standard for storing tensors and model weights: it stores raw tensor data behind a JSON header and does not execute code on load. Integrating safetensors into ZenML would give users a more efficient and secure option for model serialization.
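
To illustrate the security concern behind this request, here is a minimal sketch of how unpickling can run code chosen by whoever produced the file. The `Payload` class is hypothetical and the expression is deliberately harmless; a malicious file could name any importable callable instead.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form asks the loader to call a function."""

    def __reduce__(self):
        # pickle stores a reference to the callable plus its arguments,
        # and invokes the callable when the bytes are loaded.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs eval("6 * 7") and returns the result.
result = pickle.loads(blob)
print(type(result).__name__, result)  # int 42
```

A format that only holds a header and raw tensor bytes has no equivalent hook, which is the main safety argument for safetensors.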

Task Description

Implement support for using safetensors as an alternative to pickle for model materialization in ZenML. The task involves the following:

  1. Modify the base materializers to use safetensors for model serialization.
  2. Update the integration-specific materializers (located in src/zenml/integrations) to utilize safetensors where appropriate.
  3. Ensure backward compatibility with existing pickle-based serialized models.
  4. Update relevant documentation and examples to reflect the new safetensors option.
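
To make concrete what a materializer would write instead of a pickle blob, the following is a simplified, standard-library-only sketch of the published safetensors file layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The function names `write_tensors` and `read_tensors` are hypothetical, only 32-bit floats are handled, and a real implementation would presumably call the safetensors library rather than hand-roll the format.

```python
import json
import math
import os
import struct
import tempfile


def write_tensors(path, tensors):
    """Write {name: (shape, flat_float_list)} in a safetensors-style layout (F32 only)."""
    header, chunks, offset = {}, [], 0
    for name, (shape, values) in tensors.items():
        if math.prod(shape) != len(values):
            raise ValueError(f"shape {shape} does not match {len(values)} values for {name!r}")
        raw = struct.pack(f"<{len(values)}f", *values)  # little-endian float32
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            # Offsets are relative to the start of the data section.
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(b"".join(chunks))


def read_tensors(path):
    """Read the file back; only JSON parsing and byte slicing, no code execution."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
        data = f.read()
    out = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form string metadata
            continue
        begin, end = info["data_offsets"]
        count = (end - begin) // 4  # 4 bytes per float32
        out[name] = (info["shape"], list(struct.unpack(f"<{count}f", data[begin:end])))
    return out


with tempfile.TemporaryDirectory() as tmp:
    weights_path = os.path.join(tmp, "model.safetensors")
    write_tensors(weights_path, {"w": ((2, 2), [1.0, 2.5, -3.0, 0.0])})
    print(read_tensors(weights_path))
```

Because the layout holds only tensors, objects that are not tensor containers (arbitrary Python classes, for example) would still need another serialization path, which is one reason to keep pickle available as an option.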

Expected Outcome

  • ZenML will support model serialization using safetensors, providing a faster and more secure alternative to pickle.
  • Users will have the option to choose between pickle and safetensors for model materialization.
  • The integration of safetensors will be seamless, maintaining compatibility with existing ZenML workflows.
  • Documentation and examples will be updated to guide users on how to utilize the safetensors option effectively.

Steps to Implement

  1. Familiarize yourself with the safetensors library and its usage for model serialization.
  2. Modify the base materializers in ZenML to include support for safetensors serialization.
  3. Identify integration-specific materializers in src/zenml/integrations that would benefit from safetensors and update them accordingly.
  4. Implement backward compatibility measures to ensure existing pickle-based serialized models can still be loaded.
  5. Update relevant documentation, including the user guide and API reference, to explain the new safetensors option and provide examples of its usage.
  6. Write unit and integration tests to verify the functionality of safetensors serialization in various scenarios.
  7. Submit a pull request with the implemented changes for review.
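
One possible approach to the backward-compatibility step is to let the loader inspect which file an artifact directory contains and fall back to pickle only for legacy artifacts. The sketch below is an assumption about how this could look, not ZenML's actual materializer API: `load_artifact`, the two filename constants, and the `allow_pickle` flag are all hypothetical, and the safetensors reader is passed in as a callable so the example stays standard-library only.

```python
import os
import pickle
import tempfile

# Hypothetical filenames; the real materializers may use different ones.
SAFETENSORS_FILENAME = "model.safetensors"
LEGACY_FILENAME = "artifact.pkl"


def load_artifact(artifact_dir, load_safetensors, allow_pickle=True):
    """Prefer the safetensors file; fall back to a legacy pickle if permitted."""
    new_path = os.path.join(artifact_dir, SAFETENSORS_FILENAME)
    if os.path.exists(new_path):
        return load_safetensors(new_path)

    legacy_path = os.path.join(artifact_dir, LEGACY_FILENAME)
    if os.path.exists(legacy_path):
        if not allow_pickle:
            raise RuntimeError(
                f"{legacy_path} is a pickle artifact and pickle loading is disabled"
            )
        with open(legacy_path, "rb") as f:
            return pickle.load(f)  # only safe for artifacts from a trusted source

    raise FileNotFoundError(f"no known artifact file in {artifact_dir}")


# Usage: an existing pickle-based artifact still loads through the same entry point.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, LEGACY_FILENAME), "wb") as f:
        pickle.dump({"weights": [0.1, 0.2]}, f)
    legacy_model = load_artifact(tmp, load_safetensors=lambda path: None)
    print(legacy_model)
```

Keeping the pickle fallback behind an explicit flag would let security-sensitive users turn it off once their artifacts have been re-saved in the new format. A round-trip check like the usage snippet is also a natural starting point for the tests in step 6.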

Additional Context

Integrating safetensors into ZenML aligns with the project's goal of providing efficient and secure tools for machine learning workflows. Offering an alternative to pickle lets users choose the serialization format that best fits their security and performance requirements.

Code of Conduct

  • I agree to follow this project's Code of Conduct
