
Sharded weights support #2218


Open: wants to merge 1 commit into base branch master

Conversation

james77777778 (Collaborator) commented Apr 19, 2025

Please see the Colab for an example using Gemma2 2B:
https://colab.research.google.com/drive/1iF_Psb6aEV2pkajT-q9ZBjpoO4RX4-Qa?usp=sharing

This PR adds support for sharded weights in KerasPresetSaver and KerasPresetLoader.
The default max_shard_size is set to 10GB.
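
To illustrate what a size limit like max_shard_size implies, here is a minimal sketch of greedy size-based sharding. It is not the KerasPresetSaver implementation: the function names plan_shards and build_index, the shard file naming, and the example weight sizes are all hypothetical, and the real saver may split weights differently.

```python
# Illustrative sketch only, NOT the KerasHub implementation.
# plan_shards, build_index and the file naming scheme are hypothetical.
from typing import Dict, List

GB = 1024 ** 3


def plan_shards(
    weight_sizes: Dict[str, int], max_shard_size: int = 10 * GB
) -> List[List[str]]:
    """Greedily pack weights, in order, into shards of at most max_shard_size bytes.

    A single weight larger than the limit is placed in a shard of its own.
    """
    shards: List[List[str]] = []
    current_size = 0
    for name, size in weight_sizes.items():
        # Start a new shard if there is none yet, or if adding this weight
        # would push the current non-empty shard over the limit.
        if not shards or (shards[-1] and current_size + size > max_shard_size):
            shards.append([])
            current_size = 0
        shards[-1].append(name)
        current_size += size
    return shards


def build_index(shards: List[List[str]]) -> Dict[str, str]:
    """Map each weight name to the (hypothetical) shard file that holds it."""
    weight_map: Dict[str, str] = {}
    for i, names in enumerate(shards):
        filename = f"model_{i:05d}.weights.h5"  # assumed naming, for illustration
        for name in names:
            weight_map[name] = filename
    return weight_map


if __name__ == "__main__":
    # Made-up sizes in bytes; not measurements of any real model.
    sizes = {"embedding": 6 * GB, "block_0": 3 * GB, "block_1": 3 * GB, "head": 12 * GB}
    shards = plan_shards(sizes)
    for weight, shard_file in build_index(shards).items():
        print(weight, "->", shard_file)
```

A loader can then read the index first and open only the shard files that contain the weights it needs, which is the usual motivation for keeping a name-to-file map next to the shards.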

Kindly ping @divyashreepathihalli @mattdangerw

Note: This feature requires the latest Keras (git+https://github.com/keras-team/keras.git). Ensuring backward compatibility is difficult.

Related to #2084

github-actions bot added the Gemma label (Gemma model specific issues) Apr 19, 2025
james77777778 added the kokoro:force-run label (Runs Tests on GPU) Apr 19, 2025
kokoro-team removed the kokoro:force-run label (Runs Tests on GPU) Apr 19, 2025
james77777778 force-pushed the sharded_weights_support branch from 9c92ba4 to bf9966a Apr 20, 2025 07:08
james77777778 added the kokoro:force-run label (Runs Tests on GPU) Apr 20, 2025
kokoro-team removed the kokoro:force-run label (Runs Tests on GPU) Apr 20, 2025
mattdangerw self-requested a review Apr 20, 2025 22:52
mattdangerw (Member) commented Apr 20, 2025

@james77777778 thanks, will take a look! We don't need to be backwards compatible here; the error message you have, which gives the user an action they can take, is as good as we can do here, I think.
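
For readers unfamiliar with the pattern, an actionable version error of the kind discussed here could look roughly like the sketch below. This is an assumption-laden illustration, not the check in this PR: require_keras_at_least and parse_version are hypothetical helpers, the minimum version is passed in by the caller because the thread does not state one, and the message wording is made up.

```python
# Hypothetical sketch of an actionable version check; not the code in this PR.
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '3.10.0.dev1' -> (3, 10, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def require_keras_at_least(installed: str, required: str) -> None:
    """Raise an ImportError that tells the user how to fix the problem."""
    have = parse_version(installed)
    need = parse_version(required)
    # Pad with zeros so that '3.10' and '3.10.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    if have < need:
        raise ImportError(
            f"Sharded weights need Keras >= {required}, but found {installed}. "
            "Upgrade with: pip install -U git+https://github.com/keras-team/keras.git"
        )
```

In real code the installed version would typically come from keras.__version__; it is a plain argument here so the sketch runs without Keras installed.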
