This repository was archived by the owner on Jan 23, 2026. It is now read-only.
Llama 3.2 1B Instruct on TPU v4, bumping transformers to 4.45.2 #109
Open
artus-LYTiQ wants to merge 8 commits into
Note that the two new tests are manual tests, not pytest tests. The RoPE implementation is unvalidated: we just pray and are happy that it still generates tokens XD
tengomucho (Collaborator) reviewed on Oct 21, 2024 and left a comment:
We will wait for the other contribution to be merged before merging this one, but thank you for contributing! Can you confirm the models you have tested with your changes?
        next_token_id = torch.argmax(next_logits, dim=-1)[:, None].int()
        return next_token_id


    def _test_distributed_model_generation(model_id, max_new_tokens=20):
Collaborator
For tests, please create one test similar to tests/test_distributed_model.py (or modify the existing one). To launch it, you can use pytest: python -m pytest -sv /path/to/test_mytest.py::test_my_test_function.
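As a rough illustration of the request above, a manual generation check can be turned into a function that pytest discovers by its test_ prefix and that asserts on the result instead of printing it. This is only a sketch: fake_next_logits and greedy_generate are hypothetical stand-ins written for this example, not functions from this repository, and a real test would load the model the way tests/test_distributed_model.py does.

```python
# Hypothetical stand-in for the distributed model: returns fake logits
# that always favor the token following the last one (modulo vocab size).
def fake_next_logits(token_ids, vocab_size=8):
    target = (token_ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


def greedy_generate(prompt_ids, max_new_tokens=20, next_logits_fn=fake_next_logits):
    """Greedy decoding loop: repeatedly append the argmax of the next logits."""
    token_ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits_fn(token_ids)
        # Same idea as torch.argmax(next_logits, dim=-1), in plain Python.
        token_ids.append(max(range(len(logits)), key=logits.__getitem__))
    # Return only the newly generated tokens.
    return token_ids[len(prompt_ids):]


# pytest collects this function because its name starts with "test_".
def test_greedy_generation_length_and_content():
    new_tokens = greedy_generate([3], max_new_tokens=5)
    assert len(new_tokens) == 5
    assert new_tokens == [4, 5, 6, 7, 0]
```

Saved in a file such as the test_mytest.py placeholder from the comment, this would be launched with the python -m pytest -sv command quoted above.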
Added a llama3 rope_type implementation and changed the default model to Llama 3.2 1B Instruct.
Created an adaptation of the HF transformers llama3 rope_type implementation in modeling_llama.py.
Updated the dependency to the current transformers library version, 4.45.2.
Added more logging to distributed_model.py, as the TPU v4-8 VMs love to hang at random places when running this code.
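
For readers unfamiliar with the llama3 rope_type, the frequency scaling it refers to can be sketched in plain Python as below. This is not the code from this PR or from modeling_llama.py; it is a minimal sketch of the scaling rule as generally described for Llama 3.1-style RoPE. The function name is hypothetical, and the default parameter values are assumptions that should be checked against the rope_scaling entry in the model's config.json.

```python
import math


def llama3_inv_freq(head_dim=64, base=500000.0, factor=32.0,
                    low_freq_factor=1.0, high_freq_factor=4.0,
                    original_max_position=8192):
    """Return llama3-style scaled inverse RoPE frequencies (assumed defaults)."""
    # Standard RoPE inverse frequencies, one per pair of head dimensions.
    inv_freq = [1.0 / (base ** (i / head_dim)) for i in range(0, head_dim, 2)]

    low_freq_wavelen = original_max_position / low_freq_factor
    high_freq_wavelen = original_max_position / high_freq_factor

    scaled = []
    for freq in inv_freq:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: left unchanged.
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: divided by the scaling factor.
            scaled.append(freq / factor)
        else:
            # Medium band: interpolate smoothly between the two regimes.
            smooth = (original_max_position / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            scaled.append((1 - smooth) * freq / factor + smooth * freq)
    return scaled
```

Comparing the output of such a reference against the inv_freq values produced by the adapted implementation would be one way to validate it beyond checking that tokens are still generated.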
Fixes #80
Before submitting