
[Distributed] Did not find tokenizer at {tokenizer_path} #1146

Open
@kwen2501

Description

🐛 Describe the bug

Running `dist_run.py` under `torchrun` fails while building the chat tokenizer: `TokenizerArgs.from_args` raises a `RuntimeError` because no `tokenizer.model` is found in the model cache directory for Meta-Llama-3-8B-Instruct.

```
torchrun --nproc-per-node 8 dist_run.py
```

```
known configs: ['13B', '30B', '34B', '70B', '7B', 'CodeLlama-7b-Python-hf', 'Mistral-7B', 'stories110M', 'stories15M', 'stories42M', 'Meta-Llama-3-70B', 'Meta-Llama-3-8B', 'Meta-Llama-3.1-70B-Tune', 'Meta-Llama-3.1-70B', 'Meta-Llama-3.1-8B-Tune', 'Meta-Llama-3.1-8B']
09-14 15:41:32.092 - dist_run:137 - Chat Model Config: TransformerArgs(block_size=2048, vocab_size=32000, n_layers=32, n_heads=32, dim=4096, hidden_dim=11008, n_local_heads=32, head_dim=128, rope_base=10000, norm_eps=1e-05, multiple_of=256, ffn_dim_multiplier=None, use_tiktoken=False, max_seq_length=8192, rope_scaling=None, n_stages=1, stage_idx=0)
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/kw2501/local/torchchat/dist_run.py", line 277, in <module>
[rank0]:     main()
[rank0]:   File "/home/kw2501/local/torchchat/dist_run.py", line 139, in main
[rank0]:     tokenizer = _build_chat_tokenizer()
[rank0]:   File "/home/kw2501/local/torchchat/dist_run.py", line 94, in _build_chat_tokenizer
[rank0]:     tokenizer_args = TokenizerArgs.from_args(args)
[rank0]:   File "/home/kw2501/local/torchchat/torchchat/cli/builder.py", line 269, in from_args
[rank0]:     raise RuntimeError(f"did not find tokenizer at {tokenizer_path}")
[rank0]: RuntimeError: did not find tokenizer at /home/kw2501/.torchchat/model-cache/meta-llama/Meta-Llama-3-8B-Instruct/tokenizer.model
```
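To narrow down whether this is a missing download or a filename mismatch, a small standalone check of the cache directory can help. This is only a diagnostic sketch, not torchchat code: the directory path is taken from the traceback above, and the candidate filenames (`tokenizer.model`, `tokenizer.json`) are an assumption about what different checkpoint formats may ship.

```python
from pathlib import Path
from typing import Optional

# Cache directory from the traceback above; adjust for your setup.
MODEL_DIR = Path.home() / ".torchchat/model-cache/meta-llama/Meta-Llama-3-8B-Instruct"

def find_tokenizer(model_dir: Path) -> Optional[Path]:
    """Return the first tokenizer file present in model_dir, or None.

    The candidate names are assumptions: SentencePiece checkpoints usually
    ship tokenizer.model, while other distributions may use tokenizer.json.
    """
    for name in ("tokenizer.model", "tokenizer.json"):
        candidate = model_dir / name
        if candidate.is_file():
            return candidate
    return None

if __name__ == "__main__":
    tok = find_tokenizer(MODEL_DIR)
    if tok is None:
        print(f"did not find tokenizer under {MODEL_DIR}")
    else:
        print(f"found tokenizer: {tok}")
```

If this reports nothing, the model download likely never fetched a tokenizer; if it finds `tokenizer.json` but not `tokenizer.model`, the failure is a filename/format mismatch in the lookup rather than a missing download.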

Cc: @lessw2020

Versions

main branch


Labels: Distributed (Issues related to all things distributed)
