How to convert a Llama-2 Hugging Face checkpoint to the Megatron format #658

What is the proper way to convert a Llama-2 checkpoint from the Hugging Face format to the Megatron format? I followed the instructions in docs/llama2.md and ran the script below, but got the error shown after it. I don't understand why transformer_engine.py in core/transformer/custom_layers does `import transformer_engine as te` at line 6, which looks like the module importing itself, or why the imported module has no attribute `pytorch`.

MODEL_SIZE=7B
TP=1
TOP=/mnt
MEGATRON_DIR=$TOP/Megatron/Megatron-LM
HF_FORMAT_DIR=$TOP/LLaMa/llama_workarea/hf_llama_models/$MODEL_SIZE
MEGATRON_FORMAT_DIR=$TOP/Megatron/workspace.Megatron-LM/weights/$MODEL_SIZE
TOKENIZER_MODEL=$TOP/LLaMa/llama_workarea/hf_llama_models/7B/$MODEL_SIZE/tokenizer.model

export PYTHONPATH="$PWD:$PWD/tools/checkpoint"
echo $PYTHONPATH

python3 tools/checkpoint/util.py \
    --model-type GPT \
    --loader llama2_hf \
    --saver megatron \
    --target-tensor-parallel-size ${TP} \
    --load-dir ${HF_FORMAT_DIR} \
    --save-dir ${MEGATRON_FORMAT_DIR} \
    --tokenizer-model ${TOKENIZER_MODEL}

Output:
Loaded loader_llama2_hf as the loader.
Loaded saver_megatron as the saver.
Starting saver...
Starting loader...
Zarr-based strategies will not be registered because of missing packages
Zarr-based strategies will not be registered because of missing packages
  File "/home/aae14935wb/Share/Megatron/Megatron-LM/megatron/core/models/gpt/gpt_model.py", line 15, in <module>
    from megatron.core.transformer.transformer_block import TransformerBlock
  File "/home/aae14935wb/Share/Megatron/Megatron-LM/megatron/core/transformer/transformer_block.py", line 13, in <module>
    from megatron.core.transformer.custom_layers.transformer_engine import TENorm
  File "/home/aae14935wb/Share/Megatron/Megatron-LM/megatron/core/transformer/custom_layers/transformer_engine.py", line 71, in <module>
    class TELinear(te.pytorch.Linear):
AttributeError: module 'transformer_engine' has no attribute 'pytorch'
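
On the "imports itself" question: in Python 3, `import transformer_engine` is an absolute import, so it is resolved against sys.path rather than against the sibling file in custom_layers, unless that directory is itself on sys.path. A rough way to see which `transformer_engine` is actually being picked up, and whether its `pytorch` submodule can be imported at all, is sketched below. `diagnose_import` is a hypothetical helper written for this question, not part of Megatron-LM or Transformer Engine, and the idea that the package is shadowed or only partially installed is an assumption that the log above does not confirm.

```python
import importlib
import importlib.util


def diagnose_import(package: str, submodule: str) -> dict:
    """Report where `package` resolves from and whether `package.submodule` imports.

    Hypothetical diagnostic helper; it only uses the standard library.
    """
    report = {
        "package": package,
        "found": False,
        "origin": None,          # file the package is loaded from, if any
        "search_locations": [],  # directories searched for submodules
        "is_namespace": False,   # True if there is no __init__.py behind the name
        "submodule_ok": False,
        "error": None,
    }

    spec = importlib.util.find_spec(package)
    if spec is None:
        report["error"] = f"{package} is not importable from sys.path"
        return report

    report["found"] = True
    report["origin"] = spec.origin
    report["search_locations"] = list(spec.submodule_search_locations or [])
    # A namespace package has no real origin; this can happen when an empty
    # directory with the same name shadows the installed package.
    report["is_namespace"] = spec.origin in (None, "namespace")

    try:
        # Importing the submodule explicitly surfaces the underlying error
        # instead of the later AttributeError on attribute access.
        importlib.import_module(f"{package}.{submodule}")
        report["submodule_ok"] = True
    except Exception as exc:
        report["error"] = f"{type(exc).__name__}: {exc}"

    return report


if __name__ == "__main__":
    for key, value in diagnose_import("transformer_engine", "pytorch").items():
        print(f"{key}: {value}")
```

Running this from the Megatron-LM directory with the same PYTHONPATH as the conversion script should show whether `origin` points into site-packages or somewhere unexpected, and `error` should contain the real reason if `transformer_engine.pytorch` fails to import.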

