
GGML to GGUF FAIL Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84) #11976

Open
@chokoon123

Description

I'm trying to convert this GGML model to GGUF, but I get the error below. Thank you.

python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm,hfft', verbose=False)
WARNING:ggml-to-gguf:=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
INFO:ggml-to-gguf:* Scanning GGML input file
INFO:ggml-to-gguf:* File format: GGJTv3 with ftype MOSTLY_Q2_K
INFO:ggml-to-gguf:* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=5120, n_mult=256, n_head=40, n_layer=40, n_rot=128, n_ff=13824, ftype=MOSTLY_Q2_K>
WARNING:ggml-to-gguf:
=== WARNING === Special tokens may not be converted correctly. Use --model-metadata-dir if possible === WARNING ===

INFO:ggml-to-gguf:- Guessed n_kv_head = 5 based on GQA 8
INFO:ggml-to-gguf:* Preparing to save GGUF file
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:ggml-to-gguf:* Adding model parameters and KV items
INFO:ggml-to-gguf:* Adding 32000 vocab item(s)
INFO:ggml-to-gguf:* Adding 363 tensor(s)
Traceback (most recent call last):
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 450, in <module>
    main()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 445, in main
    converter.save()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 238, in save
    self.add_tensors(gguf_writer)
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 353, in add_tensors
    gguf_writer.add_tensor(
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 381, in add_tensor
    self.add_tensor_info(name, shape, tensor.dtype, tensor.nbytes, raw_dtype=raw_dtype)
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 354, in add_tensor_info
    tensor_shape = quant_shape_from_byte_shape(tensor_shape, raw_dtype)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\quants.py", line 24, in quant_shape_from_byte_shape
    raise ValueError(f"Quantized tensor bytes per row ({shape[-1]}) is not a multiple of {quant_type.name} type size ({type_size})")
ValueError: Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)
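For context, here is a minimal sketch of the divisibility check the traceback points at, assuming Q2_K's standard layout of 256 elements packed into an 84-byte block (the type size the error itself reports). The constant names and helper below are illustrative, not the actual gguf-py code:

import typing

# Assumed Q2_K layout: 256 elements per block, 84 bytes per block.
Q2_K_BLOCK_SIZE = 256   # elements per Q2_K block
Q2_K_TYPE_SIZE = 84     # bytes per Q2_K block

def quant_shape_from_byte_shape_sketch(byte_shape: typing.Tuple[int, ...]) -> typing.Tuple[int, ...]:
    # The last dimension of a quantized tensor's byte shape is bytes per row;
    # it must be a whole number of 84-byte blocks to be valid Q2_K data.
    bytes_per_row = byte_shape[-1]
    if bytes_per_row % Q2_K_TYPE_SIZE != 0:
        raise ValueError(
            f"Quantized tensor bytes per row ({bytes_per_row}) is not a "
            f"multiple of Q2_K type size ({Q2_K_TYPE_SIZE})"
        )
    # Each 84-byte block expands back to 256 elements.
    return (*byte_shape[:-1], bytes_per_row // Q2_K_TYPE_SIZE * Q2_K_BLOCK_SIZE)

# The failing tensor reports 5120 bytes per row: 5120 % 84 == 80,
# so the check fails before any element shape can be recovered.
print(5120 % 84)  # -> 80

Since 5120 is not a multiple of 84, the converter cannot map the row's byte count back to an element count, which suggests that tensor in the GGMLv3 file is not actually laid out as Q2_K blocks.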
