Description
I'm trying to convert this GGML model to GGUF, but I got the error below. Thank you.
```shell
python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
```
```
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm,hfft', verbose=False)
WARNING:ggml-to-gguf:=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
INFO:ggml-to-gguf:* Scanning GGML input file
INFO:ggml-to-gguf:* File format: GGJTv3 with ftype MOSTLY_Q2_K
INFO:ggml-to-gguf:* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=5120, n_mult=256, n_head=40, n_layer=40, n_rot=128, n_ff=13824, ftype=MOSTLY_Q2_K>
WARNING:ggml-to-gguf:
=== WARNING === Special tokens may not be converted correctly. Use --model-metadata-dir if possible === WARNING ===
INFO:ggml-to-gguf:- Guessed n_kv_head = 5 based on GQA 8
INFO:ggml-to-gguf:* Preparing to save GGUF file
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:ggml-to-gguf:* Adding model parameters and KV items
INFO:ggml-to-gguf:* Adding 32000 vocab item(s)
INFO:ggml-to-gguf:* Adding 363 tensor(s)
Traceback (most recent call last):
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 450, in <module>
    main()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 445, in main
    converter.save()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 238, in save
    self.add_tensors(gguf_writer)
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 353, in add_tensors
    gguf_writer.add_tensor(
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 381, in add_tensor
    self.add_tensor_info(name, shape, tensor.dtype, tensor.nbytes, raw_dtype=raw_dtype)
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 354, in add_tensor_info
    tensor_shape = quant_shape_from_byte_shape(tensor_shape, raw_dtype)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\quants.py", line 24, in quant_shape_from_byte_shape
    raise ValueError(f"Quantized tensor bytes per row ({shape[-1]}) is not a multiple of {quant_type.name} type size ({type_size})")
ValueError: Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)
```
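For reference, a minimal sketch of the two arithmetic facts visible in the log, using only the numbers it reports. This assumes (per the log text, not verified against the script source) that the `n_kv_head` guess is `n_head // gqa`, and that the GGUF writer requires each quantized row's byte count to be an exact multiple of the quant type's block byte size (84 for Q2_K, as stated in the error):

```python
# Assumed reconstruction of the checks behind the log messages;
# all constants are taken directly from the log and error text.

n_head = 40                 # from the GGML hyperparameters
gqa = 8                     # from the Namespace config
n_kv_head = n_head // gqa   # matches "Guessed n_kv_head = 5"
print(n_kv_head)            # 5

bytes_per_row = 5120        # value reported in the ValueError
q2_k_type_size = 84         # Q2_K block byte size, per the error text
remainder = bytes_per_row % q2_k_type_size
print(remainder)            # 80 -> nonzero, so the ValueError is raised
```

Note that `gqa=8` in the config is the value used for the 70B LLaMA-2 variant; whether that setting is appropriate for a 13B model may be worth checking, since it directly drives the shape computation that fails here.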