- A string, the *model id* of a pretrained model like `THUDM/chatglm-6b`. [TODO]
- A path to a *directory* cloned from a repo, like `../chatglm-6b`.
--type TYPE type(`str`, *optional*):
The pretrained LLM model type.
--tokenizer_path TOKENIZER_PATH
tokenizer path; default is `None`, meaning the `--path` value is used.
--lora_path LORA_PATH
lora path; default is `None`, meaning LoRA is not applied.
--gptq_path GPTQ_PATH
gptq path; default is `None`, meaning GPTQ is not applied.
--dst_path DST_PATH export the onnx/mnn model to this path; default is `./model`.
--verbose Whether or not to print verbose output.
--test TEST test model inference with query `TEST`.
--export EXPORT export model to an onnx/mnn model.
--onnx_slim Whether or not to use onnx-slim.
--quant_bit QUANT_BIT
mnn quant bit, 4 or 8, default is 4.
--quant_block QUANT_BLOCK
mnn quant block; 0 means channel-wise, default is 128.
--lm_quant_bit LM_QUANT_BIT
mnn lm_head quant bit, 4 or 8, default is `quant_bit`.
--mnnconvert MNNCONVERT
local mnnconvert path; if invalid, pymnn is used.
--ppl Whether or not to get all logits of input tokens.
--awq Whether or not to use awq quant.
115
+
--sym Whether or not to using symmetric quant (without zeropoint), defualt is False.
116
+
--seperate_embed For models where lm_head and embedding share weights, whether or not to separate the embedding to avoid quantizing it; default is False. If True, the embedding weights are exported separately to `embeddingbf16.bin`.
--lora_split Whether or not to export the LoRA split; default is False.
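Several options above default to `None` and fall back to another option at runtime: `--tokenizer_path` falls back to `--path`, and `--lm_quant_bit` falls back to `--quant_bit`. A minimal sketch of that fallback logic, assuming an argparse-style interface (the parser below is illustrative only, not the exporter's actual code):

```python
import argparse

# Illustrative subset of the documented options and their defaults.
parser = argparse.ArgumentParser(description="llm export options (sketch)")
parser.add_argument("--path", type=str, required=True)
parser.add_argument("--tokenizer_path", type=str, default=None)
parser.add_argument("--quant_bit", type=int, default=4, choices=[4, 8])
parser.add_argument("--quant_block", type=int, default=128)
parser.add_argument("--lm_quant_bit", type=int, default=None, choices=[4, 8])

def resolve(args):
    # `None` means "inherit": tokenizer path from --path,
    # lm_head quant bit from --quant_bit.
    if args.tokenizer_path is None:
        args.tokenizer_path = args.path
    if args.lm_quant_bit is None:
        args.lm_quant_bit = args.quant_bit
    return args

# Example: only --path given, everything else inherits or uses its default.
args = resolve(parser.parse_args(["--path", "../chatglm-6b"]))
print(args.tokenizer_path, args.quant_bit, args.lm_quant_bit, args.quant_block)
```

Passing `--lm_quant_bit 8` while leaving `--quant_bit` at 4 would quantize only `lm_head` at 8 bits, which is a common way to trade a little size for output quality.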