Hi, thanks for your great projects!
I tried to build a TensorRT engine for the embedding model (multilingual-e5-large) with the following command: convert_model -m intfloat/multilingual-e5-large --backend tensorrt --task embedding --seq-len 16 512 512 --name intfloat-multilingual-e5-large --device cuda --load-external-data --verbose, but I encountered the following error.
Traceback (most recent call last):
File "/usr/local/bin/convert_model", line 8, in <module>
sys.exit(entrypoint())
File "/usr/local/lib/python3.8/dist-packages/transformer_deploy/convert.py", line 357, in entrypoint
main(commands=args)
File "/usr/local/lib/python3.8/dist-packages/transformer_deploy/convert.py", line 179, in main
convert_to_onnx(
File "/usr/local/lib/python3.8/dist-packages/transformer_deploy/backends/pytorch_utils.py", line 158, in convert_to_onnx
onnx.save(onnx_model, output_path)
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 203, in save_model
s = _serialize(proto)
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 71, in _serialize
result = proto.SerializeToString()
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2235540927
In transformer-deploy, if the proto size exceeds 2GB, save_as_external_data should be true.
From transformer-deploy/src/transformer_deploy/backends/onnx_utils.py (lines 40 to 48 at 6b88e24):

save_external_data: bool = to_save.ByteSize() > 2 * 1024**3
filename = Path(model_path).name
onnx.save_model(
    proto=to_save,
    f=model_path,
    save_as_external_data=save_external_data,
    all_tensors_to_one_file=True,
    location=filename + ".data",
)
According to the onnx API docs, we should use onnx.checker.check_model with a file path rather than a loaded model.
import onnx
onnx.checker.check_model("path/to/the/model.onnx")
# onnx.checker.check_model(loaded_onnx_model) will fail if given >2GB model
The other idea is that if load_external_data is true, then save_as_external_data should also be true.
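A minimal sketch of that proposed condition (the helper name is hypothetical, not the actual transformer-deploy code):

```python
def should_save_external(byte_size: int, load_external_data: bool) -> bool:
    """Hypothetical helper: decide whether to write tensors as external data.

    Proposed rule: save externally when the model was loaded with external
    data, OR when the serialized proto would exceed the 2GB protobuf limit.
    """
    return load_external_data or byte_size > 2 * 1024**3

# The failing model from the traceback (2235540927 bytes) would trigger the
# size condition even without --load-external-data:
print(should_save_external(2235540927, False))  # True: over the 2GB cutoff
print(should_save_external(1_000_000, True))    # True: loaded externally
print(should_save_external(1_000_000, False))   # False: small, kept inline
```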
In the onnx code, they set MAXIMUM_PROTOBUF = 2000000000. I do not understand why this error occurred.
https://github.com/onnx/onnx/blob/238f2b9a41b28e6db0086c8a1be655d517c94dd1/onnx/checker.py#L45-L47
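For what it's worth, the two cutoffs are not identical: transformer-deploy's 2 * 1024**3 is roughly 147 MB above onnx's MAXIMUM_PROTOBUF, so a model falling in that gap would pass transformer-deploy's check yet exceed the onnx limit. (The model in this traceback exceeds both, so the gap alone does not explain this error.)

```python
TD_THRESHOLD = 2 * 1024**3              # transformer-deploy cutoff
MAXIMUM_PROTOBUF = 2000000000           # onnx checker limit
model_size = 2235540927                 # size reported in the traceback

print(TD_THRESHOLD)                     # 2147483648
print(TD_THRESHOLD - MAXIMUM_PROTOBUF)  # 147483648 bytes of gap
print(model_size > TD_THRESHOLD)        # True: this model exceeds both limits
```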
In onnx, they use sys.getsizeof instead of ByteSize. This is a difference between transformer-deploy and onnx.
https://github.com/onnx/onnx/blob/238f2b9a41b28e6db0086c8a1be655d517c94dd1/onnx/checker.py#L175-L178
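One way to see why the two measures can disagree: sys.getsizeof reports the in-memory size of the exact Python object it is handed, while ByteSize() computes the protobuf wire size without serializing. On a serialized bytes string the two are close, but sys.getsizeof on a wrapper object says nothing about the payload nested inside it. A plain-Python illustration (no onnx required; Wrapper is just a stand-in for a message object):

```python
import sys

class Wrapper:
    """Stand-in for a message-like object holding a large payload."""
    def __init__(self, payload: bytes):
        self.payload = payload

payload = bytes(10_000_000)        # ~10 MB of raw data
wrapped = Wrapper(payload)

# bytes objects carry their payload, so getsizeof reflects the full size:
print(sys.getsizeof(payload) >= 10_000_000)  # True
# but getsizeof on the wrapper only measures the wrapper object itself:
print(sys.getsizeof(wrapped) < 1_000)        # True
```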