Description
Describe the bug
I'm seeing a UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 1024: ordinal not in range(128). The code in tf_utils.py (https://github.com/onnx/tensorflow-onnx/blob/main/tf2onnx/tf_utils.py#L57) seems to treat this as expected, but the fallback to np.vectorize(lambda x: x.decode('UTF-8')) also fails with a similar error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 1024: invalid start byte
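To illustrate why the fallback cannot succeed either, here is a minimal standalone sketch. It does not use tf2onnx or the actual model. The `raw` value and the `try_decode` helper are hypothetical stand-ins, assuming the offending tensor element holds binary data with a 0xff byte at offset 1024.

```python
# Hypothetical stand-in for a string tensor element that holds binary data:
# 1024 ASCII bytes followed by 0xff, matching the offset in the traceback.
raw = b"a" * 1024 + b"\xff" + b"tail"


def try_decode(value, encoding):
    """Return (decoded_text, None) on success or (None, error) on failure."""
    try:
        return value.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, err


# Both codecs reject 0xff: it is outside ASCII and never valid in UTF-8.
for encoding in ("ascii", "utf-8"):
    text, err = try_decode(raw, encoding)
    status = "ok" if err is None else f"{err.reason} at byte {err.start}"
    print(encoding, "->", status)

# Lossy alternatives that never raise (shown for diagnosis only).
replaced = raw.decode("utf-8", errors="replace")  # 0xff becomes U+FFFD
latin1 = raw.decode("latin-1")  # every byte maps to exactly one code point
```

If this matches what is happening, the value being decoded is probably not text at all, so switching between the ASCII and UTF-8 codecs would not help.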
Urgency
N/A
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 18.04*): GCP VM
- TensorFlow Version: 2.9.00
- Python version: 3.7.12
- ONNX version (if applicable, e.g. 1.11*): onnx-1.14.1 (installed via pip install git+https://github.com/onnx/tensorflow-onnx)
- ONNXRuntime version (if applicable, e.g. 1.11*): onnxruntime-1.14.1
To Reproduce
The model is a custom DCN v2 model built using libraries from the TensorFlow ecosystem. This includes TFRS (recommender systems), TFR (ranking), TF Text, TF IO, and TF Transform. The model is saved using tf.saved_model.save(..).
Additional context
- I found that a whole set of ops in the model don't seem to be present in the supported list of ops. But based on the troubleshooting guide, the error I'm seeing here looks different from the one mentioned in the guide. Is it possible that the decode errors are due to the unsupported ops?
- Missing ops:
- Bucketize
- AssignVariableOp
- InitializeTableFromTextFileV2
- LookupTableImportV2
- MergeV2Checkpoints
- ReadVariableOp
- ResourceGather
- RestoreV2
- SaveV2
- ShardedFilename
- StatefulPartitionedCall
- StaticRegexFullMatch
- VarHandleOp
- TFText>WhitespaceTokenizeWithOffsetsV2
- VarIsInitializedOp
- Supported via ai.onnx.contrib:
- StaticRegexReplace
- StringJoin
- StringSplitV2
- StringToHashBucketFast
- Not sure if this is relevant, but I previously found that the conversion doesn't proceed unless I explicitly import tensorflow_text. To do this, I have a custom script (shared below) which invokes tf2onnx.convert.main().
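import tensorflow as tf
# Imported for their side effects only: the conversion did not proceed
# unless tensorflow_text had been imported first.
import tensorflow_text as tf_text
import tensorflow_transform as tft
import tf2onnx.convert
print("Done importing custom tf modules...")
print("Invoking tf2onnx.convert.main()...")
# main() reads its options from the command-line arguments (sys.argv).
tf2onnx.convert.main()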
I've tried switching my NumPy version to 1.20, as suggested in one of the GitHub issues, but this doesn't fix the error either. Would appreciate your help with this!