convert oidv2-resnet_v1_101.ckpt model to onnx #1995

Open
@Etty-Cohen

Describe the bug
I am trying to convert Google's oidv2-resnet_v1_101.ckpt model, which was trained in TensorFlow, to ONNX.
I tried two approaches:
1. Direct conversion to ONNX:
python3 -m tf2onnx.convert --checkpoint oidv2-resnet_v1_101.ckpt.meta --output model.onnx --inputs input_values:0 --outputs multi_predictions:0
This failed with a UnicodeDecodeError.
I worked around that by modifying tf_utils.py inside the tf2onnx package (as described here:), but the conversion still fails with other errors:

python3 -m tf2onnx.convert --checkpoint oidv2-resnet_v1_101.ckpt.meta --output model.onnx --inputs input_values:0 --outputs multi_predictions:0
/usr/lib/python3.8/runpy.py:127: RuntimeWarning: 'tf2onnx.convert' found in sys.modules after import of package 'tf2onnx', but prior to execution of 'tf2onnx.convert'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
INFO:tensorflow:Restoring parameters from oidv2-resnet_v1_101.ckpt
2022-07-11 15:01:15,320 - INFO - Restoring parameters from oidv2-resnet_v1_101.ckpt
WARNING:tensorflow:From /home/etty/.local/lib/python3.8/site-packages/tf2onnx/tf_loader.py:305: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
2022-07-11 15:01:16,486 - WARNING - From /home/etty/.local/lib/python3.8/site-packages/tf2onnx/tf_loader.py:305: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2022-07-11 15:01:16,486 - WARNING - From /usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
2022-07-11 15:01:18,935 - INFO - Using tensorflow=2.8.0, onnx=1.6.0, tf2onnx=1.11.1/1915fb
2022-07-11 15:01:18,935 - INFO - Using opset <onnx, 11>
2022-07-11 15:01:24,901 - WARNING - Shapes of Merge map/while/decoded_image/cond_jpeg/cond_png/cond_gif/Merge have different ranks: 3, 4
2022-07-11 15:01:27,105 - WARNING - Cannot infer shape for map/while/Squeeze: map/while/Squeeze:0
2022-07-11 15:01:27,105 - WARNING - Cannot infer shape for map/while/ExpandDims: map/while/ExpandDims:0
2022-07-11 15:01:27,105 - WARNING - Cannot infer shape for map/while/Squeeze_1: map/while/Squeeze_1:0
2022-07-11 15:01:27,105 - WARNING - Cannot infer shape for SpatialSqueeze: SpatialSqueeze:0
2022-07-11 15:01:27,105 - WARNING - Cannot infer shape for multi_predictions: multi_predictions:0
2022-07-11 15:01:31,210 - INFO - Computed 0 values for constant folding
2022-07-11 15:01:37,205 - ERROR - rewriter <function rewrite_cond at 0x7fe6427165e0>: exception the rank of outputs map/while/decoded_image/cond_jpeg/cond_png/cond_gif/DecodeGif:0 and map/while/decoded_image/cond_jpeg/cond_png/cond_gif/DecodeBmp:0 mismatch: 4, 3
2022-07-11 15:01:37,218 - INFO - ['Traceback (most recent call last):\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 352, in run_rewriters\n    ops = func(g, g.get_nodes())\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/rewriter/cond_rewriter.py", line 332, in rewrite_cond\n    return CondRewriter(g).rewrite()\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/rewriter/cond_rewriter.py", line 54, in rewrite\n    return self.run()\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/rewriter/cond_rewriter.py", line 92, in run\n    if_node = self._create_if_node(cond_context)\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/rewriter/cond_rewriter.py", line 139, in _create_if_node\n    output_shapes, output_dtypes = self._get_output_shape_dtype(cond_context)\n', '  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/rewriter/cond_rewriter.py", line 117, in _get_output_shape_dtype\n    raise RuntimeError(\n', 'RuntimeError: the rank of outputs map/while/decoded_image/cond_jpeg/cond_png/cond_gif/DecodeGif:0 and map/while/decoded_image/cond_jpeg/cond_png/cond_gif/DecodeBmp:0 mismatch: 4, 3\n']
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/convert.py", line 696, in <module>
    main()
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/convert.py", line 266, in main
    model_proto, _ = _convert_common(
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/convert.py", line 161, in _convert_common
    g = process_tf_graph(tf_graph, const_node_values=const_node_values,
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 439, in process_tf_graph
    g = process_graphs(main_g, subgraphs, custom_op_handlers, inputs_as_nchw, continue_on_error, custom_rewriter,
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 491, in process_graphs
    g = process_parsed_graph(main_g, custom_op_handlers, inputs_as_nchw, continue_on_error, custom_rewriter,
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 594, in process_parsed_graph
    topological_sort(g, continue_on_error)
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 336, in topological_sort
    g.topological_sort(ops)
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/graph.py", line 1055, in topological_sort
    utils.make_sure(j is not None, "Cannot find node with output %r in graph %r", inp, self.graph_name)
  File "/home/etty/.local/lib/python3.8/site-packages/tf2onnx/utils.py", line 264, in make_sure
    raise ValueError("make_sure failure: " + error_msg % args)
ValueError: make_sure failure: Cannot find node with output 'map/while/decoded_image/cond_jpeg/cond_png/cond_gif/Merge:0' in graph 'tf2onnx__3'
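
The errors above all point at the image-decoding branch inside `map/while` (`DecodeGif` yields rank 4, `DecodeBmp` rank 3), so it may help to see which decode ops the graph contains before picking conversion inputs. The following is only a rough sketch: the helper names `find_decode_ops` and `list_ops_from_meta` are made up for illustration, and the set of op types is an assumption based on the node names in the log, not something tf2onnx defines.

```python
# Sketch: locate image-decoding ops in a TF1 meta graph.
# Helper names and DECODE_OP_TYPES are illustrative assumptions.

DECODE_OP_TYPES = {"DecodeJpeg", "DecodePng", "DecodeGif", "DecodeBmp", "DecodeImage"}


def find_decode_ops(ops):
    """Return the (name, type) pairs whose op type is an image decoder."""
    return [(name, op_type) for name, op_type in ops if op_type in DECODE_OP_TYPES]


def list_ops_from_meta(meta_path):
    """Import a .meta file into a fresh graph and return (name, type) per op."""
    # Imported lazily so find_decode_ops can be used without TensorFlow.
    import tensorflow.compat.v1 as tf

    graph = tf.Graph()
    with graph.as_default():
        tf.train.import_meta_graph(meta_path, clear_devices=True)
    return [(op.name, op.type) for op in graph.get_operations()]


# Example usage (requires TensorFlow and the checkpoint's .meta file):
# for name, op_type in find_decode_ops(list_ops_from_meta("oidv2-resnet_v1_101.ckpt.meta")):
#     print(op_type, name)
```

If decode ops show up only under `map/while/decoded_image`, one option might be to pass a tensor downstream of the decoding step as `--inputs` instead of `input_values:0`; which tensor that would be is not known from the log.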

2. Convert the checkpoint to a SavedModel (pb format) first, then convert that to ONNX.
I converted this way:

import tensorflow.compat.v1 as tf

checkpoint_path = 'oidv2-resnet_v1_101.ckpt'
g = tf.Graph()
with g.as_default():
    tf_config = tf.ConfigProto()
    tf_config.gpu_options.allow_growth = True
    tf_config.gpu_options.per_process_gpu_memory_fraction = 0.25
    with tf.Session(config=tf_config) as sess:
        saver = tf.train.import_meta_graph(checkpoint_path + '.meta')
        print(type(saver))
        # restore() returns None; it loads the checkpoint variables into sess
        saver.restore(sess, checkpoint_path)

        # Export checkpoint to SavedModel
        saved_model_path = "oidv2-resnet_v1_101_sm"
        builder = tf.saved_model.builder.SavedModelBuilder(saved_model_path)
        builder.add_meta_graph_and_variables(
            sess,
            [tf.saved_model.TRAINING, tf.saved_model.SERVING],
            main_op=tf.tables_initializer(),
            strip_default_attrs=True)
        builder.save()

Then I tried a standard TF-to-ONNX conversion on the SavedModel and got this error:

RuntimeError: MetaGraphDef associated with tags 'serve' could not be found in SavedModel, with available tags '[{'serve', 'train'}]'. To inspect available tag-sets in the SavedModel, please use the SavedModel CLI: saved_model_cli.
I checked with: saved_model_cli show --dir oidv2-resnet_v1_101_sm
and got:

The given SavedModel contains the following tag-sets:
'serve, train'
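
The tag error looks consistent with how the TensorFlow loader selects a MetaGraphDef: the requested tags must equal a stored tag set exactly, so asking for `serve` does not match a graph saved under `{'serve', 'train'}`. Below is a minimal sketch under that assumption. `tag_set_matches` and `export_serving_only` are hypothetical helper names; the export mirrors the script above but tags the graph with `tf.saved_model.SERVING` only.

```python
# Sketch: why 'serve' does not match {'serve', 'train'}, and a re-export
# tagged only for serving. Helper names are illustrative assumptions.


def tag_set_matches(requested_tags, available_tag_sets):
    """True if some stored tag set equals the requested tags exactly."""
    requested = set(requested_tags)
    return any(set(tag_set) == requested for tag_set in available_tag_sets)


def export_serving_only(checkpoint_path, export_dir):
    """Restore a TF1 checkpoint and save it as a SavedModel tagged 'serve' only."""
    # Imported lazily so tag_set_matches can be used without TensorFlow.
    import tensorflow.compat.v1 as tf

    graph = tf.Graph()
    with graph.as_default(), tf.Session(graph=graph) as sess:
        saver = tf.train.import_meta_graph(checkpoint_path + ".meta", clear_devices=True)
        saver.restore(sess, checkpoint_path)

        builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
        builder.add_meta_graph_and_variables(
            sess,
            [tf.saved_model.SERVING],  # single tag, so a lookup for 'serve' can succeed
            main_op=tf.tables_initializer(),
            strip_default_attrs=True,
        )
        builder.save()


# Example usage (requires TensorFlow and the checkpoint files; export_dir must not exist yet):
# export_serving_only("oidv2-resnet_v1_101.ckpt", "oidv2-resnet_v1_101_sm_serve")
```

This addresses only the tag lookup. The exported graph still has no signature and still contains the image-decoding ops from attempt 1, so the later conversion steps may well fail in the same way.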

Urgency
none

Additional context
I saw the answer in #920, but I am not sure I understood it correctly. Is it possible that there is no way to do this conversion?

System information

  • OS Platform and Distribution Linux Ubuntu 18.04
  • TensorFlow version: 2.8.0
  • Python version: 3.8.10 64 bit
