Cannot load checkpoints of ResNet-RS models #10968

Open
@sebp

Description

Prerequisites

  • I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • I checked to make sure that this issue has not been filed already.

1. The entire URL of the file you are using

https://gist.github.com/sebp/b86102744724d866c6639705a45f9a80

2. Describe the bug

Some checkpoints of ResNet-RS models listed here fail to load.

Traceback (most recent call last):
  File "load.py", line 81, in <module>
    fails()
  File "load.py", line 75, in fails
    task.initialize(model)
  File "/home/jovyan/work/venv/lib/python3.8/site-packages/official/vision/tasks/image_classification.py", line 83, in initialize
    status.expect_partial().assert_existing_objects_matched()
  File "/home/jovyan/work/venv/lib/python3.8/site-packages/tensorflow/python/checkpoint/checkpoint.py", line 1068, in assert_existing_objects_matched
    return self.assert_consumed()
  File "/home/jovyan/work/venv/lib/python3.8/site-packages/tensorflow/python/checkpoint/checkpoint.py", line 1051, in assert_consumed
    raise AssertionError(
AssertionError: Some objects had attributes which were not restored: 
    <tf.Variable 'bottleneck_block/conv2d/kernel:0' shape=(1, 1, 64, 256) dtype=float32, numpy=
array([[[[ 0.1922505 ,  0.15006573, -0.06281938, ..., -0.11153815,
          -0.09118956, -0.1233723 ],
         [-0.05223957, -0.16301669, -0.16288121, ..., -0.06663691,
          -0.25606734,  0.02308108],
         [-0.24011418,  0.07856355,  0.07309896, ..., -0.04328562,
           0.15814184,  0.12843679],
         ...,
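
Because the assertion message embeds the full tensor values, it is hard to see which variables are affected. The following is a minimal sketch for pulling just the names, shapes, and dtypes out of such a message. The helper name `unrestored_variables` and the regular expression are my own and assume the `<tf.Variable '...' shape=(...) dtype=...` repr format seen in the traceback above; they are not part of TensorFlow or the Model Garden.

```python
import re

# Matches the header of each variable repr in the error message, e.g.
# <tf.Variable 'bottleneck_block/conv2d/kernel:0' shape=(1, 1, 64, 256) dtype=float32, ...
# (format assumed from the traceback; hypothetical helper, not a TensorFlow API)
_VAR_RE = re.compile(r"<tf\.Variable '([^']+)' shape=\(([^)]*)\) dtype=(\w+)")


def unrestored_variables(message):
    """Return a (name, shape, dtype) tuple for every variable repr in `message`."""
    found = []
    for name, dims, dtype in _VAR_RE.findall(message):
        # "1, 1, 64, 256" -> (1, 1, 64, 256); "64," -> (64,); "" -> ()
        shape = tuple(int(d) for d in dims.split(",") if d.strip())
        found.append((name, shape, dtype))
    return found


# Sample text copied from the start of the assertion message above.
sample = (
    "Some objects had attributes which were not restored: \n"
    "    <tf.Variable 'bottleneck_block/conv2d/kernel:0' "
    "shape=(1, 1, 64, 256) dtype=float32, numpy=\narray([[[[ 0.1922505 , ..."
)

for name, shape, dtype in unrestored_variables(sample):
    print(name, shape, dtype)
```

In the reproduction script this could be used by wrapping `task.initialize(model)` in `try`/`except AssertionError as err` and passing `str(err)` to the helper.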

3. Steps to reproduce

import yaml
import tensorflow_models as tfm

# Start from the stock ResNet-RS ImageNet experiment config.
exp_config = tfm.vision.configs.image_classification.image_classification_imagenet_resnetrs()

# Apply the overrides from the YAML file in the checkpoint directory.
with open('./resnet-rs-152-i256/imagenet_resnetrs152_i256.yaml') as file:
    override_params = yaml.full_load(file)

exp_config.override(override_params, is_strict=False)
exp_config.task.freeze_backbone = True
exp_config.task.init_checkpoint = "./resnet-rs-152-i256/model.ckpt"
exp_config.task.init_checkpoint_modules = "backbone"
exp_config.task.model.num_classes = 1

task: tfm.core.base_task.Task = tfm.core.task_factory.get_task(exp_config.task)
model = task.build_model()
task.initialize(model)  # raises the AssertionError from section 2

Full code at https://gist.github.com/sebp/b86102744724d866c6639705a45f9a80

4. Expected behavior

The checkpoint should load without error, as it does for resnet-rs-101-i192.

5. Additional context

Loading does not fail for all models; the smaller models seem to be okay.
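
One way to narrow this down might be to compare how many variables of each shape the built backbone has against what the checkpoint stores. The sketch below is only a rough diagnostic. The helper names are hypothetical, the `"backbone"` key prefix and the `.OPTIMIZER_SLOT` filter are assumptions about how the checkpoint was saved, and a shape surplus only hints at a depth or width mismatch rather than naming the exact variable. `tf.train.list_variables` is the only TensorFlow call used.

```python
from collections import Counter


def shape_histogram(shapes):
    """Count how many variables have each shape."""
    return Counter(tuple(s) for s in shapes)


def surplus_shapes(model_shapes, ckpt_shapes):
    """Shapes that occur more often in the model than in the checkpoint."""
    # Counter subtraction keeps only the positive differences.
    return dict(shape_histogram(model_shapes) - shape_histogram(ckpt_shapes))


def checkpoint_shapes(ckpt_path, prefix="backbone"):
    """Shapes of checkpoint entries under `prefix`, skipping optimizer slots.

    The prefix and the slot-key filter are assumptions about the checkpoint layout.
    """
    import tensorflow as tf  # imported lazily so the helpers above stay pure Python

    return [
        shape
        for name, shape in tf.train.list_variables(ckpt_path)
        if name.startswith(prefix) and ".OPTIMIZER_SLOT" not in name
    ]


# Toy data: the model has one more (1, 1, 64, 256) kernel than the checkpoint.
toy_model = [(1, 1, 64, 256), (1, 1, 64, 256), (64,)]
toy_ckpt = [[1, 1, 64, 256], [64]]
print(surplus_shapes(toy_model, toy_ckpt))
```

With the reproduction script, the model side could be collected as `[v.shape.as_list() for v in model.backbone.variables]` and the checkpoint side as `checkpoint_shapes("./resnet-rs-152-i256/model.ckpt")`. A non-empty result for the failing checkpoints and an empty one for resnet-rs-101-i192 would suggest that the built architecture and the stored one differ.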

6. System information

  • Ubuntu 22.10
  • Python version: 3.8.13
  • No GPU
  • TensorFlow version (installed packages):
tensorboard                   2.12.0
tensorboard-data-server       0.7.0
tensorboard-plugin-wit        1.8.1
tensorflow                    2.12.0
tensorflow-addons             0.19.0
tensorflow-datasets           4.8.3
tensorflow-estimator          2.12.0
tensorflow-hub                0.13.0
tensorflow-io-gcs-filesystem  0.32.0
tensorflow-metadata           1.12.0
tensorflow-model-optimization 0.7.3
tensorflow-text               2.12.0
tf-models-official            2.12.0
tf-slim                       1.1.0

Labels

models:official (models that come under official repository)
type:bug (Bug in the code)
