Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [x] I checked to make sure that this issue has not been filed already.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/official/projects/movinet
2. Describe the bug
A minimal code example will follow the bug description.
- Initialized a pretrained MoViNet A0 Stream model from hub:
https://tfhub.dev/tensorflow/movinet/a0/stream/kinetics-600/classification/
- Initialized a pretrained MoViNet A0 Stream model from checkpoint:
https://storage.googleapis.com/tf_model_garden/vision/movinet/movinet_a0_stream.tar.gz
The logit outputs differ considerably between the two models. I have validated that the model weights in the hub and the checkpoint are the same.
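For anyone who wants to repeat the weight check, here is a minimal sketch of one way to do it. It assumes the variables of both models have already been exported to dictionaries of NumPy arrays keyed by variable name (for example with something like `{v.name: v.numpy() for v in model.variables}`), and that the names line up between the hub model and the checkpoint model, which may not hold in practice. `diff_weights` is a hypothetical helper, not part of the Model Garden.

```python
import numpy as np


def diff_weights(weights_a, weights_b, atol=1e-6):
    """Return the names whose arrays are missing, differently shaped, or not close.

    Hypothetical helper: both arguments are dicts mapping a variable name
    to a NumPy array.
    """
    mismatched = []
    for name in sorted(set(weights_a) | set(weights_b)):
        a = weights_a.get(name)
        b = weights_b.get(name)
        if a is None or b is None:
            # The variable exists in only one of the two models.
            mismatched.append(name)
        elif a.shape != b.shape or not np.allclose(a, b, atol=atol):
            mismatched.append(name)
    return mismatched
```

An empty return list would indicate that every variable present in either dictionary matches within `atol`.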
3. Steps to reproduce
Before running the unit test, download and extract the checkpoint:
- wget https://storage.googleapis.com/tf_model_garden/vision/movinet/movinet_a0_stream.tar.gz -O movinet_a0_stream_.tar.gz -q
- tar -xvf movinet_a0_stream_.tar.gz
import unittest
from typing import Tuple, Dict
import tensorflow_hub as hub
import tensorflow as tf
from six.moves import urllib
from io import BytesIO
from PIL import Image
from official.projects.movinet.modeling import movinet
from official.projects.movinet.modeling import movinet_model
import numpy as np
model_id = 'a0'
num_classes = 600
H = W = 172
C = 3
T = 1
bs = 1
dummy_input = tf.random.normal(shape=[bs, T, H, W, 3])
def create_hub_model(model_id) -> Tuple[tf.keras.Model, Dict]:
    hub_url = f"https://tfhub.dev/tensorflow/movinet/{model_id}/stream/kinetics-600/classification/"
    model_hub = hub.KerasLayer(hub_url)
    init_states_fn = model_hub.resolved_object.signatures['init_states']
    init_states = init_states_fn(tf.shape(dummy_input))
    return model_hub, init_states
def create_local(model_id) -> Tuple[movinet.Movinet, Dict]:
    backbone = movinet.Movinet(
        model_id=model_id,
        causal=True,
        conv_type='2plus1d',
        se_type='2plus3d',
        activation='hard_swish',
        gating_activation='hard_sigmoid',
        use_positional_encoding=False,
        use_external_states=True,
    )
    backbone.trainable = False
    model = movinet_model.MovinetClassifier(
        backbone,
        num_classes=600,
        output_states=True
    )
    checkpoint_dir = f'movinet_{model_id}_stream'
    checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)
    checkpoint = tf.train.Checkpoint(model=model)
    status = checkpoint.restore(checkpoint_path).expect_partial()
    status.assert_existing_objects_matched()
    init_states_local = model.init_states(tf.shape(dummy_input))
    return model, init_states_local
class MyTestCase(unittest.TestCase):
    def test_hub_equal_source(self):
        model_hub, states_hub = create_hub_model(model_id)
        image_url = 'https://upload.wikimedia.org/wikipedia/commons/8/84/Ski_Famille_-_Family_Ski_Holidays.jpg'
        with urllib.request.urlopen(image_url) as f:
            # PIL's resize expects (width, height); H == W here, so the result is unchanged.
            image = Image.open(BytesIO(f.read())).resize((W, H))
        X = tf.reshape(np.array(image), [1, 1, H, W, 3])
        X = tf.cast(X, tf.float32) / 255
        y_hub, _ = model_hub({**states_hub, 'image': X})
        print(y_hub[0][0:5])
        model_local, states_local = create_local(model_id)
        y_local, _ = model_local({**states_local, 'image': X})
        print(y_local[0][0:5])
        tf.debugging.assert_near(y_local, y_hub, atol=1e-3)


if __name__ == '__main__':
    unittest.main()
4. Expected behavior
The output logits of the hub model and the checkpoint model should be close. However, they differ considerably.
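To make "close" versus "differ considerably" easier to quantify, a small NumPy sketch like the following could be used alongside the assertion. It assumes the two logit tensors have been converted to arrays (for example with `.numpy()`) and have the same shape; `compare_logits` is a hypothetical helper introduced here for illustration.

```python
import numpy as np


def compare_logits(y_a, y_b, k=5):
    """Summarize how far apart two logit arrays of identical shape are."""
    y_a = np.asarray(y_a, dtype=np.float64)
    y_b = np.asarray(y_b, dtype=np.float64)
    # Largest element-wise absolute difference across all logits.
    max_abs_diff = float(np.max(np.abs(y_a - y_b)))
    # Indices of the k largest logits per example, highest first.
    top_a = np.argsort(y_a, axis=-1)[..., ::-1][..., :k]
    top_b = np.argsort(y_b, axis=-1)[..., ::-1][..., :k]
    # Whether both models predict the same top-1 class for every example.
    same_top1 = bool(np.all(top_a[..., 0] == top_b[..., 0]))
    return {
        "max_abs_diff": max_abs_diff,
        "same_top1": same_top1,
        "top_k_a": top_a,
        "top_k_b": top_b,
    }
```

Reporting the maximum absolute difference together with top-1 agreement helps distinguish small numerical drift from a genuinely different prediction.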
5. Additional context
Dependencies for the test:
numpy
Pillow==11.1.0
six==1.17.0
tensorflow[and-cuda]==2.18.1
tensorflow_hub==0.16.1
tf_models_official==2.18.00
6. System information
- OS Platform and Distribution - Ubuntu 22.04.5 LTS
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 2.18.1
- Python version: 3.10.12
- CUDA/cuDNN version: cuda_12.8.r12.8
- GPU model and memory: NVIDIA GeForce RTX 4090, 24GB
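The version information above can be gathered in one step with a short standard-library script. This is only a sketch: `collect_versions` is a hypothetical helper, and the distribution names passed to it are assumed to match how the packages were installed in this environment.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages):
    """Return the Python/platform info plus the installed version of each package."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    names = ["tensorflow", "tensorflow-hub", "tf-models-official",
             "numpy", "Pillow", "six"]
    for key, value in collect_versions(names).items():
        print(f"{key}: {value}")
```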