Movinet hub and source output differ

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [x] I checked to make sure that this issue has not been filed already.

## 1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/official/projects/movinet

## 2. Describe the bug

A minimal code example follows the bug description.

1. Initialized a pretrained MoViNet A0 Stream model from TF Hub:
   https://tfhub.dev/tensorflow/movinet/a0/stream/kinetics-600/classification/
2. Initialized a pretrained MoViNet A0 Stream model from the checkpoint:
   https://storage.googleapis.com/tf_model_garden/vision/movinet/movinet_a0_stream.tar.gz

The logit outputs differ considerably between the two models. I have validated that the model weights in the hub and the checkpoint are the same.
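A weight check of the kind mentioned above could look roughly like the following. This is only a sketch and not the script used for this report: `compare_weight_sets` is a hypothetical helper, and it assumes the weights of both models have already been exported into `{name: array}` mappings whose names line up, which may need manual renaming in practice.

```python
import numpy as np


def compare_weight_sets(weights_a, weights_b, atol=0.0):
    """Compare two {name: array} mappings and report every disagreement.

    Hypothetical helper: building the two mappings (for example from each
    model's list of variables) is left to the caller and is an assumption.
    """
    names_a, names_b = set(weights_a), set(weights_b)
    report = {
        "only_in_a": sorted(names_a - names_b),
        "only_in_b": sorted(names_b - names_a),
        "shape_mismatch": [],
        "value_mismatch": [],
    }
    for name in sorted(names_a & names_b):
        a = np.asarray(weights_a[name])
        b = np.asarray(weights_b[name])
        if a.shape != b.shape:
            report["shape_mismatch"].append(name)
            continue
        # Largest element-wise deviation for this variable.
        max_abs_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
        if max_abs_diff > atol:
            report["value_mismatch"].append((name, max_abs_diff))
    return report
```

If all four entries of the returned report are empty, the two weight sets agree within `atol`, and any remaining output difference would have to come from something other than the stored weights.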


## 3. Steps to reproduce
Before running the unittest, download and extract the checkpoint:

1. `wget https://storage.googleapis.com/tf_model_garden/vision/movinet/movinet_a0_stream.tar.gz -O movinet_a0_stream_.tar.gz -q`
2. `tar -xvf movinet_a0_stream_.tar.gz`
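For environments without `wget` or `tar`, the two steps above can be approximated in Python with the standard library. This is a minimal sketch: `fetch_and_extract` is a hypothetical helper name, and unlike the `wget -O` call it saves the archive under the file name taken from the URL.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_and_extract(url, dest_dir="."):
    """Download a .tar.gz archive and unpack it into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the local archive after the last path component of the URL.
    archive_path = dest / Path(urlparse(url).path).name
    urllib.request.urlretrieve(url, str(archive_path))
    with tarfile.open(archive_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


# Example call (performs a network download, so it is left commented out):
# fetch_and_extract(
#     "https://storage.googleapis.com/tf_model_garden/vision/movinet/movinet_a0_stream.tar.gz"
# )
```

After extraction, the checkpoint directory should match the `movinet_{model_id}_stream` path that the test passes to `tf.train.latest_checkpoint`.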


```python
import unittest
from typing import Tuple, Dict
import tensorflow_hub as hub
import tensorflow as tf
from six.moves import urllib
from io import BytesIO
from PIL import Image
from official.projects.movinet.modeling import movinet
from official.projects.movinet.modeling import movinet_model
import numpy as np

model_id = 'a0'
num_classes = 600
H = W = 172
C = 3
T = 1
bs = 1
dummy_input = tf.random.normal(shape=[bs, T, H, W, C])


def create_hub_model(model_id) -> Tuple[tf.keras.Model, Dict]:
    hub_url = f"https://tfhub.dev/tensorflow/movinet/{model_id}/stream/kinetics-600/classification/"
    model_hub = hub.KerasLayer(hub_url)
    init_states_fn = model_hub.resolved_object.signatures['init_states']
    init_states = init_states_fn(tf.shape(dummy_input))
    return model_hub, init_states


def create_local(model_id) -> Tuple[movinet.Movinet, Dict]:
    backbone = movinet.Movinet(
        model_id=model_id,
        causal=True,
        conv_type='2plus1d',
        se_type='2plus3d',
        activation='hard_swish',
        gating_activation='hard_sigmoid',
        use_positional_encoding=False,
        use_external_states=True,
    )
    backbone.trainable = False
    model = movinet_model.MovinetClassifier(
        backbone,
        num_classes=num_classes,
        output_states=True
    )
    checkpoint_dir = f'movinet_{model_id}_stream'
    checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)
    checkpoint = tf.train.Checkpoint(model=model)
    status = checkpoint.restore(checkpoint_path).expect_partial()
    status.assert_existing_objects_matched()
    init_states_local = model.init_states(tf.shape(dummy_input))
    return model, init_states_local


class MyTestCase(unittest.TestCase):

    def test_hub_equal_source(self):
        model_hub, states_hub = create_hub_model(model_id)
        image_url = 'https://upload.wikimedia.org/wikipedia/commons/8/84/Ski_Famille_-_Family_Ski_Holidays.jpg'
        with urllib.request.urlopen(image_url) as f:
            image = Image.open(BytesIO(f.read())).resize((W, H))  # PIL expects (width, height)
        X = tf.reshape(np.array(image), [1, 1, H, W, 3])
        X = tf.cast(X, tf.float32) / 255
        y_hub, _ = model_hub({**states_hub, 'image': X})
        print(y_hub[0][0:5])
        model_local, states_local = create_local(model_id)
        y_local, _ = model_local({**states_local, 'image': X})
        print(y_local[0][0:5])
        tf.debugging.assert_near(y_local, y_hub, atol=1e-3)


if __name__ == '__main__':
    unittest.main()
```

## 4. Expected behavior

The output logits of the hub model and the checkpoint model should be close. However, they differ considerably.
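To put a number on "differ considerably", the two logit tensors can be summarized before asserting on them. The helper below is a sketch under stated assumptions: `summarize_logit_gap` is a hypothetical name, and it flattens its inputs, so it assumes a single example per call (batch size 1, as in the test above).

```python
import numpy as np


def summarize_logit_gap(logits_a, logits_b, k=5):
    """Summarize how far two logit vectors are from each other."""
    # Flattening assumes one example per call (batch size 1).
    a = np.asarray(logits_a, dtype=np.float64).reshape(-1)
    b = np.asarray(logits_b, dtype=np.float64).reshape(-1)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    abs_diff = np.abs(a - b)
    # Indices of the k largest logits, highest first.
    top_a = np.argsort(a)[::-1][:k]
    top_b = np.argsort(b)[::-1][:k]
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "top1_match": bool(top_a[0] == top_b[0]),
        "topk_overlap": len(set(top_a.tolist()) & set(top_b.tolist())),
    }
```

In the test, this could be called as `summarize_logit_gap(y_hub.numpy(), y_local.numpy())`. A large `max_abs_diff` together with `top1_match` being `False` would indicate a real behavioral difference and not just floating-point noise.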

## 5. Additional context

Dependencies for the test:

- `numpy`
- `Pillow==11.1.0`
- `six==1.17.0`
- `tensorflow[and-cuda]==2.18.1`
- `tensorflow_hub==0.16.1`
- `tf_models_official==2.18.00`

## 6. System information

- OS Platform and Distribution - Ubuntu 22.04.5 LTS
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 2.18.1
- Python version: 3.10.12
- CUDA/cuDNN version: cuda_12.8.r12.8
- GPU model and memory: NVIDIA GeForce RTX 4090, 24GB
