
Confusing result when I load pretrained weights for mobilenetv3smallmini #67

Open
@linkrain-a

Description


First I loaded the pretrained weights with is_training=True; the output float is the model's probability for the input's class:

root@gpu2:/workspace/lx_code_hub/classification# CUDA_VISIBLE_DEVICES=1 python3 test.py 
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
1.0
2020-06-12 09:50:02.483822: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-12 09:50:02.644052: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x56f6a90 executing computations on platform CUDA. Devices:
2020-06-12 09:50:02.644168: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Tesla P4, Compute Capability 6.1
2020-06-12 09:50:02.649138: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2099960000 Hz
2020-06-12 09:50:02.651987: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x57e6590 executing computations on platform Host. Devices:
2020-06-12 09:50:02.652247: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-06-12 09:50:02.653086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Tesla P4 major: 6 minor: 1 memoryClockRate(GHz): 1.1135
pciBusID: 0000:82:00.0
totalMemory: 7.43GiB freeMemory: 5.31GiB
2020-06-12 09:50:02.653177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-06-12 09:50:02.654691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-12 09:50:02.654734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-06-12 09:50:02.654787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-06-12 09:50:02.655474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5138 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:82:00.0, compute capability: 6.1)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
epoch 0
2020-06-12 09:50:08.470524: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
0.0007623361
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7fd7b85f4c18>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 738, in __del__
TypeError: 'NoneType' object is not callable

Then I loaded the pretrained weights with is_training=False:

root@gpu2:/workspace/lx_code_hub/classification# CUDA_VISIBLE_DEVICES=1 python3 test.py 
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
1.0
2020-06-12 09:50:32.524288: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-12 09:50:32.671292: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x54be810 executing computations on platform CUDA. Devices:
2020-06-12 09:50:32.671360: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Tesla P4, Compute Capability 6.1
2020-06-12 09:50:32.676036: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2099960000 Hz
2020-06-12 09:50:32.678501: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55ae310 executing computations on platform Host. Devices:
2020-06-12 09:50:32.678571: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2020-06-12 09:50:32.679732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: Tesla P4 major: 6 minor: 1 memoryClockRate(GHz): 1.1135
pciBusID: 0000:82:00.0
totalMemory: 7.43GiB freeMemory: 5.31GiB
2020-06-12 09:50:32.679949: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-06-12 09:50:32.681698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-12 09:50:32.681776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2020-06-12 09:50:32.681803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2020-06-12 09:50:32.682511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5138 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:82:00.0, compute capability: 6.1)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/losses_impl.py:209: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
epoch 0
2020-06-12 09:50:38.539550: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
0.67551297

Why are they different?
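
My understanding is that is_training mainly switches batch normalization (and dropout, where the architecture has it) between per-batch statistics and the stored moving averages, so the same pretrained weights can give very different outputs. A minimal, standalone TF 1.x sketch of that effect (not the model above, just tf.layers.batch_normalization with its moving statistics left at their initial values):

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
training = tf.placeholder(tf.bool, [])
# The moving mean/variance stay at their initial 0/1 here, which stands in for
# statistics learned on a different data distribution (e.g. ImageNet training data).
y = tf.layers.batch_normalization(x, training=training)

data = np.random.normal(loc=5.0, scale=2.0, size=(8, 4)).astype(np.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out_train = sess.run(y, {x: data, training: True})   # uses batch statistics -> roughly zero mean
    out_infer = sess.run(y, {x: data, training: False})  # uses moving averages -> roughly unchanged
    print(out_train.mean(), out_infer.mean())

As far as I can tell, is_training=False is the setting that matches how the pretrained weights are meant to be evaluated; is_training=True normalizes the test batch with its own statistics instead.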

Code:

import numpy as np
import tensorflow as tf
import tensornets as nets  # assumption: `nets` here is tensornets

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
outputs = tf.placeholder(tf.float32, [None, 1000])
model = nets.MobileNet100v3smallmini(inputs, is_training=False, classes=1000)
#model = nets.MobileNet100v3smallmini(inputs)
saver = tf.train.Saver(tf.global_variables(), max_to_keep=300)


assert isinstance(model, tf.Tensor)
path = ['cat.jpg'] * 10
label = np.zeros((1000,), dtype=np.float32)
label[283] = 1.
print(label[283])
labels = np.row_stack((label,) * 10)

img = nets.utils.load_img(path, target_size=(224, 224))
assert img.shape == (10, 224, 224, 3)
with tf.Session() as sess:
    #sess.run(tf.global_variables_initializer())
    sess.run(model.pretrained())  # equivalent to nets.pretrained(model)
    #initialize_uninitialized(sess)
    #exit(0)
    #with tf.name_scope('lx_train'):
    loss = tf.losses.softmax_cross_entropy(outputs, model.logits)
    train = tf.train.AdamOptimizer(learning_rate=1e-5).minimize(loss)
    #var_list = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='lx_train')
    #print(var_list)
    #exit(0)
    #sess.run(tf.variables_initializer(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='lx_train')))
    for i in range(50):
        print('epoch {}'.format(i))
        img = model.preprocess(img)  # equivalent to img = nets.preprocess(model, img)
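        # note: img is reassigned here, so preprocessing is applied again on every loop iteration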
        #preds = sess.run(model, {inputs: img})
        #print(preds[0][283])
        #exit(0)
        if i % 5 == 0:
            saver.save(sess, save_path='ckpt/model.ckpt', global_step=i)
        preds = sess.run(model, {inputs: img})
        #_, total_loss, preds = sess.run([train, loss, model], {inputs: img, outputs: labels})
        if i % 5 == 0:
            print(preds[0][283])
            exit(0)
            #print(total_loss, preds[0][283])
#print(nets.utils.decode_predictions(preds, top=2))
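
For comparison, here is a minimal inference-only version of the same calls (graph built before the session, preprocessing applied exactly once, is_training=False). This is only a sketch and assumes the `nets` module above is tensornets:

import tensorflow as tf
import tensornets as nets  # assumption: the `nets` used above is tensornets

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
model = nets.MobileNet100v3smallmini(inputs, is_training=False, classes=1000)

img = nets.utils.load_img(['cat.jpg'] * 10, target_size=(224, 224))
img = model.preprocess(img)  # applied once, outside any loop

with tf.Session() as sess:
    sess.run(model.pretrained())  # equivalent to nets.pretrained(model)
    preds = sess.run(model, {inputs: img})
    print(preds[0][283])                                # probability for class index 283, as above
    print(nets.utils.decode_predictions(preds, top=2))  # human-readable top-2 labels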

Thank you for helping me.
