## Description
TensorNets provides seamless integration with regular TensorFlow APIs. You can define any model under `tf.variable_scope` and `tf.name_scope` to couple the model with your established scripts. This document shows basic examples for `tf.variable_scope`, `tf.name_scope`, and weight sharing. First, import the two libraries:
```python
import tensorflow as tf
import tensornets as nets
```
Let's get started with basic TensorFlow APIs. You can manage the prefixes of variable and tensor names with `tf.variable_scope` and `tf.name_scope`. The difference is that a `tf.Variable` is affected by only `tf.variable_scope`, while a `tf.Tensor` is affected by both `tf.variable_scope` and `tf.name_scope`. Also, a second `tf.get_variable('w', [1])` will try to create the same variable again and raise an error under `tf.variable_scope(reuse=None)` (the default), or return the existing variable otherwise (`reuse=True`, `reuse=tf.AUTO_REUSE`). Here is an example:
```python
with tf.name_scope('foo'):
    with tf.variable_scope('goo'):
        with tf.name_scope('hoo'):
            # `tf.Variable` will be affected by only `tf.variable_scope`.
            w = tf.get_variable('w', [1])
            assert w.name == 'goo/w:0'
            # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
            s = tf.constant(-1.0)
            y = s * w
            assert s.name == 'foo/goo/hoo/Const:0'
            assert y.name == 'foo/goo/hoo/mul:0'
            # `tf.get_variable` will try to create the same variable again
            # if `tf.variable_scope(reuse=None)` (default).
            try:
                w2 = tf.get_variable('w', [1])
            except ValueError as e:
                print(e)  # Variable goo/w already exists, disallowed.
```
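For completeness, here is a minimal sketch of the other branch: re-entering the scope with `reuse=True` returns the existing variable instead of raising an error.

```python
# With `reuse=True`, `tf.get_variable` returns the existing `goo/w`.
with tf.variable_scope('goo', reuse=True):
    w3 = tf.get_variable('w', [1])
assert w3 == w  # the same `tf.Variable` as before
assert w3.name == 'goo/w:0'
```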
The principle extends directly to TensorNets. The weights returned by `get_weights` are `tf.Variable`, and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`. Thus, the weights are affected by only `tf.variable_scope`, while the outputs and the middles are affected by both `tf.variable_scope` and `tf.name_scope`. Of course, a model function can't be called a second time without `reuse=True` or `reuse=tf.AUTO_REUSE`, because it will try to create the same variables again.
```python
with tf.name_scope('xoo'):
    with tf.variable_scope('yoo'):
        with tf.name_scope('zoo'):
            # The weights returned by `get_weights` are `tf.Variable`,
            # and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`.
            x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
            model1 = nets.ResNet50(x1)
            # `tf.Variable` will be affected by only `tf.variable_scope`.
            assert model1.get_weights()[-1].name == 'yoo/resnet50/logits/biases:0'
            # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
            assert model1.name == 'xoo/yoo/zoo/resnet50/probs:0'
            assert model1.get_outputs()[-1].name == 'xoo/yoo/zoo/resnet50/probs:0'
            assert model1.get_middles()[-1].name == 'xoo/yoo/zoo/resnet50/conv5/block3/out:0'
            # `tf.get_variable` will try to create the same variable again
            # if `tf.variable_scope(reuse=None)` (default).
            try:
                x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
                model2 = nets.ResNet50(x2)
            except ValueError as e:
                print(e)  # Variable yoo/resnet50/conv1/conv/weights already exists, disallowed.
```
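As a sketch of one fix (the Summary below covers this in detail), passing `reuse=tf.AUTO_REUSE` to the model function should let a second call share the variables created by the first, assuming we re-enter the same `yoo` variable scope:

```python
# Hypothetical follow-up to the failing call above: with `reuse=tf.AUTO_REUSE`,
# the second `ResNet50` call reuses the existing `yoo/resnet50` variables.
with tf.variable_scope('yoo'):
    x3 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x3')
    model3 = nets.ResNet50(x3, reuse=tf.AUTO_REUSE)
for (a, b) in zip(model1.get_weights(), model3.get_weights()):
    assert a == b
```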
And we can easily implement weight sharing by using `tf.variable_scope(reuse=tf.AUTO_REUSE)`. An example is as follows:
```python
with tf.variable_scope('boo', reuse=tf.AUTO_REUSE):
    w1 = tf.get_variable('w', [1])
    w2 = tf.get_variable('w', [1])
    assert w1 == w2
    assert w1.name == 'boo/w:0'
    s = tf.constant(-1.0)
    y1 = s * w1
    y2 = s * w2
    assert y1 != y2
    assert y1.name == 'boo/mul:0'
    assert y2.name == 'boo/mul_1:0'
```
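Note that while `w1` and `w2` point to the same variable, `y1` and `y2` are distinct ops that compute the same value. A minimal sketch to check this numerically, assuming a TF 1.x session:

```python
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    v1, v2 = sess.run([y1, y2])
    assert v1 == v2  # distinct tensors, identical values
```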
TensorNets can also be easily integrated with `tf.variable_scope`:
```python
with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
    x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
    x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
    model1 = nets.ResNet50(x1)
    model2 = nets.ResNet50(x2)
    for (a, b) in zip(model1.get_weights(), model2.get_weights()):
        assert a == b
    assert model1.get_weights()[-1].name == 'koo/resnet50/logits/biases:0'
    assert model1 != model2
    assert model1.name == 'koo/resnet50/probs:0'
    assert model2.name == 'koo/resnet50_1/probs:0'
```
## Summary
To summarize, there are two patterns to implement weight sharing:
1. with `tf.variable_scope`:

```python
with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
    model1 = nets.ResNet50(x1)
    model2 = nets.ResNet50(x2)
```

2. without `tf.variable_scope`:

```python
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)
```

3. equivalent to 2, with `functools.partial`:

```python
import functools
resnet = functools.partial(nets.ResNet50, reuse=tf.AUTO_REUSE)
model1 = resnet(x1)
model2 = resnet(x2)
```
And I recommend the following pattern, which is used for deploying multiple clones in `tf.slim`:
```python
with tf.name_scope('clone0'):
    model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
with tf.name_scope('clone1'):
    model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)
for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b
assert model1.name == 'clone0/resnet50/probs:0'
assert model2.name == 'clone1/resnet50/probs:0'
```
Without `tf.name_scope`, each `tf.Tensor` will be named automatically with a suffix (`resnet50`, `resnet50_1`, ...), and I think such tensor names may be difficult to manage.
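For reference, here is a minimal sketch of that suffixed naming, assuming a fresh graph (the exact names are assumptions following the `koo` example above):

```python
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)
# Tensor names are suffixed automatically without an explicit `tf.name_scope`.
assert model1.name == 'resnet50/probs:0'
assert model2.name == 'resnet50_1/probs:0'
```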