Notes for variable_scope, name_scope, and weight sharing #43

@taehoonlee

TensorNets provides seamless integration with regular TensorFlow APIs. You can define any model under tf.variable_scope and tf.name_scope to integrate it into your existing scripts. This document shows basic examples for tf.variable_scope, tf.name_scope, and weight sharing. First, import the two libraries:

import tensorflow as tf
import tensornets as nets

Let's get started with the basic TensorFlow APIs. You can manage the prefixes of variable and tensor names with tf.variable_scope and tf.name_scope. The difference is that a variable created with tf.get_variable is named by only tf.variable_scope, while a tf.Tensor is named by both tf.variable_scope and tf.name_scope (a tf.variable_scope also opens a corresponding tf.name_scope). Also, a second tf.get_variable('w', [1]) will raise a ValueError because it tries to create an already existing variable if tf.variable_scope(reuse=None) (the default), or return a handle to the existing variable otherwise (reuse=True or reuse=tf.AUTO_REUSE). Here is an example:

with tf.name_scope('foo'):
  with tf.variable_scope('goo'):
    with tf.name_scope('hoo'):

      # A variable from `tf.get_variable` is affected by only `tf.variable_scope`.
      w = tf.get_variable('w', [1])
      assert w.name == 'goo/w:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      s = tf.constant(-1.0)
      y = s * w
      assert s.name == 'foo/goo/hoo/Const:0'
      assert y.name == 'foo/goo/hoo/mul:0'

      # `tf.get_variable` will try to create the same variable again
      # if `tf.variable_scope(reuse=None)` (default).
      try:
        w2 = tf.get_variable('w', [1])
      except ValueError as e:
        print(e)  # Variable goo/w already exists, disallowed.
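
Conversely, reopening the same variable scope with reuse=True returns the existing variable instead of raising an error. A minimal sketch, continuing from the snippet above:

with tf.variable_scope('goo', reuse=True):
  # The existing `goo/w` is returned instead of a new variable being created.
  w3 = tf.get_variable('w', [1])

assert w3 == w
assert w3.name == 'goo/w:0'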

The principle extends directly to TensorNets. The weights returned by get_weights are tf.Variable, and the outputs from get_outputs and get_middles are tf.Tensor. Thus, the weights will be affected by only tf.variable_scope, while the outputs and the middles by both tf.variable_scope and tf.name_scope. Note that a second model function call cannot be performed without reuse=True or reuse=tf.AUTO_REUSE because it will try to create the same variables again.

with tf.name_scope('xoo'):
  with tf.variable_scope('yoo'):
    with tf.name_scope('zoo'):

      # The weights returned by `get_weights` are `tf.Variable`,
      # and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`
      x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
      model1 = nets.ResNet50(x1)

      # The weights (`tf.Variable`) will be affected by only `tf.variable_scope`.
      assert model1.get_weights()[-1].name == 'yoo/resnet50/logits/biases:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      assert model1.name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_outputs()[-1].name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_middles()[-1].name == 'xoo/yoo/zoo/resnet50/conv5/block3/out:0'

      # `tf.get_variable` will try to create the same variable again
      # if `tf.variable_scope(reuse=None)` (default).
      try:
        x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
        model2 = nets.ResNet50(x2)
      except ValueError as e:
        print(e)  # Variable yoo/resnet50/conv1/conv/weights already exists, disallowed.
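
The failure above can be avoided by reopening the variable scope and passing reuse=tf.AUTO_REUSE to the model function, the same argument shown in the Summary below. A minimal sketch, continuing from the snippet above:

with tf.variable_scope('yoo'):
  # `reuse=tf.AUTO_REUSE` returns handles to the existing
  # `yoo/resnet50` variables instead of raising a `ValueError`.
  model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

for (a, b) in zip(model1.get_weights(), model2.get_weights()):
  assert a == b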

We can implement weight sharing simply by using tf.variable_scope(reuse=tf.AUTO_REUSE). An example is as follows:

with tf.variable_scope('boo', reuse=tf.AUTO_REUSE):
  # Both calls return the same variable.
  w1 = tf.get_variable('w', [1])
  w2 = tf.get_variable('w', [1])
  assert w1 == w2
  assert w1.name == 'boo/w:0'
  # The derived tensors are still distinct operations.
  s = tf.constant(-1.0)
  y1 = s * w1
  y2 = s * w2
  assert y1 != y2
  assert y1.name == 'boo/mul:0'
  assert y2.name == 'boo/mul_1:0'
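
Because w1 and w2 are the same variable, an update made through one handle is visible through the other. A minimal sketch, continuing from the snippet above:

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # An assignment through `w1` is visible through `w2`
  # because both handles point to the same variable.
  sess.run(w1.assign([2.0]))
  assert sess.run(w2) == 2.0
  assert sess.run(y1) == -2.0
  assert sess.run(y2) == -2.0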

TensorNets can also be easily integrated with tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
  x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)
  # All the weights are shared between the two models.
  for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b
  assert model1.get_weights()[-1].name == 'koo/resnet50/logits/biases:0'
  # The output tensors are still distinct operations.
  assert model1 != model2
  assert model1.name == 'koo/resnet50/probs:0'
  assert model2.name == 'koo/resnet50_1/probs:0'
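
Weight sharing can also be verified at run time: because the two models read the very same variables, they produce identical outputs for the same input. A minimal sketch, continuing from the snippet above (NumPy is assumed to be imported as np):

import numpy as np

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  img = np.random.rand(1, 224, 224, 3).astype(np.float32)
  # The same input fed to both models yields the same probabilities
  # because all the weights are shared.
  assert np.allclose(sess.run(model1, {x1: img}),
                     sess.run(model2, {x2: img}))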

Summary

To summarize, there are two basic patterns to implement weight sharing (the third form below is equivalent to the second):

1. With tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)

2. Without tf.variable_scope:

model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

3. Equivalent to 2, with functools.partial:

import functools
resnet = functools.partial(nets.ResNet50, reuse=tf.AUTO_REUSE)
model1 = resnet(x1)
model2 = resnet(x2)

I recommend the following pattern, which is the one used for deploying multiple clones in tf.slim:

with tf.name_scope('clone0'):
  model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
with tf.name_scope('clone1'):
  model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

# All the weights are shared across the clones.
for (a, b) in zip(model1.get_weights(), model2.get_weights()):
  assert a == b

assert model1.name == 'clone0/resnet50/probs:0'
assert model2.name == 'clone1/resnet50/probs:0'
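
Note that the weights carry no clone prefix in this pattern, because tf.get_variable ignores tf.name_scope; only the tensors are grouped per clone. A minimal check, assuming the snippet above runs in a fresh graph:

# The shared weights live under a single unprefixed variable scope.
assert model1.get_weights()[-1].name == 'resnet50/logits/biases:0'
assert model2.get_weights()[-1].name == 'resnet50/logits/biases:0'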

Without tf.name_scope, tensors are automatically named with a postfix style (resnet50, resnet50_1, ...), which can make tensor names difficult to manage.
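
For comparison, here is a minimal sketch of the postfix-style naming in a fresh graph without tf.name_scope:

x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)
# The second model's tensors are automatically renamed with a `_1` postfix.
assert model1.name == 'resnet50/probs:0'
assert model2.name == 'resnet50_1/probs:0'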
